Call Data Record Generation: Options and Considerations
An Industry Whitepaper
Contents
Executive Summary
Introduction to Call Data Record Generation
Standards-based vs. Industry-Standard
Standards-based Record Generation
sFlow, NetFlow and IPFIX
sFlow
NetFlow
IPFIX
Resource Utilization Monitoring
Interface, Deployment and Performance Considerations
Deployment Options
IPDR
Freeform Record Generation
How Use Case affects Transaction Rates
Mediation for IPFIX+DPI vs. Freeform+DPI
IPFIX
Freeform
Conclusion
Summary of Record Generation Techniques
Related Resources

Executive Summary
Network record generation offers information to help with
understanding network usage and security risks, as well as
metrics that can be used to optimize network performance,
business systems, and quality of service.
This paper explores the various methods used to generate call
data records, both standards-based and proprietary.
A record generation solution focuses on extracting network
traffic records by targeting IP flow information elements defined
in open standards and/or proprietary user documentation. A
collector or mediating element receives the stream of IP flow
records where they can be processed into the desired output
format and, for some use cases, forwarded on to another system.
Whether standards-based or not, the choice of which method
and deployment type to use when extracting data from the
network depends on the specific use case and approach to record
generation. In the end it all comes down to mediation – how the
solution platform manages and processes an overwhelming
landscape of data into a subset of targeted information for use
by a downstream application.
Version 2.0
Network Record Generation
Introduction to Call Data Record Generation
When communications service providers (CSPs) look at the tremendous amount of data that flows
through their networks on a daily basis and think about extracting relevant records, the issue is clearly
one of “big data”. CSPs extract basic Layer-3 information from their networks as call data records
(CDRs). In today’s modern networks, the term call data record (CDR) is interchangeable with charging
data record (CDR) or usage data record (UDR), especially when describing Layer-7 use cases affecting
Internet applications. Big data can mean different things in different contexts, but for Internet service
delivery the issue of record generation is about what to extract, how to extract it, and why the
information is needed in the first place.
The ability to generate call data records offers information to help with understanding subscriber usage
and security risks, as well as metrics that can be used to optimize network performance, business
systems, and quality of service. Depending on the implementation, these records can also be used for
billing and auditing purposes in support of charging applications. At the highest level, a CSP extracts
data records about network traffic for export to other systems in support of the following use cases:
• Generating business intelligence reports for insight into network optimization
• Providing records in support of auditing (i.e., bill verification)
• Communicating usage updates to an offline charging system (OFCS) for post-paid charging
• Maintaining a parallel stream of usage records in case prepaid billing systems fail
• Network security monitoring and mitigation
Whether standards-based or proprietary 1, a record generation solution focuses on extracting network
traffic records by targeting IP flow information elements defined in open standards and/or proprietary
user documentation. 2 A collector or mediating element receives the stream of IP flow records where
they can be processed into the desired output format and, for some use cases, forwarded on to another
system. The type of flow information contained in network records depends on the use case, and the
following are some typical examples:
• Subscriber ID
• Session ID
• Source and destination IP address
• Application type
• Device type
• Service type
• Rating group
• Vendor ID
• Total upstream bytes
• Total downstream bytes
This paper focuses on the standards-based and freeform methods currently available for network
record generation with an examination of the characteristics of each method. The choice of which
method and deployment type to use depends on the specific use case and the impact its
implementation will have on network performance and dimensioning.
Standards-based vs. Industry-Standard
“Standards-based” is defined in this paper as a protocol openly described in RFCs and IETF-endorsed
documents. 3 Records generated from the network are often exported to external systems 4 and from
there can take many forms both standards-based and proprietary. 5
1 Currently, the only solution with fully-integrated record generation uses a freeform policy model.
2 An information element can be thought of as “a fact about a particular IP service flow at a particular point in time”.
3 Within this paper’s context, a standards-based protocol is an openly-described method of obtaining network records and does
not necessarily have to be an industry standard. For example, NetFlow version 9 is not an industry standard protocol but is based
Standards-based Record Generation
A standards-based system of record generation employs a method of record extraction described in
official documents often hosted or endorsed by industry standards bodies such as the IETF. A
standards-based system offers a predetermined set of configuration points to generate records for supported use
cases.
sFlow, NetFlow and IPFIX
These three are the most common record generation protocols that are not specific to a particular
access type (unlike IPDR, which is specific to cable networks) and that rely on a published standard or
RFC to configure both end points of the solution.
sFlow
Short for “sampled flow”, sFlow is a protocol for extracting packet records at Layer-2 of the OSI model,
and is mainly used to sample information for basic monitoring use cases. The sFlow protocol is not an
industry standard; it was originally developed by InMon Corporation 6 and is now sold as a feature in
network transport equipment from many different manufacturers. 7 Unlike NetFlow and IPFIX, sFlow has
no notion of service flows and only offers periodic sampling of flows (packets at layer-2) and samples of
counters (periodic time-based measurement).
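The two sampling mechanisms described above can be illustrated with a minimal Python sketch. It is not an sFlow implementation: the `Sampler` class, the 1-in-4 rate and the packet lengths are assumptions chosen for illustration, and a fixed every-Nth-packet stride is used only to keep the example reproducible.

```python
from dataclasses import dataclass, field


@dataclass
class Sampler:
    """Toy 1-in-N packet sampler with interface counters (illustrative only)."""
    rate: int                       # keep one packet out of every `rate`
    seen: int = 0                   # packets observed on the interface
    octets: int = 0                 # bytes observed on the interface
    samples: list = field(default_factory=list)

    def observe(self, packet_len: int) -> None:
        # Counters are updated for every packet, sampled or not.
        self.seen += 1
        self.octets += packet_len
        # Packet sampling: keep every `rate`-th packet.
        if self.seen % self.rate == 0:
            self.samples.append(packet_len)

    def counter_sample(self) -> dict:
        # Counter sampling: a periodic snapshot of the interface totals.
        return {"packets": self.seen, "octets": self.octets}

    def estimated_octets(self) -> int:
        # Scale the sampled bytes back up by the sampling rate.
        return sum(self.samples) * self.rate


sampler = Sampler(rate=4)
for length in [100, 200, 300, 400, 500, 600, 700, 800]:
    sampler.observe(length)

print(sampler.samples, sampler.counter_sample(), sampler.estimated_octets())
```

The gap between the estimate and the counter total in this toy run shows why sampled data suits basic monitoring rather than exact accounting.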
NetFlow
NetFlow is a protocol for generating flow records originally developed by Cisco Systems as a caching
system, and is now widely used to collect statistics on IP traffic information. NetFlow has never been
an official industry standard 8; however, many would agree that NetFlow has become the unofficial
industry standard due to its widespread use. Version 9 of NetFlow (RFC 3954) 9 forms the basis for the
IPFIX protocol, which is described in RFCs as an IETF standard. There are also many NetFlow equivalents
sold as proprietary features in network transport equipment from several vendors. 10
In the most common implementation, when enabled and configured on a switch or router, NetFlow
collects statistics on the IP traffic passing through that device. The flow data can then be exported to a
mediation/collection system.
NetFlow Record Information
As the name suggests, NetFlow aggregates packet statistics to report specifically on IP flow information
such as source and destination IP addresses, IP protocol, source and destination ports, and Type of
Service. Additional information can be tracked per flow, including the inbound interface and up to 79 other
field types for information elements described in RFC 3954. 11
on the open informational document RFC 3954, which fully describes the parameters, configuration and limitations of the
protocol for use by third parties, making NetFlow a “standards-based” protocol.
4 Records can also be used for real-time traffic management use cases such as QoS control and network security.
5 For example, as a standardized output format such as CSV (comma-separated values) or as graphs in a proprietary GUI that
displays network business intelligence reports.
6 See opening comments from RFC 3176.
7 Wikipedia offers a comprehensive list of sFlow vendors.
8 See opening comments from RFC 3954.
9 An RFC that describes a standard that isn’t endorsed is often called an “informational” document.
10 Wikipedia offers a comprehensive list of what it calls NetFlow equivalents. These are essentially proprietary implementations
of the NetFlow standard.
11 See section 5 of RFC 3954.
A NetFlow record reports a wealth of information about traffic in a given flow 12 in the purely
standards-based implementation up to Layer-3. NetFlow version 9 has extensibility to include fields not
described in the RFC. This provides an opportunity to support use cases not covered by the official
protocol description. However, a custom solution requires a configuration and interoperability effort
comparable to that of a freeform solution, since the “custom” aspect is often a proprietary method
developed in-house or by an outside vendor. 13
Wikipedia offers the following example of a NetFlow record showing three flows:
Src IP addr. | Dst IP addr. | Next Hop addr. | Packet Number | Bytes Number
198.168.1.12 | 10.5.12.254  | 192.168.1.1    | 5009          | 5344385
192.168.1.27 | 10.5.12.23   | 192.168.1.1    | 748           | 388934
192.168.1.56 | 10.5.12.65   | 192.168.1.1    | 5             | 6534
Figure 1 – Example NetFlow record
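The aggregation that yields per-flow packet and byte counts like those in Figure 1 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not an exporter implementation: the packet tuples and the `aggregate` helper are hypothetical, and the flow key simply uses the fields named above (addresses, protocol, ports and Type of Service).

```python
from collections import defaultdict

# Hypothetical packets: (src_ip, dst_ip, protocol, src_port, dst_port, tos, length)
packets = [
    ("192.168.1.27", "10.5.12.23", 6, 40001, 80, 0, 1500),
    ("192.168.1.27", "10.5.12.23", 6, 40001, 80, 0, 400),
    ("192.168.1.56", "10.5.12.65", 17, 53000, 53, 0, 90),
]


def aggregate(packets):
    """Group packets by flow key and accumulate packet and byte counts."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for *key, length in packets:
        record = flows[tuple(key)]
        record["packets"] += 1
        record["bytes"] += length
    return dict(flows)


flows = aggregate(packets)
for key, counters in flows.items():
    print(key, counters)
```

Timers, interface indexes and next-hop data are left out to keep the sketch short.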
IPFIX
The IP Flow Information eXport protocol (IPFIX) is the first common and universal industry standard of
export for Internet flow information from routers, probes and other transport devices, and is defined
by several RFCs 14.
IPFIX Record Information
Based on the NetFlow protocol, the field export format described in RFC 3954 has evolved with IPFIX
into the 238 Information Element field types defined in the RFC standards documents. Many of these
field types are defined as “reserved” to maintain compatibility with NetFlow version 9 and must be
referenced in RFC 3954 15. IPFIX has the extensibility of NetFlow version 9 with a vendor ID information
element that specifies a custom application of the IPFIX protocol. The following example record is
provided by Wikipedia:
Source         Destination    Packets
------------------------------------------
192.168.0.201  192.168.0.1    235
192.168.0.202  192.168.0.1    42
Figure 2 – Example IPFIX record
12 See Wikipedia for an overview of the Internet OSI model.
13 Detailed information about Internet traffic is configured using the fields described in RFC 3954.
14 RFCs 3917, 5101 and 5102 and 5103, 5472, 7011 – 7015.
15 IPFIX can be thought of as backwards compatible to NetFlow version 9.
Resource Utilization Monitoring
NetFlow and IPFIX are accurate enough to perform resource utilization monitoring, but they cannot be
solely relied upon in charging applications for end user billing and, by extension, bill verification and
auditing. 16 The standards do not offer sufficient resiliency and safeguards to ensure reliable data
export to meet the billing accuracy requirements described in RFC 3917 17.
Interface, Deployment and Performance Considerations
Enabling purely standards-based record generation on a device is usually a simple configuration change
on the network transport device requiring very little effort.
Record generation with sFlow, NetFlow and IPFIX is typically undertaken by routers and switches as
part of the production network. The sFlow protocol is designed by nature to have a minimal impact on
network transport equipment because it only samples IP flows periodically in support of basic
monitoring use cases. NetFlow and IPFIX do have an impact on the devices where they are enabled,
with the severity depending on the specific use case(s). The processor and memory load can cause
severe service degradation, normally measured as an increase in the device CPU and memory
utilization to track and report on specific metrics. 18
16 See section 4.2 of RFC 5472.
17 See sections 5 and 6 of RFC 3917.
NetFlow and IPFIX can be enabled on a per-interface basis to limit load on transport elements. IP filters
can also limit which packet types can be observed by NetFlow to further reduce the strain. To further
reduce the performance impact, Cisco introduced a sampling feature for NetFlow on certain transport
equipment products. 19 IPFIX also allows sampling and adds the ability to specify variable length fields.
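As a hedged illustration of the IPFIX extensibility and variable-length features mentioned in this paper, the Python sketch below encodes template field specifiers and a variable-length value following the layouts in RFC 7011 (sections 3.2 and 7). The helper names are hypothetical, and the vendor element ID and enterprise number are placeholders, not real assignments.

```python
import struct
from typing import Optional

VARIABLE_LENGTH = 0xFFFF  # template length value that marks a variable-length field


def field_specifier(element_id: int, length: int,
                    enterprise: Optional[int] = None) -> bytes:
    """Encode one template field specifier."""
    if enterprise is None:
        return struct.pack("!HH", element_id, length)
    # Setting the top bit marks the element as enterprise-specific;
    # the 32-bit enterprise number then follows the length.
    return struct.pack("!HHI", element_id | 0x8000, length, enterprise)


def encode_variable(value: bytes) -> bytes:
    """Prefix a variable-length value with its length."""
    if len(value) < 255:
        return struct.pack("!B", len(value)) + value
    # Longer values use the marker 255 followed by a 16-bit length.
    return struct.pack("!BH", 255, len(value)) + value


# sourceIPv4Address is IANA element 8 with a fixed length of 4 octets.
standard = field_specifier(8, 4)
# Placeholder vendor element 1 under a made-up enterprise number.
custom = field_specifier(1, VARIABLE_LENGTH, enterprise=99999)
payload = encode_variable(b"example.com")

print(standard.hex(), custom.hex(), payload)
```

A collector that does not recognize the enterprise number can still skip the field because its length is always known, but interpreting the value requires the vendor's documentation.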
Deployment Options
When a CSP wants to generate records from the network, the most common implementation is to use
pre-existing internet data transport devices such as routers and switches to gather and forward metrics
to one or more monitoring stations for offline processing as shown by Figure 3 20.
Figure 3 – Standards-based record generation using switches and routers
However, as shown by Figure 4 there is another interface approach where an offline device observes or
taps the network data flow and then generates records using the NetFlow or IPFIX protocol standards.
This has the obvious advantage of avoiding a performance hit to network transport elements, although
it requires a separate offline element that only generates or displays records with no ability to directly
manage traffic 21. The offline element may also serve as the collector device that processes the records
into a desired reporting format.
Figure 4 – Standards-based record generation using network tap and offline element
18 See the Wikipedia entry for NetFlow, and in particular the talk page for the NetFlow entry.
19 See Sampled NetFlow in the Wikipedia entry.
20 The Mediator/Collector element can be integrated with operational support and billing systems.
21 All of the things you might want to do with a PCEF or TDF element for policy control, for example.
Figure 5 shows a third implementation where NetFlow records are fed from transport elements to an
inline data plane device directly or to a control plane device that can signal the inline device to
perform traffic management. 22 Inline devices offer the ability to directly manage traffic based on
real-time NetFlow record information for such use cases as high-level QoS control and network security
monitoring and mitigation.
Figure 5 – Standards-based record generation for inline element
Record Mediator/Collector
In all three deployments shown above there must be a mediator/collector element that processes
records into one or more final formats.
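The final formatting step of a mediator/collector can be sketched as follows. This is a minimal illustration, assuming the records have already been decoded into dictionaries; the `to_csv` helper and the column names are hypothetical, and the sample values are taken from the IPFIX example in Figure 2.

```python
import csv
import io

# Hypothetical flow records as a collector might hold them after decoding.
records = [
    {"src": "192.168.0.201", "dst": "192.168.0.1", "packets": 235},
    {"src": "192.168.0.202", "dst": "192.168.0.1", "packets": 42},
]


def to_csv(records, columns):
    """Render decoded flow records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


report = to_csv(records, ["src", "dst", "packets"])
print(report)
```

The same decoded records could be rendered into other final formats by swapping the writer, which is the sense in which the collector processes records into one or more outputs.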
IPDR
An IP Detail Record (IPDR) provides information about IP-based service usage and other activities that
can be used by Operational Support Systems (OSS) and Business Support Systems (BSS). IPDR is overseen
by the TM Forum, a non-profit industry standards organization primarily for service providers working
with cable networks. The IPDR specifications include requirements for record collection, encoding, and
the transport protocols used to exchange IPDR records. According to specifications, IPDR can be used
for business intelligence reporting, network configuration, health monitoring, service assurance and bill
resolution.
Figure 6 – IPDR Record Generation in Cable Networks
22 This includes Policy and Charging Control (PCC) implementations. In a PCC implementation the mediator element could be a
DPI-based PCEF or TDF and the collector element could be a PCRF. The inline element cannot typically act as the collector for
practical performance reasons (i.e., the CPU is needed elsewhere).
Freeform Record Generation
The most obvious example of a freeform record generation solution is one that leverages existing
DPI-based elements that intersect network traffic and uses a proprietary model completely separate from
the typical standards-based description. These elements, such as a PCEF or TDF, meter traffic according
to strictly defined PCC standards. They offer a direct link to deep information, including
application data, to create and export network records up to Layer-7. This approach moves concerns
about performance and dimensioning from transport elements to the DPI element.
As shown by Figure 7, the mediation function is subsumed into the intersecting device, which generates
records directly from the data stream that can be exported for various use cases. As with the inline
deployment shown in the previous section, the same information that is used to generate records can
also be used to perform real-time traffic management functions such as QoS control and network
security. However, since the records are generated from the PCC standards-defined metering that
ensures accurate charging, they can be used in support of bill verification use cases.
Figure 7 – Inline, DPI-based Record Generation with PCEF & PCRF elements
Freeform Record Information
Given the right policy model, a freeform solution offers the ability to freely configure records and
leverage the pre-existing exposure of detailed Layer-7 application information for record generation
use cases.
The usage data records (UDRs) for freeform record generation do not follow a binary format as seen
with the NetFlow and IPFIX examples. Instead, a compressed, human-readable CSV format is used,
which is much simpler for IT systems to manipulate and consume. Because the solution is not based on
any written standard, some effort is required to configure its operation by referencing proprietary
documentation. However, since record extraction and output are highly configurable, what would be
considered “custom records” in a NetFlow environment require no special effort when using freeform
policy with fully-integrated data record generation.
Figure 8 shows an example of custom records to collect subscriber volume usage (total, sent and
received) on a per-session basis for bill dispute resolution related to postpaid charging. The freeform
script language specifies Information Elements for extraction in flow records that, in this example, are
efficiently grouped by subscriber.
RecordType,RecordStatus,RecordNumber,StartTime,EndTime,AcctSessionId,SubscriberId,FramedIp,ServiceId,TotalBytes,TransmittedBytes,ReceivedBytes
[For IP 72.12.156.99]:
session_start,0,0,2011:3:25:16:44:14,,1208786019~130108585,001311B8A12E,72.12.156.99,[0],0,0,0,0
usage_start,0,0,2011:3:25:16:44:15,2011:3:25:16:44:15,1208786019~130108585,001311B8A12E,72.12.156.99,[30],88,0,88,0
usage_stop,4,*,2011:3:25:17:42:54,2011:3:25:17:42:54,1208786019~130108585,001311B8A12E,72.12.156.99,[30],0,0,0,0
session_stop,2,*,2011:3:25:16:44:14,2011:3:25:17:42:54,1208786019~130108585,001311B8A12E,72.12.156.99,[0],0,0,0,0
[For IP 72.12.141.157]:
usage_int,0,24445,2011:3:25:16:44:34,2011:3:25:16:44:34,1208782237~130108143,001596260DCC,72.12.141.157,[5],604,132,472,0
usage_int,0,24445,2011:3:25:16:44:34,2011:3:25:16:44:34,1208782237~130108143,001596260DCC,72.12.141.157,[5],66,0,66,0
usage_int,0,24445,2011:3:25:16:44:34,2011:3:25:16:44:34,1208782237~130108143,001596260DCC,72.12.141.157,[5],76,76,0,0
usage_int,0,24445,2011:3:25:16:44:34,2011:3:25:16:44:34,1208782237~130108143,001596260DCC,72.12.141.157,[5],76,76,0,0
usage_int,0,24445,2011:3:25:16:44:34,2011:3:25:16:44:34,1208782237~130108143,001596260DCC,72.12.141.157,[5],66,66,0,0
usage_stop,4,*,2011:3:25:17:43:15,2011:3:25:17:43:15,1208782237~130108143,001596260DCC,72.12.141.157,[9],0,0,0,0
usage_stop,4,*,2011:3:25:17:43:15,2011:3:25:17:43:15,1208782237~130108143,001596260DCC,72.12.141.157,[5],0,0,0,0
usage_stop,4,*,2011:3:25:17:43:15,2011:3:25:17:43:15,1208782237~130108143,001596260DCC,72.12.141.157,[30],0,0,0,0
session_stop,2,*,2011:3:25:15:30:39,2011:3:25:17:43:15,1208782237~130108143,001596260DCC,72.12.141.157,[0],0,0,0,0
Figure 8 – Example records created through custom generation solution
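To show how a downstream IT system might consume records like those in Figure 8, here is a minimal Python sketch that totals usage per subscriber. It rests on assumptions: the column positions are read off the header line in Figure 8, the trailing value that the header does not name is ignored, and the `usage_by_subscriber` helper is hypothetical. The three sample lines are copied from the figure with wrapped lines rejoined.

```python
import csv
from collections import defaultdict

SAMPLE = """\
usage_start,0,0,2011:3:25:16:44:15,2011:3:25:16:44:15,1208786019~130108585,001311B8A12E,72.12.156.99,[30],88,0,88,0
usage_int,0,24445,2011:3:25:16:44:34,2011:3:25:16:44:34,1208782237~130108143,001596260DCC,72.12.141.157,[5],604,132,472,0
usage_int,0,24445,2011:3:25:16:44:34,2011:3:25:16:44:34,1208782237~130108143,001596260DCC,72.12.141.157,[5],66,0,66,0
"""

# Assumed zero-based column positions, taken from the Figure 8 header line.
SUBSCRIBER, TOTAL, SENT, RECEIVED = 6, 9, 10, 11


def usage_by_subscriber(text):
    """Sum total, transmitted and received bytes for each subscriber ID."""
    totals = defaultdict(lambda: [0, 0, 0])
    for row in csv.reader(text.splitlines()):
        # Only usage_* records carry volume; session records are skipped.
        if not row[0].startswith("usage_"):
            continue
        entry = totals[row[SUBSCRIBER]]
        entry[0] += int(row[TOTAL])
        entry[1] += int(row[SENT])
        entry[2] += int(row[RECEIVED])
    return {sub: tuple(values) for sub, values in totals.items()}


usage = usage_by_subscriber(SAMPLE)
for subscriber, (total, sent, received) in usage.items():
    print(subscriber, total, sent, received)
```

Because the records are plain CSV, no protocol decoder is needed before this kind of aggregation.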
How Use Case affects Transaction Rates
As noted at the beginning of this paper, a CSP’s ability to extract desired data from the network will
depend on the specific use case and approach to record generation. In the end it all comes down to
mediation – how the solution platform manages and processes an overwhelming landscape of data into
a subset of targeted information for use downstream. 23
Use cases become much more interesting when a DPI device is integrated to support data records for
Layer-7. In the case of NetFlow and IPFIX, the DPI solution uses the extensibility option in the standards
to create a custom set of Layer-7 records. With a proprietary, freeform solution, the DPI solution
supports fully-integrated data records built directly out of the product framework. 24
Mediation for IPFIX+DPI vs. Freeform+DPI
Consider the following scenario: A mobile operator wants to generate subscriber-based Layer-7 records
with mobile device information where the average packet core traffic is 2Gbps. Raw records are
processed by a mediation system into a final output format for a customer experience management
(CEM) solution. Let’s examine the transaction rates and raw record output for a solution that uses a
DPI element following the IPFIX approach versus a proprietary solution where the DPI element uses
freeform policy to group all flows by subscriber in its state engine.
23 For a full exploration of this issue, see the Vanilla Plus article Style, substance and big data.
24 In other words there is no interoperability or custom standards work to make records work with the existing set of Layer-7
records – such features are built into the product.
IPFIX
Since IPFIX is essentially a flow-based record reporting standard, one would expect one record to be
generated per flow (5-tuple, Layer-3). The concept is that IPFIX reports information on all active flows
within a pre-defined interval – in this case let’s make the interval 60 seconds. The IMEI data point
which identifies a specific subscriber device would be populated using the standard’s extensibility
feature to create the desired record. To set a baseline, assume 1Gbps of mobile network data carries
about 7,000 flows per second.
In this case the solution would generate about 840,000 records per minute, or 12,600,000 records every
15 minutes, with each record providing 5-tuple, Layer-3 information with the addition of the IMEI
custom field. Over a 24 hour period the solution would generate 1.2 billion raw records for processing
by the mediation system.
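The arithmetic behind these figures can be checked with a short Python sketch. The constants come from the scenario (2Gbps of traffic, 7,000 flows per second per 1Gbps, a 60-second interval); treating every flow as producing exactly one record is the simplifying assumption, and the variable names are illustrative.

```python
FLOWS_PER_SECOND_PER_GBPS = 7_000   # baseline stated in the text
TRAFFIC_GBPS = 2                    # average packet core traffic in the scenario
INTERVAL_SECONDS = 60               # reporting interval chosen in the text

# Assumption: each flow yields exactly one record.
records_per_minute = FLOWS_PER_SECOND_PER_GBPS * TRAFFIC_GBPS * INTERVAL_SECONDS
records_per_15_minutes = records_per_minute * 15
records_per_day = records_per_minute * 60 * 24

print(records_per_minute)       # 840000
print(records_per_15_minutes)   # 12600000
print(records_per_day)          # 1209600000
```

The daily total of 1,209,600,000 is the "1.2 billion" quoted above.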
Freeform
A freeform record generation solution that can generate one record per subscriber and device does not
need to generate a record for every IP flow because it generates records for flows grouped by
subscriber, not simply by IP source and destination pairs. In this case, 1Gbps of network throughput
would typically indicate about 500,000 subscribers.
Since the records generated are tied to individual subscribers rather than every single IP flow, over a
24 hour period about 48,000,000 raw records are generated for processing by the mediation system.
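The paper does not state the reporting interval behind the 48,000,000 figure, so the Python sketch below is only a plausibility check. It shows two combinations of subscriber count and interval, both assumptions, that reproduce the figure, and then compares it with the IPFIX daily total of 840,000 records per minute over 24 hours. The `records_per_day` helper is hypothetical.

```python
FREEFORM_RECORDS_PER_DAY = 48_000_000    # stated in the text
IPFIX_RECORDS_PER_DAY = 840_000 * 60 * 24  # 840,000 records per minute for 24 hours


def records_per_day(subscribers: int, interval_minutes: int) -> int:
    """One record per subscriber per reporting interval."""
    return subscribers * (24 * 60 // interval_minutes)


# Assumption A: 500,000 subscribers reported every 15 minutes.
option_a = records_per_day(500_000, 15)
# Assumption B: 2Gbps implies 1,000,000 subscribers, reported every 30 minutes.
option_b = records_per_day(1_000_000, 30)

# How many times fewer raw records the mediation system has to process.
reduction = IPFIX_RECORDS_PER_DAY / FREEFORM_RECORDS_PER_DAY

print(option_a, option_b, round(reduction, 1))
```

Under either assumption the mediation system processes roughly 25 times fewer raw records per day than in the IPFIX case.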
Conclusion
At the highest level, a CSP extracts data records about network traffic for export to other systems in
support of one or more of the following use cases:
• Generating business intelligence reports for insight into network optimization
• Providing records in support of auditing (i.e., bill verification)
• Communicating usage updates to an OFCS for post-paid charging
• Maintaining a parallel stream of usage records in case prepaid billing systems fail
• Network security monitoring and mitigation
When choosing a record generation solution, CSPs must weigh the desired quality and quantity of
information against the cost of implementation in terms of performance and dimensioning.
Summary of Record Generation Techniques
The following table summarizes the details of the record generation techniques presented by this
paper:
Technique | Description | Use Cases
Standards-based, transport equipment record generation | Easy to enable and configure; transport element dimensioning considerations; transport element performance impacted | Reporting up to Layer-3
Standards-based, offline record generation | Generally easy to enable and configure; some custom work, proprietary documentation; requires additional element; offline element dimensioning considerations; no transport element performance impact | Reporting up to Layer-3
Standards-based, inline, DPI-based record generation | Large effort to enable and configure; a lot of custom work, proprietary feature configuration; DPI and transport element dimensioning consideration; requires inline element; can leverage existing DPI solution or PCC setup; transport element performance impacted | Reporting and use cases up to Layer-7; network security; QoS control
Standards-based, IPDR record generation | Specific to cable networks; transport element dimensioning consideration; transport element performance impacted | Reporting up to Layer-?; service assurance; configuration & monitoring; bill auditing/verification
Proprietary, inline, DPI-based record generation | Proprietary feature configuration; requires inline element; inline element dimensioning consideration; leverages existing DPI solution or PCC setup; no transport element performance impact | Reporting and use cases up to Layer-7; network security; QoS control; charging support; bill auditing/verification
Related Resources
See the Sandvine technology showcase Meaningful Data Records with Minimal Overhead. Please also
see the Sandvine technology showcase SandScript - The Advantage of Freeform Policy.