Thread: [ipt-netflow] FIRST_SWITCHED being reset after export of active flow
NetFlow iptables module for Linux kernel
Brought to you by:
aabc
From: Phillip R. <ph...@ya...> - 2015-08-14 21:10:55
Before I get into my question, what I'm ultimately trying to do is find the best way to count new flows started per minute. The question below is regarding a snag I hit when trying to solve this using what I thought was the most direct approach. But I'm open to other suggestions.

General NetFlow docs all describe how each flow can be uniquely identified by the tuple of IP source/dest address + IP source/dest port + etc. They also advise that the default 30-minute timeout for exporting data about active flows may be sub-optimal, since it means a giant burst of traffic will be exported at the conclusion of a long-lived flow, with the collector oblivious up until that point of the long-lived flow's existence. Therefore, I've lowered the active timeout in my environment to 1 minute.

With these 1-minute updates now coming in for long-lived flows, I was hoping to have a way in my collector to group the records that make up the same long-lived flow, so that I'd not be counting each flow received at the start of a new minute as a "new flow". I noticed the "FIRST_SWITCHED" field, and this showed great potential since I'd have hoped it would be set to the single timestamp when the flow was first observed in the router, and that even in subsequent updates in later minutes that timestamp would be the same.

Unfortunately, with ipt-netflow, I'm not finding this to be the case: for my long-lived flow, the FIRST_SWITCHED timestamp is being updated to the beginning of each new 1-minute interval. Is this really desired behavior? I unfortunately don't have a Cisco router at my disposal to see if it does the same thing, but I'd be interested to hear if anyone else can confirm whether this behavior is universal.

--Phil
From: ABC <ab...@te...> - 2015-08-14 22:40:35
Hello Phillip,

That's an interesting interpretation of FIRST_SWITCHED. The ipt-netflow behavior you described is intentional, and here are my arguments:

1. Currently, an exported flow is interpreted the same as an Expired Flow (historically, it was expiring from the CEF cache), so measurement is restarted for new packets. If, instead of reporting the timestamp of the first packet since measurement was (re)started, the probe reported some other time in the past, then many statistical properties of the flow data would be lost or become hard(er) to calculate.

For example, it would not be possible (or would become much harder) to correctly measure the data rate of the flow, because some arbitrary time would be added to its duration. (Yes, it would be possible to reconstruct the correct rate if we collected all intermediate active flow exports for this flow and subtracted them from the last measurement, but this could be thousands and thousands of flow records for many hours or even days' worth of data, for example for some always-active persistent connection.)

2. RFC 3954 states:

    3.2. Flow Expiration
    [...]
    3. For long-lasting Flows, the Exporter SHOULD export the Flow
       Records on a regular basis. This timeout SHOULD be
       configurable at the Exporter.

This implicitly confirms that Exported and Expired flows are synonymous for Cisco.

3. RFC 5102 5.11.3 states:

    The Flow was terminated for reporting purposes while it was
    still active, for example, after the maximum lifetime of
    unreported Flows was reached.

While this is about IPFIX, there is draft-claise-ipfix-eval-netflow-04, which lists the important differences between IPFIX and NetFlow v9 and does not say there is any difference in expiring/terminating flows between v9 and IPFIX.

If a flow is (semantically) "terminated", then what comes next is a "new" flow, and not some continuation of the older flow.
-abc
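The rate argument in point 1 can be illustrated with a minimal Python sketch. All numbers and the helper name `rate_bps` are made up for illustration and are not taken from ipt-netflow. It assumes a record whose byte counter covers only the last 1-minute interval, and compares the rate obtained when FIRST_SWITCHED is reset at each export with the rate obtained if the field kept pointing at the flow's very first packet.

```python
def rate_bps(first_switched_ms, last_switched_ms, in_bytes):
    """Average bit rate of one flow record; timestamps are in milliseconds."""
    duration_ms = last_switched_ms - first_switched_ms
    if duration_ms <= 0:
        return 0.0
    return in_bytes * 8 * 1000 / duration_ms

# Hypothetical record: bytes counted during the 60th minute of a long flow.
interval_bytes = 7_500_000
interval_start_ms = 59 * 60_000   # start of the last measurement interval
interval_end_ms = 60 * 60_000     # time of the last packet in the record
flow_start_ms = 0                 # time of the flow's very first packet

# FIRST_SWITCHED reset at each export: duration matches the byte counter.
rate_with_reset = rate_bps(interval_start_ms, interval_end_ms, interval_bytes)

# FIRST_SWITCHED kept at the original start: 59 extra minutes are added to
# the duration, so the computed rate is far too low.
rate_without_reset = rate_bps(flow_start_ms, interval_end_ms, interval_bytes)

print(rate_with_reset, rate_without_reset)
```

Under these assumptions the second figure is too low by a factor of 60, which is the distortion the argument describes.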
From: Michael K. <mic...@pl...> - 2015-08-14 23:35:00
I can confirm that Cisco does timestamp based on active timeouts in all cases. The flow tuple should be used to stitch long-lived flows together over longer periods of time. As a general rule, the flow cache "forgets" everything that was observed once it is exported. Times and flags are all reset, among other things that might be considered "stateful".

If you query across a large data set, group on the 7-tuple and use the minimum start time as the start and the maximum end time as the end. This will always give you the flow duration for the period specified.

-Mike Krygeris
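The stitching described above could be sketched in Python roughly as follows. This is only an illustration: the record layout and field names (`src`, `dst`, `sport`, `dport`, `proto`, `tos`, `ifin`, `first`, `last`, `bytes`) are hypothetical, and a real collector would decode them from its own storage.

```python
# Hypothetical field names for one decoded flow record. The key below assumes
# the classic 7-tuple: addresses, ports, protocol, ToS and input interface.
KEY_FIELDS = ("src", "dst", "sport", "dport", "proto", "tos", "ifin")

def stitch(records):
    """Merge per-export records into one (start, end, bytes) entry per key."""
    stitched = {}
    for rec in records:
        key = tuple(rec[f] for f in KEY_FIELDS)
        if key in stitched:
            start, end, nbytes = stitched[key]
            stitched[key] = (min(start, rec["first"]),   # earliest start seen
                             max(end, rec["last"]),      # latest end seen
                             nbytes + rec["bytes"])      # counters add up
        else:
            stitched[key] = (rec["first"], rec["last"], rec["bytes"])
    return stitched

base = {"src": "10.0.0.1", "dst": "10.0.0.2", "sport": 40000,
        "dport": 443, "proto": 6, "tos": 0, "ifin": 1}
records = [
    {**base, "first": 0, "last": 59_000, "bytes": 1000},        # 1st export
    {**base, "first": 60_000, "last": 119_000, "bytes": 2000},  # 2nd export
    {**base, "sport": 40001, "first": 70_000, "last": 71_000, "bytes": 300},
]
flows = stitch(records)
```

One caveat of this sketch: if the same tuple is reused by a later, unrelated connection inside the queried period, both are merged into a single entry.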
From: ABC <ab...@te...> - 2015-08-15 10:45:43
Phillip,

On Fri, Aug 14, 2015 at 09:07:48PM +0000, Phillip Rzewski wrote:
> Before I get into my question, what I'm ultimately trying to do is
> find the best way to count new flows started per minute. [...] But I'm
> open to other suggestions.

If you are interested only in TCP flows, you can analyse the TCP_FLAGS(6) Element for the presence of the SYN flag. As you should know, the first packet of a TCP stream is marked with the SYN bit. So you only need to change your approach to count only flows that are SYN-marked.

-abc
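A minimal Python sketch of the SYN-based count, assuming records have already been decoded into dictionaries. The field names `proto`, `tcp_flags` and `first_ms` are hypothetical, and `first_ms` is assumed to be a start timestamp in milliseconds.

```python
from collections import Counter

TCP_PROTO = 6     # IP protocol number for TCP
TCP_SYN = 0x02    # SYN bit in the TCP flags byte

def new_tcp_flows_per_minute(records):
    """Count SYN-marked TCP flow records per 1-minute bucket of start time."""
    counts = Counter()
    for rec in records:
        if rec["proto"] == TCP_PROTO and rec["tcp_flags"] & TCP_SYN:
            counts[rec["first_ms"] // 60_000] += 1
    return counts

records = [
    {"proto": 6, "tcp_flags": 0x1A, "first_ms": 5_000},    # SYN|ACK|PSH: new
    {"proto": 6, "tcp_flags": 0x18, "first_ms": 65_000},   # ACK|PSH: update
    {"proto": 6, "tcp_flags": 0x02, "first_ms": 70_000},   # SYN only: new
    {"proto": 17, "tcp_flags": 0x00, "first_ms": 71_000},  # UDP: ignored
]
per_minute = new_tcp_flows_per_minute(records)
```

Since flows are unidirectional and the SYN-ACK reply also carries the SYN bit, a probe that sees both directions will likely report two SYN-marked flows per connection, so the count may need to be halved or filtered by direction.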
From: Michael K. <mic...@pl...> - 2015-08-15 14:28:39
If you are looking for UDP as well, it becomes a bit harder because there is nothing that can guarantee the flow is new. If you query the data from a database, group by IPs, ports and protocol, and then select the minimum start timestamp for each flow, you should get a relatively accurate count of flows. You can use a modulus to normalize the timestamps into 1-minute buckets as a second step.

Pseudo-SQL:

    SELECT srcIP, dstIP, srcPort, dstPort, protocol, MIN(flowstart)
    FROM database.flowsTable
    GROUP BY srcIP, dstIP, srcPort, dstPort, protocol

This will give you the unique TCP, UDP (and ICMP, sort of) conversations and their start times. Adding a modulo function to the flow start timestamp will allow you to convert it to 1-minute resolution. Once you have done that, you will be able to count the flows in each bucket.

You may have to do some work to figure out the "real" minutes of the flow, because the timestamps are generally milliseconds since the system started, not absolute.

-Mike Krygeris
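The two steps above (group, then bucket with a modulus) and the uptime conversion could be sketched with Python's built-in `sqlite3` module. The table layout follows the pseudo-SQL, but the table name `flows`, the function names and the sample rows are assumptions. The conversion assumes the NetFlow v9 packet header fields sysUpTime (milliseconds) and UNIX seconds are available per export packet, and it ignores wrap-around of the uptime counter.

```python
import sqlite3

def to_epoch_ms(first_switched_ms, sys_uptime_ms, unix_secs):
    """Convert an uptime-relative flow timestamp to epoch milliseconds."""
    return unix_secs * 1000 - (sys_uptime_ms - first_switched_ms)

def new_flows_per_minute(rows):
    """rows: (srcIP, dstIP, srcPort, dstPort, protocol, flowstart_ms)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE flows (srcIP TEXT, dstIP TEXT, srcPort INTEGER, "
        "dstPort INTEGER, protocol INTEGER, flowstart INTEGER)"
    )
    db.executemany("INSERT INTO flows VALUES (?, ?, ?, ?, ?, ?)", rows)
    # Inner query: earliest start per conversation, rounded down to the
    # minute by subtracting the remainder. Outer query: count per bucket.
    query = """
        SELECT bucket, COUNT(*) FROM (
            SELECT MIN(flowstart) - MIN(flowstart) % 60000 AS bucket
            FROM flows
            GROUP BY srcIP, dstIP, srcPort, dstPort, protocol
        ) AS per_flow
        GROUP BY bucket ORDER BY bucket
    """
    result = db.execute(query).fetchall()
    db.close()
    return result

rows = [
    ("10.0.0.1", "10.0.0.2", 1000, 80, 6, 5_000),
    ("10.0.0.1", "10.0.0.2", 1000, 80, 6, 65_000),   # later export, same flow
    ("10.0.0.3", "10.0.0.2", 1001, 53, 17, 70_000),
]
buckets = new_flows_per_minute(rows)
```

Each returned pair is the start of a 1-minute bucket and the number of conversations first seen in it. As with any grouping on the tuple alone, a tuple reused later in the queried period is counted only once.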