Thread: [ipt-netflow] FIRST_SWITCHED being reset after export of active flow

NetFlow iptables module for Linux kernel

Brought to you by: aabc

ipt-netflow-users

[ipt-netflow] FIRST_SWITCHED being reset after export of active flow

From: Phillip R. <ph...@ya...> - 2015-08-14 21:10:55

Before I get into my question, what I'm ultimately trying to do is find the best way to count new flows started per minute. The question below is regarding a snag I hit when trying to solve this using what I thought was the most direct approach. But I'm open to other suggestions.
General NetFlow docs all describe how each flow can be uniquely identified by the tuple of IP source/dest address + IP source/dest port + etc. They also advise that the default 30-minute timeout for exporting data about active flows may be sub-optimal, since it means a giant burst of traffic will be exported at the conclusion of a long-lived flow, with the collector oblivious to the flow's existence up until that point. Therefore, I've lowered the active timeout in my environment to 1 minute.
With these 1-minute updates now coming in for long-lived flows, I was hoping to have a way in my collector to group the records that make up the same long-lived flow, so that I would not count each record received at the start of a new minute as a "new flow". I noticed the "FIRST_SWITCHED" field, and it showed great potential: I had hoped it would be set to the single timestamp when the flow was first observed in the router, and that the timestamp would stay the same in subsequent updates in later minutes.
Unfortunately, with ipt-netflow, I'm not finding this to be the case: for my long-lived flow, the FIRST_SWITCHED timestamp is updated to the beginning of each new 1-minute interval. Is this really the desired behavior? I unfortunately don't have a Cisco router at my disposal to see if it does the same thing, but I'd be interested to hear if anyone else can confirm whether this behavior is universal.
--Phil

Re: [ipt-netflow] FIRST_SWITCHED being reset after export of active flow

From: ABC <ab...@te...> - 2015-08-14 22:40:35

Hello Phillip,

That's an interesting interpretation of FIRST_SWITCHED. The ipt-netflow
behavior you described is intentional, and here are my arguments:

  1. Currently, an exported flow is treated the same as an Expired Flow
  (historically, it was expiring from the CEF cache), so measurement is
  restarted for new packets. If, instead of reporting the timestamp of the
  first packet since measurement was (re)started, the probe reported some
  earlier time, then many statistical properties of the flow data would be
  lost or become hard(er) to calculate.

  For example, it would not be possible (or would become much harder) to
  correctly measure the data rate of the flow, because some arbitrary time
  would be added to its duration. (Yes, it would be possible to reconstruct
  the correct rate by collecting all intermediate active flow exports for this
  flow and subtracting them from the last measurement, but this could be
  thousands and thousands of flow records for many hours' or even days' worth
  of data, for example for some always-active persistent connection.)

  2. RFC 3954 states:

    3.2. Flow Expiration
      [...]
      3. For long-lasting Flows, the Exporter SHOULD export the Flow
         Records on a regular basis.  This timeout SHOULD be
         configurable at the Exporter.

  This implicitly confirms that Exported and Expired flows are synonymous for
  Cisco.

  3. RFC 5102, section 5.11.3 (flowEndReason, value 0x02 "active timeout"),
  states:

    The Flow was terminated for reporting purposes while it was
    still active, for example, after the maximum lifetime of
    unreported Flows was reached.

  While this is about IPFIX, there is draft-claise-ipfix-eval-netflow-04,
  which lists the important differences between IPFIX and NetFlow v9 and does
  not say there is any difference in expiring/terminating flows between v9 and
  IPFIX.

  If a flow is (semantically) "terminated", then what comes next is a "new"
  flow, and not some continuation of the older flow.

-abc
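The data-rate argument in point 1 can be illustrated with a minimal sketch. The records, field names and numbers below are hypothetical and are not ipt-netflow output; the sketch assumes each export carries only the octets counted since the previous export, with times in seconds.

```python
# Hypothetical exports of one long-lived flow with a 1-minute active timeout.
# Each record carries only the octets seen since the previous export.
records = [
    {"first": 0, "last": 60, "octets": 600_000},
    {"first": 60, "last": 120, "octets": 6_000_000},
    {"first": 120, "last": 180, "octets": 600_000},
]


def rate_bps(first, last, octets):
    """Average bit rate of one record over its [first, last] interval."""
    return octets * 8 / (last - first)


# Timestamps reset on export: every record is self-contained, so the burst
# in the second minute is visible directly.
per_interval = [rate_bps(r["first"], r["last"], r["octets"]) for r in records]

# Timestamp pinned to the original flow start: each record's octets are
# spread over the whole lifetime so far, which hides the burst.
flow_start = records[0]["first"]
pinned = [rate_bps(flow_start, r["last"], r["octets"]) for r in records]

print(per_interval)
print(pinned)
```

With a pinned start time, the collector would need every earlier record of the flow to recover the per-interval rates, which is the cost described in the message above.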

Re: [ipt-netflow] FIRST_SWITCHED being reset after export of active flow

From: Michael K. <mic...@pl...> - 2015-08-14 23:35:00

I can confirm that Cisco also resets the timestamps at every active timeout, in all cases.
The flow tuple should be used to stitch long-lived flows together over longer periods of time.
As a general rule, the flow cache "forgets" everything it observed about a flow once the flow is exported. Times and flags are reset, among other things that might be considered "stateful".
If you query across a large data set, group on the 7-tuple and use the minimum start time as the start and the maximum end time as the end.
This will always give you the flow duration within the period you queried.

-Mike Krygeris
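The stitching described above could be sketched on the collector side as follows. This is an illustration under assumptions: the record layout and field names are hypothetical, and the key is the classic NetFlow 7-tuple (addresses, ports, protocol, ToS, input interface).

```python
# Hypothetical field names for the classic NetFlow 7-tuple flow key.
KEY_FIELDS = ("src_ip", "dst_ip", "src_port", "dst_port", "proto", "tos", "in_if")


def stitch(records):
    """Map each 7-tuple to (earliest start, latest end) over all its records."""
    flows = {}
    for r in records:
        key = tuple(r[f] for f in KEY_FIELDS)
        if key in flows:
            start, end = flows[key]
            flows[key] = (min(start, r["first"]), max(end, r["last"]))
        else:
            flows[key] = (r["first"], r["last"])
    return flows


# Two exports of the same long-lived flow (times in milliseconds).
base = {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2", "src_port": 40000,
        "dst_port": 80, "proto": 6, "tos": 0, "in_if": 1}
sample = [
    dict(base, first=1_000, last=60_000),
    dict(base, first=60_500, last=120_000),
]
print(stitch(sample))
```

Note that the result only covers the records passed in, so a tuple that is reused by a later, unrelated connection within the same query window would be merged into one flow.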


Re: [ipt-netflow] FIRST_SWITCHED being reset after export of active flow

From: ABC <ab...@te...> - 2015-08-15 10:45:43

Phillip,

On Fri, Aug 14, 2015 at 09:07:48PM +0000, Phillip Rzewski wrote:
> Before I get into my question, what I'm ultimately trying to do is
> find the best way to count new flows started per minute. [...] But I'm
> open to other suggestions.

If you are interested only in TCP flows, you can check the TCP_FLAGS(6)
Element for the presence of the SYN flag. As you probably know, the first
packet of a TCP stream is marked with the SYN bit, so you only need to change
your approach to count only flows that are SYN-marked.

-abc
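A minimal sketch of the SYN-based count suggested above. The record field names (`proto`, `tcp_flags`, `first_epoch`) are hypothetical, and `first_epoch` is assumed to be the record's start time already converted to seconds since the Unix epoch. Flow records are unidirectional, so each direction of a connection is counted as its own flow.

```python
from collections import Counter

TCP_PROTO = 6  # IP protocol number for TCP
SYN = 0x02     # SYN bit in the TCP flags byte


def new_tcp_flows_per_minute(records):
    """Count TCP records whose flags include SYN, keyed by minute since epoch."""
    counts = Counter()
    for r in records:
        if r["proto"] == TCP_PROTO and r["tcp_flags"] & SYN:
            counts[r["first_epoch"] // 60] += 1
    return counts


sample = [
    {"proto": 6, "tcp_flags": 0x1A, "first_epoch": 120},  # SYN|ACK|PSH: new flow
    {"proto": 6, "tcp_flags": 0x18, "first_epoch": 180},  # ACK|PSH: continuation
    {"proto": 17, "tcp_flags": 0x00, "first_epoch": 125},  # UDP: ignored
    {"proto": 6, "tcp_flags": 0x02, "first_epoch": 130},  # SYN only: new flow
]
print(new_tcp_flows_per_minute(sample))
```

Because the flags are reset after each export, only the first record of a long-lived TCP flow should normally carry SYN, which is why later 1-minute updates drop out of the count.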

Re: [ipt-netflow] FIRST_SWITCHED being reset after export of active flow

From: Michael K. <mic...@pl...> - 2015-08-15 14:28:39

If you are looking at UDP as well, it becomes a bit harder because there is nothing that can guarantee the flow is new.
If you query the data from a database, group by IPs, ports and protocol, and select the minimum start timestamp for each group, you should get a relatively accurate count of flows.
You can use a modulus to normalize the timestamps into 1-minute buckets as a second step.
Pseudo-SQL:

    SELECT srcIP, dstIP, srcPort, dstPort, protocol, MIN(flowstart) AS flowstart
    FROM database.flowsTable
    GROUP BY srcIP, dstIP, srcPort, dstPort, protocol;

This will give you the unique TCP, UDP (and, sort of, ICMP) conversations and their start times.
Applying a modulo to the flow start timestamp lets you round it down to 1-minute resolution.
Once you have done that, you can count the flows in each bucket.
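The two steps above (minimum start time per conversation, then 1-minute buckets) can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: the table name `flows`, the sample rows, and `flowstart` being stored in milliseconds are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flows (srcIP TEXT, dstIP TEXT, srcPort INTEGER,"
    " dstPort INTEGER, protocol INTEGER, flowstart INTEGER)"
)
rows = [
    # Three 1-minute exports of the same long-lived TCP conversation.
    ("10.0.0.1", "10.0.0.2", 40000, 80, 6, 5_000),
    ("10.0.0.1", "10.0.0.2", 40000, 80, 6, 65_000),
    ("10.0.0.1", "10.0.0.2", 40000, 80, 6, 125_000),
    # A UDP conversation starting in the second minute.
    ("10.0.0.3", "10.0.0.2", 40001, 53, 17, 70_000),
    # Another TCP conversation starting in the first minute.
    ("10.0.0.4", "10.0.0.2", 40002, 443, 6, 10_000),
]
conn.executemany("INSERT INTO flows VALUES (?, ?, ?, ?, ?, ?)", rows)

# Inner query: earliest start per conversation, rounded down to the minute.
# Outer query: number of conversations that start in each minute.
query = """
SELECT minute, COUNT(*) AS new_flows
FROM (
    SELECT MIN(flowstart) - MIN(flowstart) % 60000 AS minute
    FROM flows
    GROUP BY srcIP, dstIP, srcPort, dstPort, protocol
)
GROUP BY minute
ORDER BY minute
"""
result = conn.execute(query).fetchall()
print(result)
```

The three exports of the first conversation collapse into a single start in minute 0, so the later updates are not counted as new flows.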
You may have to do some work to figure out the "real" minutes of the flow, because the timestamps are generally milliseconds since the system started, not absolute times.

-Mike Krygeris 
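One possible way to recover absolute times, assuming NetFlow v9 export: the v9 packet header carries both the exporter's uptime in milliseconds and the current Unix time in seconds, so the boot time can be derived from the header of the packet that carried the record. The function and parameter names below are hypothetical, and the sketch ignores wraparound of the 32-bit uptime counter.

```python
def absolute_ms(unix_secs, sys_uptime_ms, switched_ms):
    """Convert an uptime-relative FIRST/LAST_SWITCHED value to epoch milliseconds.

    unix_secs and sys_uptime_ms come from the header of the export packet
    that carried the record; switched_ms is the record's own timestamp.
    """
    boot_ms = unix_secs * 1000 - sys_uptime_ms
    return boot_ms + switched_ms


def minute_bucket(epoch_ms):
    """Round an epoch-millisecond timestamp down to the start of its minute."""
    return epoch_ms - epoch_ms % 60_000


# Exporter up for one hour at export time; the flow started 55 s before export.
start = absolute_ms(unix_secs=1_439_586_600, sys_uptime_ms=3_600_000,
                    switched_ms=3_545_000)
print(start, minute_bucket(start))
```

The resolution of the result is limited by the header's one-second Unix timestamp, which is more than enough for 1-minute buckets.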
