Nfsight is a client/server discovery tool that runs on top of Netflow. Its goal is to help administrators gain visibility into their network by automatically listing every new server detected passively within a perimeter of interest (typically the organization network where flows are collected). The program has two components: a back-end algorithm written in Perl that analyzes Netflow data, and a front-end web interface written in PHP that helps administrators browse server activity. The back-end is a plugin for the NfSen framework (http://nfsen.sf.net).
Netflow is made of unidirectional flows. The goal of the backend algorithm is to merge pairs of unidirectional flows into bidirectional flows. The main challenge is to orient each generated bidirectional flow correctly, that is, to decide which end point is the client and which is the server. For this, the backend algorithm uses a set of heuristics combined with Bayesian inference. As a result, the algorithm learns over time and the accuracy of server detection increases.
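The merging step can be illustrated with a minimal Python sketch. This is not the Nfsight back-end (which is written in Perl); the `Flow` record layout and the `merge_flows` helper are hypothetical names chosen for illustration. The sketch only pairs a flow with its reverse and makes no attempt to orient the result, which is the part Nfsight handles with heuristics and Bayesian inference.

```python
from collections import namedtuple

# Hypothetical record layout; real Netflow records carry more fields.
Flow = namedtuple("Flow", "src_ip src_port dst_ip dst_port proto packets")


def merge_flows(flows):
    """Pair each unidirectional flow with its reverse, if one exists.

    Returns (merged, unmatched): merged is a list of (flow, reverse_flow)
    pairs in arrival order, unmatched holds flows with no reverse.
    """
    pending = {}  # key of a not-yet-matched flow -> the flow itself
    merged = []
    for f in flows:
        # The reverse flow has source and destination swapped.
        reverse_key = (f.dst_ip, f.dst_port, f.src_ip, f.src_port, f.proto)
        if reverse_key in pending:
            merged.append((pending.pop(reverse_key), f))
        else:
            key = (f.src_ip, f.src_port, f.dst_ip, f.dst_port, f.proto)
            # Simplification: a repeated key overwrites the earlier flow.
            pending[key] = f
    return merged, list(pending.values())
```

For example, a flow from 10.0.0.5:51000 to 10.0.0.9:80 and a flow from 10.0.0.9:80 to 10.0.0.5:51000 would come back as one merged pair, while a flow with no reply stays in the unmatched list.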
Every unidirectional or bidirectional flow has two end points: a source and a destination.
Note: For a source end point, the destination port is recorded rather than the source port, which is assumed to be randomly selected. End points are labeled according to the following rules:
Note: Validity is assessed only for TCP bidirectional flows. It consists of checking that the TCP flags and the number of packets are coherent.
Note: End points are also classified according to their location: internal or external (defined by the perimeter of the organization network). To save on storage space, not all end points are reported to the front-end. Currently, only internal servers and internal/external scanners are stored.
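The location rule and the storage filter described in the note above can be sketched in Python as follows. This is only an illustration under stated assumptions: `INTERNAL_NETWORKS`, `location`, and `is_stored` are hypothetical names, the perimeter prefix is a documentation address range, and the labels `server`, `client`, and `scanner` are passed in rather than computed.

```python
import ipaddress

# Hypothetical perimeter; replace with the organization's own prefixes.
INTERNAL_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def location(ip):
    """Return 'internal' if ip falls inside the perimeter, else 'external'."""
    addr = ipaddress.ip_address(ip)
    inside = any(addr in net for net in INTERNAL_NETWORKS)
    return "internal" if inside else "external"


def is_stored(ip, label):
    """Apply the storage rule: keep internal servers and all scanners."""
    if label == "scanner":
        return True  # internal and external scanners are both stored
    return label == "server" and location(ip) == "internal"
```

With this perimeter, `is_stored("192.0.2.10", "server")` is true, while an external server or any client would be dropped.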
The backend algorithm works on 5-minute flow files. End points detected in each 5-minute batch are stored in a database and aggregated. There are three levels of data granularity: 5 minutes, 1 hour, and 1 day.
When running a query on the server discovery data, a visualization table is displayed. Each line represents an end point, and the table is divided into three sections:
Source (client/scanner) and destination (server) end points are distinguished by the background color of the port number: gray indicates a server, while red indicates a client or scanner.
Time series of end point activity are made of colored table cells. Depending on the scale, each cell represents 5 minutes, 1 hour, or 1 day. The saturation of each cell's color indicates the number of flows. The cells are colored according to the following rules:
The proportion of red to blue or red to green in each cell reflects the ratio of invalid to valid bidirectional flows collected for the represented end point and time slot.
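The aggregation and the invalid/valid proportion behind each cell can be sketched in Python as follows. This is a minimal illustration, not the Nfsight implementation: the `SCALES` table, the `aggregate` and `invalid_share` helpers, and the input layout (one tuple of timestamp, valid count, and invalid count per 5-minute batch) are assumptions. The mapping from numbers to actual colors is left out.

```python
from collections import defaultdict

# Seconds covered by one table cell at each scale.
SCALES = {"5min": 300, "1hour": 3600, "1day": 86400}


def aggregate(samples, scale):
    """Sum per-batch counts into cells of the requested scale.

    samples: iterable of (unix_timestamp, valid_count, invalid_count),
    one entry per 5-minute batch for a single end point.
    Returns {cell_start_timestamp: [valid, invalid]}.
    """
    width = SCALES[scale]
    cells = defaultdict(lambda: [0, 0])
    for ts, valid, invalid in samples:
        start = ts - ts % width  # align to the start of the cell
        cells[start][0] += valid
        cells[start][1] += invalid
    return dict(cells)


def invalid_share(valid, invalid):
    """Fraction of the cell that would be drawn in red (0.0 if empty)."""
    total = valid + invalid
    return invalid / total if total else 0.0
```

The total `valid + invalid` of a cell would drive the color saturation, and `invalid_share` the split between red and the other color.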
Hovering over a table cell pops up a summary of activity during that time window. Clicking the cell brings up a detailed dump of the Netflow records that make up the measurement for that period.
Users can add comments on three different objects:
A knowledge base of network services is currently under development and can be accessed from the Settings page.