Re: [wallfire-users] Needing help with $count filter option


On Thu, Sep 15, 2005 at 02:36:45PM -0600, James Lay wrote:

 Hi,

> If it were me, I would have it position dependent,i.e. :

> wflogs -f '$start_time >= [00:00:00] && $count > 6'

> filters before count, and

> wflogs -f '$count > 6 && $start_time >= [00:00:00]'

> will count before filter.

Wow, seems like there's been some misunderstanding somewhere.

Well... the filtering expression is a mathematical expression, so it
has to obey fundamental logic (and its operators, like "and",
"or", etc.), which among other things implies that a && b is equivalent
to b && a, for example.
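
To make the point concrete, here is a minimal Python sketch (not wflogs
code; the entries and the field names start_hour and count are made up
for the illustration). It applies the same two conditions in both orders
and gets the same result:

```python
# Hypothetical log entries; the field names are invented for this example.
entries = [{"start_hour": h, "count": c} for h in (0, 12) for c in (1, 7)]

def late(entry):
    """First condition: the entry starts at or after 06:00."""
    return entry["start_hour"] >= 6

def busy(entry):
    """Second condition: the entry's packet count is above 6."""
    return entry["count"] > 6

# Same two conditions, written in both orders.
left = [e for e in entries if late(e) and busy(e)]
right = [e for e in entries if busy(e) and late(e)]

assert left == right
print(left)  # [{'start_hour': 12, 'count': 7}]
```

Since the operand order cannot change which entries match, it cannot be
used to say at which stage the filter runs.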

Besides, here is the current wflogs process:
- log lines are parsed (for netfilter log lines, there is only one packet
  per line, so count == 1 for each log entry/packet)
- log entries are filtered (-f)
- log entries are processed/changed (summarized, obfuscated, sorted,
  etc.). So if entries are summarized, after that process, similar entries
  are grouped into one entry, with a count reflecting the number of
  original entries (packets) that formed the group
- log entries are "printed"

So there is not really a counting process by itself. Rather, there is a
process that may update counts accordingly, among other things.
Take ipfilter logs, for example. Ipfilter log lines can have
a count > 1, which means that ipfilter can already aggregate similar
packets in one log line.
What you call counting in wflogs is in fact aggregating similar entries
(and incrementing the count accordingly, of course).
Filtering is definitely a clearly separated operation, which can take
place before or after this aggregation (both make sense, and are
different).
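
The difference between the two filter positions can be sketched in a few
lines of Python. This is only an illustration under assumptions, not the
actual wflogs implementation: the entry fields (src, dport, count), the
grouping key and the function names are all hypothetical.

```python
from collections import Counter

# Hypothetical parsed netfilter entries: one packet per line, so count == 1.
entries = [
    {"src": "10.0.0.1", "dport": 22, "count": 1},
    {"src": "10.0.0.1", "dport": 22, "count": 1},
    {"src": "10.0.0.2", "dport": 80, "count": 1},
]

def filter_entries(entries, predicate):
    """Keep only the entries for which predicate(entry) is true."""
    return [e for e in entries if predicate(e)]

def summarize(entries):
    """Group entries sharing (src, dport) and add up their counts.

    Summing (rather than counting lines) also copes with inputs whose
    lines already carry a count > 1, as ipfilter lines can.
    """
    totals = Counter()
    for e in entries:
        totals[(e["src"], e["dport"])] += e["count"]
    return [{"src": s, "dport": p, "count": c} for (s, p), c in totals.items()]

def more_than_one(entry):
    """Rough stand-in for the filter expression '$count > 1'."""
    return entry["count"] > 1

# Filter just after parsing, then summarize: every count is still 1.
filtered_first = summarize(filter_entries(entries, more_than_one))

# Summarize first, then filter just before output.
summarized_first = filter_entries(summarize(entries), more_than_one)

print(filtered_first)    # []
print(summarized_first)  # [{'src': '10.0.0.1', 'dport': 22, 'count': 2}]
```

With the filter applied right after parsing, nothing survives; applied
after the summary, the grouped entry with a count of 2 is kept.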

Not only am I unable to think of any algorithm that would make it
possible to mix the two operations (aggregating and filtering), but even
if it were technically possible, I would consider it very confusing to
introduce position-dependent concepts in things that are
position-independent (a && b <=> b && a) by essence.

The names you propose (--count-before-filter and
--filter-before-count) reflect your (biased) view, which seems to be
very "count-centric", whereas the packet counter is only one of
the many parameters one would want to base one's filter on.

> Just some thoughts.

I was proposing --filter-after-parsing and --filter-before-output
but they don't really reflect that filtering occurs _just after_ parsing
(before any additional processing) and _just before_ output (after
processing, if there actually is any).
That can be documented in the man page, though.

Or maybe --filter-before-processing and --filter-after-processing
would be more explicit... Or --filter-before-mangling and
--filter-after-mangling, I don't know.

Anyway, I'll release a version with these two options soon, like I
said. I guess you'll be interested in the second one. Thanks for having
inspired this future improvement.

Cheers,

 Hervé

> On Thu, 15 Sep 2005 20:38:15 +0200
> Herve Eychenne <rv...@wa...> wrote:

> > On Thu, Sep 15, 2005 at 02:03:26AM -0600, James Lay wrote:
> >
> >  Hi!
> >
> > > Here's what I'm trying to do:
> >
> > > wflogs -i netfilter -f '$start_time >= [00:00:00] && $count > 1' -o
> > > html --sort=dport,-count --resolve=0 --whois=0  /var/log/kernel >
> > > test.html
> >
> > > The above yields nothing at all :(  If I remove the $count > 1 then
> > > I get all sorts of info...including a lot of things that have counts
> > > above one.  Am I missing something?  Help!
> >
> > Oh, yes.
> > Filtering currently takes place before summary.
> > So, as netfilter logs lines concern only one packet at a time, $count
> > is always equal to 1 (for netfilter).
> >
> > I guess filtering _after_ summary would make sense too...
> > so we should probably enable both.
> >
> > Now, the question is: how would we name the long options so that it
> > is clear that
> > - the first filter is done before summary (or any other operation such
> >   as sort, obfuscation, etc...). In fact, it is done just after
> >   parsing, so maybe a name like --filter-after-parsing would be good
> > - the second filter is done after summary (and all), so a name like
> >   --filter-before-output would be good.
> >
> > Now, we must keep backward compatibility, by keeping the old -f
> > letter.
> >
> > So once --filter-before-output is implemented (which I intend to do in
> > the next few days as it is only a few lines of code), I'll have to
> > choose if it would be better to assign -f to --filter-after-parsing
> > (which it was until now) or to --filter-before-output (-F being created
> > and assigned to the other).
> > Assigning it to --filter-after-parsing would ensure strict
> > compatibility, but I don't know if it would be that clever, as that
> > would not make so much difference for most users to assign -f to
> > --filter-before-output, and it's probably what people would
> > intuitively expect (as most of them want to use the summary option).
> >
> > What do you think?
> >
> >  Herve
> >
> > --
> >  _
> > (°=  Hervé Eychenne
> > //)
> > v_/_ WallFire project:  http://www.wallfire.org/

 Herve

--
 _
(°=  Hervé Eychenne
//)
v_/_ WallFire project:  http://www.wallfire.org/