bogofilter makes bad headers with -p -e for -vv+
Using bogofilter 1.0.1 in passthrough mode,
"bogofilter -p -e -l" works fine; it also works fine
with -v.
However, if the same command is run with -vv, -vvv,
or -vvvv, the header is created incorrectly and some of
the verbose output that should be in the header ends up
in the body of the message.
It looks like a blank line is being inserted part way
through the verbose output, but I haven't tracked it
down more than that.
Logged In: YES
user_id=30510
Hi Wesley,
I'm unable to reproduce this problem. Using one of the test
messages in src/tests/good.mbx, I ran bogofilter with
different numbers of verbose flags:
for V in v vv vvv vvvv ; do
bogofilter -C -p -e -l -$V -I good.d/msg.n.04.txt > msg.n.04.$V
ls -l msg.n.04.$V
done
-rw-r--r-- 1 relson relson 2645 Jan 30 07:37 msg.n.04.v
-rw-r--r-- 1 relson relson 13496 Jan 30 07:37 msg.n.04.vv
-rw-r--r-- 1 relson relson 13496 Jan 30 07:37 msg.n.04.vvv
-rw-r--r-- 1 relson relson 13496 Jan 30 07:37 msg.n.04.vvvv
As you can see, the same file was generated with 2, 3, or 4
v's. If you gzip and email your problem message to me (and
your bogofilter.cf file), I'll take a look at the problem.
Regards,
David
Logged In: YES
user_id=2788
I cannot reproduce this either.
Can you show your bogofilter configuration ("bogofilter
-QQ" prints it) and give details of exactly how you call
bogofilter, such as a .procmailrc snippet, .mailfilter
snippet, or .forward excerpt (you may mask addresses)?
What MTA are you using?
Any shell or Perl scripts involved in delivery?
Logged In: YES
user_id=2788
moving to support requests where it apparently belongs
Logged In: YES
user_id=38984
Guys, I'm not sure you understood my report. But maybe
it's my fault for not being clear about the exact cause.
=) Here is more information.
Try this just at a shell against a working bogofilter
database:
# bogofilter -p -e -vv
Header: this a header
body body body
^D
Now, does the X-Bogosity header output look correct? Is it
correctly whitespace folded? It's not on mine if I use
-vv, -vvv, or -vvvv, so that's why I reported this bug. =)
Anyway, here is a copy-paste of a shell session that shows
this problem. Lines with "***" are my comments. I suppose
this may wrap funny in the little sourceforge text box,
but you should be able to see what I'm talking about:
# bogofilter -p -e -v
Header: this is a header
body goes here
Header: this is a header
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000,
version=1.0.1
int cnt prob spamicity histogram
0.00 2 0.000040 0.000040 ##
0.10 0 0.000000 0.000040
0.20 0 0.000000 0.000040
0.30 0 0.000000 0.000040
0.40 0 0.000000 0.000040
0.50 0 0.000000 0.000040
0.60 0 0.000000 0.000040
0.70 0 0.000000 0.000040
0.80 0 0.000000 0.000040
0.90 0 0.000000 0.000040
body goes here
**** Okay, this one (above) is correct.
# bogofilter -p -e -vv
Header: this is a header
body goes here
Header: this is a header
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000,
version=1.0.1
n pgood pbad fw U
"head:this" 247 0.030584 0.000000 0.000037 +
"goes" 222 0.027489 0.000000 0.000042 +
"head:header" 433 0.051634 0.014981 0.224903 -
"here" 1223 0.136578 0.112360 0.451358 -
"head:Header" 0 0.000000 0.000000 0.520000 -
"body" 90 0.008668 0.018727 0.683563 -
N_P_Q_S_s_x_md 2 1.000000 0.000000 0.000000
0.017800 0.520000 0.375000
body goes here
*** This one here is invalid; it generates a header that
violates RFC 2822, because the lines that follow the
second line of the X-Bogosity header are not correctly
whitespace folded. They have non-whitespace at the
beginning of the line, so are parsed as separate header
lines instead of a continuation of the X-Bogosity header.
# bogofilter -p -e -vvv
Header: wow, a header
body body body
Header: wow, a header
X-Bogosity: Unsure, tests=bogofilter, spamicity=0.520000,
version=1.0.1
n pgood pbad fw U
"head:header" 433 0.051634 0.014981 0.224903 -
"head:Header" 0 0.000000 0.000000 0.520000 -
"head:wow" 0 0.000000 0.000000 0.520000 -
"body" 90 0.008668 0.018727 0.683563 -
N_P_Q_S_s_x_md 0 0.000000 0.000000 0.520000
0.017800 0.520000 0.375000
body body body
*** Here this shows it again, but just with a higher
verbosity level.
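The folding rule that the -vv and -vvv output breaks can be
checked mechanically. Below is a minimal sketch in C, not
taken from bogofilter: header_field_is_folded is a
hypothetical helper that takes the text of one header field
(lines separated by '\n') and reports whether every line
after the first starts with a space or tab.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of bogofilter).
 * Returns true if every line after the first in `field` begins with
 * a space or tab, i.e. is a valid folded continuation line.
 * A trailing newline at the very end of the field is allowed. */
bool header_field_is_folded(const char *field)
{
    const char *p = field;

    while ((p = strchr(p, '\n')) != NULL) {
        p++;                        /* step past the newline */
        if (*p == '\0')
            break;                  /* trailing newline ends the field */
        if (*p != ' ' && *p != '\t')
            return false;           /* read as a new header line or as the body */
    }
    return true;
}
```

Fed the X-Bogosity field from the -v session (with its
leading whitespace intact) this should return true; fed the
-vv field, where the token rows start with a double quote,
it should return false.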
Logged In: YES
user_id=38984
Poking around in the source, I see in rstats.c that for
the first kind of stats (-v) all lines are printed with
stats_prefix in rstats_print_histogram, e.g.:
(void)fprintf(fpo, "%s%3.2f %4lu %f %f ", stats_prefix,
beg, (unsigned long)cnt, prob, h->spamicity );
However, for rstats_print_rtable, stats_prefix is NOT
used, e.g.:
(void)fprintf(fpo, "\"%*s %6lu %8.6f %8.6f %8.6f",
len, " ", (unsigned long)
(cur->good + cur->bad),
(double)cur->good /
cur->msgs_good,
(double)cur->bad /
cur->msgs_bad,
fw);
stats_prefix would do the trick here, as defined in
bogoconfig.c:
stats_prefix= stats_in_header ? " " : "# ";
So since these are stats in a header, it would add the
necessary whitespace to make a valid header continuation.
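To illustrate the idea (this is only a sketch, not the
actual rstats.c code or the eventual patch), here is a
self-contained C example that puts the prefix in front of
a token row. choose_stats_prefix and format_token_row are
hypothetical names, and the row layout is simplified from
the fprintf call quoted above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical: mirrors the stats_prefix assignment quoted from
 * bogoconfig.c. Whitespace when the stats go into the header,
 * "# " otherwise. */
const char *choose_stats_prefix(bool stats_in_header)
{
    return stats_in_header ? " " : "# ";
}

/* Hypothetical, simplified row formatter. Writing the prefix first
 * means a row placed in the header starts with whitespace and so is
 * read as a continuation of X-Bogosity.
 * Returns the value of snprintf (length that was or would be written). */
int format_token_row(char *buf, size_t size, const char *stats_prefix,
                     const char *token, unsigned long n,
                     double pgood, double pbad, double fw)
{
    return snprintf(buf, size, "%s\"%s\" %6lu %8.6f %8.6f %8.6f\n",
                    stats_prefix, token, n, pgood, pbad, fw);
}
```

With stats_in_header true, every row produced this way
begins with a space; the same would need to apply to the
column heading and the N_P_Q_S_s_x_md summary lines.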
Logged In: YES
user_id=30510
Wesley,
Your further explanation is appreciated. You are, of
course, correct. Evidently I wasn't thinking clearly either
when that bit got coded or when I reproduced your problem
(and didn't recognize the flaw).
The fix has been committed to CVS and a patch is attached
for your testing.
Thanks for reporting the problem!
David
Logged In: YES
user_id=38984
Thanks, David!
Sorry my initial message was a little vague--that's what I
get for reporting bugs when it's past my bedtime. ;)
(I actually don't see a patch attached to this bug, but I
will try pulling it from CVS and let you know if I see any
more problems.)