Hi. I'm using bogofilter daily on something like 2K
messages/day. It works perfectly, but when the database
reaches 51200000 bytes it just stops working and, from
what I can see, the indices get corrupted (I get lots of
messages about them when I run db(3|4)_verify on that
file).
I'm running FreeBSD 5.
Bogofilter is run from procmail with these rules:
:0fw
| bogofilter -u -e -p
:0e
{ EXITCODE=75 HOST }
and the error messages I get on Postfix mail queue are:
[godoy@wintermute ~]$ mailq
-Queue ID- --Size-- ----Arrival Time---- -Sender/Recipient-------
19C84413C     12228 Mon Mar 24 12:36:33  retorno@fiscosoft.com.br
(temporary failure. Command output: bogofilter: (db)
db_getvalue( '_ltimos' ), err: -30988,
DB_PAGE_NOTFOUND: Requested page not found procmail:
Program failure (2) of "bogofilter" procmail: Rescue of
unfiltered data succeeded )
godoy@godoy.homeip.net
-- 12 Kbytes in 1 Request.
[godoy@wintermute ~]$
If I remove the bogofilter instructions from
.procmailrc, the message goes thru nicely. I've seen
this behaviour in version 0.10.something and now on
0.11.3 (both installed from ports, i.e., compiled on my
machine).
[godoy@wintermute ~]$ bogofilter -V
bogofilter version 0.11.1.3
Copyright (C) 2002 Eric S. Raymond
(...)
I get my mail with fetchmail, which sends it to Postfix
that uses procmail as my MDA. As above, procmail calls
bogofilter on each and every message. I update the spam
database regularly with 'bogofilter -s', using Gnus'
spam.el.
If I can help find out what is happening, just drop me a
message.
It seems that when the database reaches a certain size,
some data gets dropped, but the indices and references
to it don't get updated. This is just a guess...
TIA.
Godoy.
Logged In: YES
user_id=30510
Godoy,
From what you're saying, the problem seems to be in db3/4.
Have you reported the problem to SleepyCat? What was the
response?
David
Logged In: YES
user_id=100502
No, I haven't tried contacting them. I think their website
is quite confusing... I'll look for some contact information
there and try submitting it as a bug report.
Are you closing this, or should I report back with their
response?
Thanks.
Logged In: YES
user_id=30510
Godoy,
I'm leaving this open as there are other members of the
bogofilter team who are more knowledgeable about db3/4 than
I am. They may have more information, questions, or
suggestions for you.
Also I recommend that you subscribe to the bogofilter
mailing list and post your problems there. That may elicit
some information that will help you.
David
Logged In: YES
user_id=230355
I've had nearly the same problem (bogofilter was dying when
writing to database files which exceeded a certain size). It
can be solved by setting the mailbox size limit in the postfix
config to unlimited (the default is 51200000, about 50MB):
/etc/postfix/main.cf:
--8<--
mailbox_size_limit = 0
--8<--
It seems that Postfix sets the corresponding file size ulimit
to the value set there.
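The effect described above can be reproduced outside Postfix.
What follows is a minimal sketch, not taken from bogofilter or
Postfix: it assumes a Unix-like system and uses Python's
resource module to apply a file size limit (RLIMIT_FSIZE) to a
child process, which then tries to write one byte past it. The
4096-byte limit, the file name and the function name are
illustrative stand-ins for the 51200000-byte limit and the
wordlist files in this report.

```python
import os
import subprocess
import sys
import tempfile

# Script run in a child process so the limit does not affect the caller.
CHILD = r"""
import errno, resource, signal, sys
path, limit = sys.argv[1], int(sys.argv[2])
# Ignore SIGXFSZ so the oversized write fails with an error
# instead of terminating the process.
signal.signal(signal.SIGXFSZ, signal.SIG_IGN)
resource.setrlimit(resource.RLIMIT_FSIZE, (limit, limit))
with open(path, "wb", buffering=0) as f:
    f.write(b"x" * limit)          # grow the file exactly to the limit
    try:
        f.write(b"y")              # one byte past the limit
        print("OK")
    except OSError as exc:
        print(errno.errorcode[exc.errno])
"""


def write_past_limit(limit=4096):
    """Return (result of the oversized write, final file size in bytes)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "wordlist.db")  # hypothetical file name
        proc = subprocess.run(
            [sys.executable, "-c", CHILD, path, str(limit)],
            capture_output=True, text=True, check=True,
        )
        return proc.stdout.strip(), os.path.getsize(path)
```

On Linux and the BSDs, write_past_limit() should report an
EFBIG error and a file that stopped growing at the limit. A
database library that is cut off like this in the middle of
an update could plausibly leave pages half written, which
would fit the corruption reported here.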
Logged In: YES
user_id=100502
It seems to have worked, even though I still had one of these:
Mar 25 07:57:40 wintermute postfix/local[48441]: D4F05469E:
to=<godoy@godoy.homeip.net>, relay=local, delay=2,
status=deferred (temporary failure. Command output:
bogofilter: (db) db_getvalue( '____' ), err: -30988,
DB_PAGE_NOTFOUND: Requested page not found procmail: Program
failure (2) of "bogofilter" procmail: Rescue of unfiltered
data succeeded )
I believe that I need to reset my database... Or, even
better: do you have some tool to clean up unreferenced words?
I mean those words that link to a position where there's no
entry or an invalid entry... Removing them from the database
would be much better than starting over for the third
time... :-)
Logged In: YES
user_id=30510
Godoy,
bogoutil can be used to dump/load wordlists. See the man
page for more info. Unfortunately, I can't say how well
it'll do with a corrupt database. Its wordlist maintenance
functions can be used to delete tokens with low reference
counts.
The tools that come with db3/4 are likely to do the right
thing. Have you looked at db_dump, db_recover, etc?
David
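A dump-and-reload attempt with the Berkeley DB tools could be
scripted roughly as below. This is a sketch under assumptions,
not a tested recovery procedure: it relies on db_dump's -r
(salvage) option and on db_load reading the dump from standard
input, the function names and the prefix parameter are made up
for illustration (some installations name the tools db3_dump,
db4_dump and so on), and a salvage may recover only part of a
damaged file or nothing at all.

```python
import subprocess


def salvage_commands(damaged, rebuilt, prefix="db"):
    """Build the dump and load command lines for a salvage attempt."""
    dump = [f"{prefix}_dump", "-r", damaged]  # -r: salvage a possibly corrupt file
    load = [f"{prefix}_load", rebuilt]        # db_load reads the dump from stdin
    return dump, load


def salvage(damaged, rebuilt, prefix="db"):
    """Pipe the salvaged dump into a fresh database; True if both tools exit 0."""
    dump_cmd, load_cmd = salvage_commands(damaged, rebuilt, prefix)
    dump = subprocess.Popen(dump_cmd, stdout=subprocess.PIPE)
    load = subprocess.run(load_cmd, stdin=dump.stdout)
    dump.stdout.close()
    return dump.wait() == 0 and load.returncode == 0
```

Work on a copy of the wordlist, for example
salvage("goodlist.copy.db", "goodlist.new.db", prefix="db4"),
so the damaged original stays available for other attempts.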
Logged In: YES
user_id=100502
bogoutil aborts at undefined references. It can dump only
about 5000 entries from my goodlist.db. The spamlist has
more than 150K entries.
I've also tried using db_* before filing this bug, but the
same happens.
I've recreated the goodlist.db from some messages of mine
and it is, again, giving good results (it has nearly 50K
entries in it).
Thanks for your help. I consider my problem solved, so feel
free to close the bug. It was Postfix limiting the size of
files that could be created/used by the MDA and consequently
by bogofilter.
The database corruption was a side effect and not easy to solve.
Thanks.
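For anyone who keeps a non-zero limit, a small check like the
one below could warn before a wordlist file reaches it. It is
only a sketch: the function name is hypothetical, the default
is the 51200000-byte figure quoted in this report, and 0 is
treated as "unlimited" to mirror the mailbox_size_limit = 0
setting above.

```python
import os

POSTFIX_DEFAULT_LIMIT = 51200000  # bytes, the figure quoted in this report


def headroom(path, limit=POSTFIX_DEFAULT_LIMIT):
    """Bytes left before the file at `path` reaches `limit`.

    Returns None when limit is 0, which is treated as unlimited.
    A negative result means the file is already over the limit.
    """
    if limit == 0:
        return None
    return limit - os.path.getsize(path)
```

Running it from cron against goodlist.db and spamlist.db and
mailing a warning when the result drops below a few megabytes
would be one way to use it.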