#73 err:22, Invalid Argument showing up during bulk load

closed-works-for-me
nobody
None
5
2004-01-19
2004-01-18
No

I'm bulk processing 10,000 SPAM messages with

formail -s bogofilter -s -v < mbox

and seeing sporadic errors of the form:

bogofilter: (db) db_get_dbvalue( 'Any' ), err: 22, Invalid argument

where 'Any' is replaced sometimes with another word that
starts with "A". Is this some kind of artifact of concurrent
access? That is, if I'm also using bogofilter via procmail, are
those errors related to blocking?

Discussion

  • Paul R. Brown

    Paul R. Brown - 2004-01-18

    Logged In: YES
    user_id=532486

    More information: bogofilter 0.16.3 running on MacOS X 10.3.2,
    built with Berkeley DB 4.2. bogofilter runs with -u in the
    procmail script, so two registrations may have been attempted
    at the same time.

     
  • David Relson

    David Relson - 2004-01-18

    Logged In: YES
    user_id=30510

    Paul,

    If you're bulk registering messages, there's no need for
    formail. Bogofilter recognizes mailboxes during
    registration and you can simply use "bogofilter -s -v -M <
    mbox".

    Do you have separate wordlists (spamlist.db and goodlist.db)
    or a combined wordlist (wordlist.db)?

    David

     
  • Paul R. Brown

    Paul R. Brown - 2004-01-19

    Logged In: YES
    user_id=532486

    I have a combined wordlist. I ran db_verify, which reported
    some corruption. I switched to formail after getting
    inconsistent results with the -M flag (bogofilter didn't
    recognize message boundaries properly).

    I've started fresh with a new database but would like to
    understand concurrency constraints.

     
  • David Relson

    David Relson - 2004-01-19

    Logged In: YES
    user_id=30510

    Paul,

    You mention "bogofilter didn't recognize message
    boundaries". AFAIK that's been working properly for over a
    year. If you've got a mailbox that does that, please gzip
    it and email it to me (relson@users.sourceforge.net).

    As to concurrency, the bogofilter/BerkeleyDB combination is
    working well on many Unix systems (and others too). Running
    "make check" after building bogofilter will test the locking
    of your environment (specifically tests t.lock1 and t.lock2,
    which are the last two). Other people have it running fine
    on MacOS X, so there's something different in your environment.

    I'd suggest subscribing to bogofilter@aotto.com and posting
    your problem there. Likely one of the MacOS X users will
    have ideas that can help you.

    You mention procmail. Are you using lockfiles? I don't
    know if they're necessary, but they might help you.

    David

     
  • Paul R. Brown

    Paul R. Brown - 2004-01-19

    Logged In: YES
    user_id=532486

    I'll see if I can figure out which mbox was causing the problem; I
    have quite a few...

    As for locking, t.lock1 and t.lock2 both pass on make check. I
    think I'll just be careful and see if it recurs. Thanks for your help.

     
  • Paul R. Brown

    Paul R. Brown - 2004-01-19
    • status: open --> closed-works-for-me
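
The concurrency question in this thread (a bulk load running while procmail also invokes bogofilter -u) was never fully settled. One cautious approach is to serialize every writer behind a single lock. The following is a minimal sketch only, not something bogofilter or procmail provides: with_lock is a hypothetical helper, LOCK_TRIES is a hypothetical environment variable, and the lock path in the usage note is an assumption. It relies on mkdir being atomic, so only one process at a time can create the lock directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of bogofilter or procmail):
# run a command while holding a directory-based lock.
# Usage: with_lock LOCKDIR COMMAND [ARGS...]
with_lock() {
    lockdir=$1
    shift
    tries=0
    # LOCK_TRIES is an assumed knob: attempts before giving up (default 60).
    max_tries=${LOCK_TRIES:-60}
    # mkdir fails if the directory already exists, so it acts as the lock.
    until mkdir "$lockdir" 2>/dev/null; do
        tries=$((tries + 1))
        if [ "$tries" -ge "$max_tries" ]; then
            echo "with_lock: could not acquire $lockdir" >&2
            return 1
        fi
        sleep 1
    done
    # Run the command, remember its exit status, then release the lock.
    "$@"
    status=$?
    rmdir "$lockdir"
    return "$status"
}
```

For example, a registration could be run as with_lock /tmp/bogofilter-register.lock bogofilter -s -v < message. This only helps if every process that writes to the wordlist, including the one started from procmail, goes through the same wrapper and the same lock path. A lock directory left behind by an interrupted run has to be removed by hand.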
     
