#66 concurrent access to database


My sendmail interface to bogofilter can register spam/ham
at the user's request.

Multiple parallel instances of 'bogofilter -l -s' lock the
database (Berkeley DB) completely (for both reading and writing) until
all of the register requests are aborted, so any other milter
requests time out.

Checked with bogofilter versions 0.15.9 and 0.15.10, built on
Linux (not statically linked) from the source tree.
gcc version 3.2.2 20030222 (Red Hat Linux 3.2.2-5)
Berkeley DB 4.0.14, installed from rpm.

Should I prevent parallel access to
bogofilter with -s/-n?


  • David Relson

    David Relson - 2003-12-10

    Logged In: YES

    The "-s" flag is only used to register spam in the database.
    Normal usage of bogofilter is to read the database to get
    word scores. Reading doesn't lock the database. Why do
    you have multiple instances of bogofilter registering spam?

  • Andrew Mironov

    Andrew Mironov - 2003-12-11

    Logged In: YES

    Okay, in a little more detail.
    I've written a small milter (mail filter via the sendmail API),
    actually as a frontend to bogofilter. (BTW, the revised milter
    source code was sent to m-a (Matthias Andree).)

    My trusted users can send messages misclassified as ham to a
    special e-mail address (generally at unpredictable times) to
    register them as spam with bogofilter -s.

    For example, a user has to register 3 messages. The mail client
    ('The Bat!') opens 3 simultaneous connections to the sendmail MTA.
    The filter accepts the connections and starts 3 instances of
    bogofilter, one after another within a short time period.

    The 1st task exited normally; the other -s tasks never exited, and
    the milter timed out. Other requests to bogofilter to get
    word scores were blocked too, until the 2 remaining 'bogofilter -s'
    tasks were aborted. After the register tasks were aborted, the
    filter continued its normal flow.

    Parallel requests to bogofilter -s/-n can easily be queued
    in my filter, with a mutex for example, but IMHO it's a
    bogofilter bug. ("deadly embrace"?)
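
A minimal sketch of the queueing idea mentioned above: a single lock ensures that only one registration command runs at a time. It is written in Python for brevity rather than as milter code; `run_registration` and the stand-in child command are hypothetical names used for illustration, and a real setup would pass something like `["bogofilter", "-s"]` as the command.

```python
import subprocess
import sys
import threading

# One process-wide lock: only one registration command runs at a time.
_register_lock = threading.Lock()


def run_registration(command, message, timeout=60):
    """Run `command` with `message` on stdin while holding the lock.

    `command` is an argument list, e.g. ["bogofilter", "-s"] in a real
    setup (assumed to be on PATH). Returns the child's exit status.
    """
    with _register_lock:
        result = subprocess.run(command, input=message, timeout=timeout)
        return result.returncode


if __name__ == "__main__":
    # Stand-in child that only consumes stdin, so the sketch runs
    # without bogofilter installed.
    stand_in = [sys.executable, "-c", "import sys; sys.stdin.buffer.read()"]
    threads = [
        threading.Thread(target=run_registration, args=(stand_in, b"test\n"))
        for _ in range(3)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Classification requests are not covered by the lock here, so this only serializes registrations against each other.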

  • Matthias Andree

    Matthias Andree - 2003-12-11

    Logged In: YES

    I have received your milter code (with its new name) but haven't yet
    managed to a) compile sendmail, b) integrate your code, c) test drive it.

    As far as I know, there is no way for bogofilter to "deadlock" or
    anything to that effect. With "combined wordlists" mode, which is now
    the default (i.e. wordlist.db), we're only ever locking one file. With "separate
    wordlists" mode we lock the first file in blocking mode (fcntl with
    F_SETLKW), then try to lock the other file in non-blocking mode; if the
    2nd lock cannot be obtained, we unlock both files and wait, to avoid
    deadlock.
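
The two-file locking scheme described above can be sketched roughly as follows. This is an illustration in Python, not bogofilter's actual code: `lock_both`, `unlock_both`, the retry delay, and the retry limit are all assumptions made for the example, and `fcntl.lockf` stands in for the underlying `fcntl` record-locking calls (Unix only).

```python
import fcntl
import os
import tempfile
import time


def lock_both(fd_first, fd_second, retry_delay=0.05, max_tries=100):
    """Lock two files without risking deadlock.

    Blocks on the first file, then tries the second without blocking.
    If the second lock is busy, releases the first, waits, and retries.
    Returns True once both locks are held, False if max_tries is exhausted.
    """
    for _ in range(max_tries):
        fcntl.lockf(fd_first, fcntl.LOCK_EX)  # blocking, in the spirit of F_SETLKW
        try:
            fcntl.lockf(fd_second, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return True
        except OSError:
            # Second file is held elsewhere: give up the first lock too,
            # so another process holding the second one can make progress.
            fcntl.lockf(fd_first, fcntl.LOCK_UN)
            time.sleep(retry_delay)
    return False


def unlock_both(fd_first, fd_second):
    """Release both locks, in reverse order of acquisition."""
    fcntl.lockf(fd_second, fcntl.LOCK_UN)
    fcntl.lockf(fd_first, fcntl.LOCK_UN)


if __name__ == "__main__":
    # Two temporary files stand in for the separate wordlist files.
    fd_a, path_a = tempfile.mkstemp()
    fd_b, path_b = tempfile.mkstemp()
    try:
        if lock_both(fd_a, fd_b):
            unlock_both(fd_a, fd_b)
    finally:
        os.close(fd_a)
        os.close(fd_b)
        os.remove(path_a)
        os.remove(path_b)
```

Because no process ever waits on the second lock while holding the first, two processes cannot end up each holding one file and waiting for the other.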

    The question is if the milter interface has synchronization requirements
    itself and how the processes are spawned. Can you attach strace or gdb
    to one of the "hanging" bogofilter processes and obtain a stack
    backtrace, possibly with local data?

    If there are pipes involved, how large are the mails? Pipes tend to block
    writes sometimes when the reader doesn't keep up, which may lock
    the sendmail side; and there is also a possibility that a reader blocks out
    a writer and vice versa. So if your setup depends on being able to
    classify while registering, that's something that cannot work with the
    current code. We'd need to enable the transactional module of
    BerkeleyDB -- which will not happen before 0.17 because I'm not going
    to add support for multiple database files for that, only to have it
    removed two weeks later.
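
As a rough illustration of the pipe concern above, the sketch below feeds a message to a child process through a pipe and closes the write end so the reader sees end-of-file. It makes no claim about how the milter in this report spawns its children; `feed_message` and the byte-counting stand-in child are hypothetical and exist only to make the example runnable without bogofilter.

```python
import subprocess
import sys


def feed_message(command, message, timeout=30):
    """Send `message` to the child's stdin, close the pipe, collect stdout."""
    proc = subprocess.Popen(
        command,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        close_fds=True,  # do not leak unrelated descriptors into the child
    )
    try:
        # communicate() writes the input, closes stdin (EOF for the reader),
        # and drains stdout at the same time, so neither side stalls on a
        # full pipe.
        out, _ = proc.communicate(input=message, timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        out, _ = proc.communicate()
    return proc.returncode, out


if __name__ == "__main__":
    # Stand-in child: reads all of stdin and prints the byte count.
    counter = [sys.executable, "-c",
               "import sys; print(len(sys.stdin.buffer.read()))"]
    status, output = feed_message(counter, b"x" * 745)
    print(status, output.decode().strip())
```

A child that reads until end-of-file keeps waiting for as long as any process still holds the write end of its input pipe open, which is why closing that end promptly matters.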

  • Andrew Mironov

    Andrew Mironov - 2003-12-11

    Logged In: YES

    Stack backtraces of the waiting 'bogofilter -s' processes
    (possibly pointing to some pipe problem?)
    3 short letters, 745 bytes.

    Process 1 exited normally
    Process 2
    (gdb) bt
    #0 0xffffe002 in ?? ()
    #1 0x420d1843 in read () from /lib/tls/libc.so.6
    #2 0x42130a14 in data.0 () from /lib/tls/libc.so.6
    #3 0x4206dfe2 in _IO_file_read_internal () from
    #4 0x4206d2dc in _IO_new_file_underflow () from
    #5 0x4206fa7d in _IO_default_uflow_internal () from
    #6 0x4206f72d in __uflow () from /lib/tls/libc.so.6
    #7 0x4206948e in getc () from /lib/tls/libc.so.6
    #8 0x0804e9f7 in xfgetsl (buf=0x80baaa0 "
    max_size=8192, in=0x4212ecc0, no_nul_terminate=1) at fgetsl.c:38
    #9 0x0804cf00 in buff_fgetsl (self=0xbffff560,
    in=0x4212ecc0) at buff.c:63
    #10 0x0804c8b7 in mailbox_getline (buff=0xbffff560) at
    #11 0x0805674e in yy_get_new_line (buff=0xbffff560) at
    #12 0x08056840 in get_decoded_line (buff=0xbffff560) at
    #13 0x080569ed in yyinput (buf=0x80baaa0 "
    max_size=8192) at lexer.c:206
    #14 0x0804ff05 in yy_get_next_buffer () at
    #15 0x0804fce8 in lexer_v3_lex () at lexer_v3.c.new:2046
    #16 0x0805364f in get_token () at token.c:79
    #17 0x0804d465 in collect_words (wh=0x809d4f0) at
    #18 0x08049b5e in bogofilter (argc=0, argv=0xbffff6e0) at
    #19 0x08049dd3 in main (argc=0, argv=0xbffff6e0) at

    Process 3
    (gdb) bt
    #0 0xffffe002 in ?? ()
    #1 0x420d7c8d in select () from /lib/tls/libc.so.6

    I have to collect much more data to analyze. Because of the long
    weekend, this is delayed.
    To Matthias Andree: would it be better to continue the "investigation"
    via e-mail?

  • Matthias Andree

    Matthias Andree - 2004-08-16

    Logged In: YES

    Does the problem persist in 0.92.4?

  • Matthias Andree

    Matthias Andree - 2004-08-16
    • status: open --> closed-out-of-date
