#33 Sig11 core dump on classifying


Spamprobe deterministically core-dumps with signal 11
on Linux (x86) and AIX (rs6000), both with BerkeleyDB
and PDB.

This happens when learning mail with spamprobe 1.4 and 1.4a.

I attached a small test email to trigger the bug.

I think the problem is related to base64 decoding, as
the bug is not triggered after removing the
corresponding line from the test email.


  • bug triggering email

  • Logged In: NO

    To trigger the bug, do the following:

    $ spamprobe -v train-spam /path/to/Spam2
    caught signal 11: quitting

  • Logged In: NO

    This happens to me as well, with verbose mode off, on a
    ~46 MB spam file. Processing mailbox files up to around
    25 MB seems to work fine, though.

  • Logged In: NO

    Bug seems to be fixed with 1.4b, thanks Brian!

  • Logged In: NO

    MimeDecoder.cc holds this (around line 88):

    unsigned int index = (unsigned)ch;
    if (BASE64_CHARS[index] >= 0) {

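    This code is *wrong*: a negative value in the signed char 'ch' is sign-extended, so the cast produces a very large unsigned integer (for example, -1 becomes 0xFFFFFFFF with a 32-bit int) and the lookup reads far outside BASE64_CHARS.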
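
    A minimal sketch of that conversion, for illustration only. The function name index_direct is hypothetical and not taken from MimeDecoder.cc, and 'ch' is declared as signed char explicitly because the signedness of plain char depends on the platform:

    ```cpp
    #include <cassert>
    #include <climits>

    // Mirrors the quoted line: a signed char cast straight to unsigned int.
    // (index_direct is a hypothetical name used only for this illustration.)
    constexpr unsigned int index_direct(signed char ch) {
        return (unsigned)ch;  // the value -1 becomes UINT_MAX, not 255
    }

    // Bytes below 0x80 are unaffected; bytes at or above 0x80 wrap around.
    static_assert(index_direct(65) == 65u, "ASCII 'A' keeps its value");
    static_assert(index_direct(-1) == UINT_MAX, "byte 0xFF wraps to UINT_MAX");
    ```

    The checks run at compile time, so compiling the snippet is enough to confirm the behaviour.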

    The faulty code appears both in 1.2a (which is where I saw the problem) and in 1.4d. 1.4d works around the problem with a bitwise AND against 0xff: the signed char is implicitly converted to a signed int, the AND chops off the high bits, and the result is then cast to an unsigned int.

    I implemented the following fix in 1.2a, which solves the problem simply and directly, without the use of bitwise operations. If there is a point to the bitwise AND, other than getting an implicit conversion that you could just as well have gotten explicitly, please let me know. Otherwise, I'll assume that my fix is simpler and therefore better :)

    My fix simply casts to unsigned char, then (implicitly) to unsigned int:

    unsigned int index = (unsigned char)ch;
    if (BASE64_CHARS[index] >= 0) {
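
    As a rough check of the two approaches, the sketch below compares the cast with a masking variant over every signed char value. The names index_cast, index_mask and forms_agree are hypothetical, the masking line is only an assumed shape of the 1.4d workaround (the actual 1.4d source is not quoted here), and an 8-bit char is assumed:

    ```cpp
    #include <cassert>

    // The fix proposed above: go through unsigned char first.
    constexpr unsigned int index_cast(signed char ch) {
        return (unsigned char)ch;
    }

    // Assumed shape of the masking workaround; not copied from 1.4d.
    constexpr unsigned int index_mask(signed char ch) {
        return (unsigned)(ch & 0xff);
    }

    // Hypothetical helper: true if both forms agree for every signed char
    // value and both stay inside the range 0..255.
    inline bool forms_agree() {
        for (int v = -128; v <= 127; ++v) {
            const signed char ch = (signed char)v;
            if (index_cast(ch) != index_mask(ch) || index_cast(ch) > 255u) {
                return false;
            }
        }
        return true;
    }
    ```

    Under those assumptions both forms give an index between 0 and 255, so either one is safe as long as the lookup table has 256 entries.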

    Thank you,