#108 Bogofilter cannot parse msg if body is base64-encoded

Status: closed-fixed
Owner: nobody
Labels: None
Priority: 5
Updated: 2009-08-04
Created: 2009-03-13
Private: No

If the message text is base64-encoded (or uuencoded), bogofilter cannot parse the message: the parsed tokens are undecoded garbage.

The message has the following format:

====================================
More-Header-Lines
Content-Type: text/plain; charset="windows-1251"
Content-Transfer-Encoding: base64
Content-Disposition: inline
More-Header-Lines

BASE64-ENCODED-TEXT

The message is attached.
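
For reference, here is a minimal sketch of what correct decoding of such a message should produce, using Python's standard `email` package rather than bogofilter itself. The sample body (the single word "Привет" encoded as windows-1251 and then base64) is an assumption made for illustration; it is not taken from the attached message.

```python
import base64
from email import message_from_bytes

# Hypothetical sample body: one Cyrillic word, windows-1251, base64-encoded.
body = base64.b64encode("Привет".encode("windows-1251"))

raw = (
    b'Content-Type: text/plain; charset="windows-1251"\r\n'
    b"Content-Transfer-Encoding: base64\r\n"
    b"Content-Disposition: inline\r\n"
    b"\r\n" + body + b"\r\n"
)

msg = message_from_bytes(raw)

# get_payload(decode=True) undoes the Content-Transfer-Encoding (base64 here)
# and returns bytes; the charset from Content-Type is applied afterwards.
decoded_bytes = msg.get_payload(decode=True)
text = decoded_bytes.decode(msg.get_content_charset())
print(text)
```

A tokenizer that sees `text` gets readable words; one that sees the raw `body` bytes gets the "undecoded garbage" described above.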

After some quick debugging, it looks like a major logic problem in the lex parser, which prefetches lines in advance. When the parser detects the "end of message header" event, the next line of the message has already been fetched and buffered by lex. Since bogofilter was still in "header" mode at the time of this fetch, the line was buffered as is, without base64 decoding.
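
The failure pattern described above can be reproduced with a toy reader. This is only a sketch of the suspected mechanism, not bogofilter's code: the function name `tokenize_with_lookahead`, the one-line prefetch, and the fixed windows-1251 charset are all assumptions made for illustration.

```python
import base64


def tokenize_with_lookahead(lines):
    """Toy reader that prefetches one line before handling the current one.

    A prefetched line is transformed using the mode in effect at fetch time,
    so the first body line is buffered while the reader is still in header mode.
    """
    in_header = True
    out = []
    it = iter(lines)

    def fetch():
        line = next(it, None)
        if line is None:
            return None
        # The decode decision is made here, at fetch time.
        if in_header:
            return line
        return base64.b64decode(line).decode("windows-1251")

    current = fetch()
    while current is not None:
        prefetched = fetch()  # read ahead before looking at `current`
        if in_header and current == "":
            in_header = False  # too late: the next line is already buffered raw
        elif current != "":
            out.append(current)
        current = prefetched
    return out


# Hypothetical one-line body, as in the message format shown earlier.
encoded = base64.b64encode("Привет".encode("windows-1251")).decode("ascii")
tokens = tokenize_with_lookahead(
    ["Content-Transfer-Encoding: base64", "", encoded]
)
print(tokens[-1] == encoded)
```

In this sketch only the first body line is affected; any later body line is fetched after the mode switch and decoded normally, which is why a message whose body is a single base64 line loses all of its text.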

Discussion

  • Roman Trunov

    Roman Trunov - 2009-03-13

    Example of base64-encoded spam message

     
  • Matthias Andree

    Matthias Andree - 2009-08-04
    • status: open --> pending-works-for-me
     
  • Matthias Andree

    Matthias Andree - 2009-08-04

    Hi Roman,

    this bug is supposed to be fixed in bogofilter version 1.2.1. Can you please try the new bogofilter version and let us know if the problem persists?

    It appears fixed for me. If I run bogolexer -p on your message, I get this (in the hopes that SourceForge does not trash it) which looks like proper Cyrillic script to me (I don't understand Slavic languages though):

    ...
    head:Content-Disposition
    head:inline
    head:Message-Id
    Привет
    Вам
    Необычное
    Приглашение!
    ...
    для
    лучших
    друзей!

    Thanks for taking the time to report this.

     
  • Matthias Andree

    Matthias Andree - 2009-08-04

    Oh, and I've indeed made sure that the lexer does not read ahead; I adjusted some rules so that the lexer itself need not look past \n (line feed), and I made the lexer "interactive", so it does not read ahead voluntarily (i.e. unless it must).

     
  • David Relson

    David Relson - 2009-08-04

    Fixed in bogofilter-1.2.1 (released on 1 Aug 2009)

     
  • David Relson

    David Relson - 2009-08-04
    • status: pending-works-for-me --> closed-fixed
     
  • Matthias Andree

    Matthias Andree - 2009-08-05

    Roman responded off-tracker (comment posting was already closed) with this:
    "[...] the fix works fine. I tested Bogofilter
    1.2.1 on my collection of such one-liners and all of them
    were decoded properly. Thank you for fixing this."
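
The idea behind the fix discussed in this thread (the lexer no longer reads past the current line) can be sketched as follows. This is an illustrative toy, not the actual change made in bogofilter 1.2.1: the function name `tokenize_without_lookahead` and the fixed windows-1251 charset are assumptions.

```python
import base64


def tokenize_without_lookahead(lines):
    """Toy reader that handles exactly one line at a time.

    The header/body mode is updated before the next line is fetched,
    so the first body line is already decoded as base64.
    """
    in_header = True
    out = []
    for line in lines:  # nothing is buffered ahead of the current line
        if in_header:
            if line == "":
                in_header = False  # blank line ends the header
            else:
                out.append(line)
        elif line:
            out.append(base64.b64decode(line).decode("windows-1251"))
    return out


# Hypothetical one-line body, matching the "one-liner" messages mentioned above.
encoded = base64.b64encode("Привет".encode("windows-1251")).decode("ascii")
tokens = tokenize_without_lookahead(
    ["Content-Transfer-Encoding: base64", "", encoded]
)
print(tokens)
```

Because the mode switch happens before the body line is read, even a message whose body is a single base64 line is decoded.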

     
