#115 bogofilter tokenizes parts from a MIME body's base64 stream

v1.0_(example)
open
None
5
2013-11-11
2013-11-11
No

Matt Garretson assembly.state.ny.us!mattg reported to the bogofilter user's list on 2013-10-30 what might be a MIME decoder bug in 1.2.4 at first glance. In order to retain his information while the pastebin data is about to expire in two days, I am uploading his data here for later inspection.

I occasionally see binary attachments that bogofilter (1.2.4) will tokenize, resulting in tons of BASE64 string fragments in the token stream. Here's an example message (cleaned a bit):

http://pastebin.com/FEvjRp9F (line 38+)
(that is the first nonempty line below application/applefile)

and the output of bogolexer on it:

http://pastebin.com/tabnXZSP (line 94+)
(that is the line starting with get_token: 1 "AAEW")

Since the attachments are not of a text type, I would expect bogofilter to skip over them.

Any ideas what is happening here? Is there a debug flag that would help narrow it down?

Thanks...

2 Attachments

Discussion


Log in to post a comment.

Get latest updates about Open Source Projects, Conferences and News.

Sign up for the SourceForge newsletter:





No, thanks