When testing MIME attachments I get the following error,
which occurs both when using arch to convert a mailbox and
when posting to the archive:
Pickling archive state into /usr/local/mailman/archives/private/opsarchtest/pipermail.pck
Traceback (most recent call last):
  File "bin/arch", line 160, in ?
    main()
  File "bin/arch", line 148, in main
    archiver.processUnixMailbox(fp, Article, start, end)
  File "/usr/local/mailman/Mailman/Archiver/pipermail.py", line 545, in processUnixMailbox
    m = mbox.next()
  File "/usr/local/lib/python2.2/mailbox.py", line 34, in next
    return self.factory(_Subfile(self.fp, start, stop))
  File "/usr/local/mailman/Mailman/Mailbox.py", line 69, in scrubber
    return mailbox.scrub(msg)
  File "/usr/local/mailman/Mailman/Mailbox.py", line 89, in scrub
    return self._scrubber(self._mlist, msg)
  File "/usr/local/mailman/Mailman/Handlers/Scrubber.py", line 114, in process
    url = save_attachment(mlist, part, filter_html=0)
  File "/usr/local/mailman/Mailman/Handlers/Scrubber.py", line 217, in save_attachment
    decodedpayload = msg.get_payload(decode=1)
  File "/usr/local/mailman/pythonlib/email/Message.py", line 177, in get_payload
    return Utils._bdecode(payload)
  File "/usr/local/mailman/pythonlib/email/Utils.py", line 71, in _bdecode
    value = base64.decodestring(s)
  File "/usr/local/lib/python2.2/base64.py", line 44, in decodestring
    return binascii.a2b_base64(s)
binascii.Error: Incorrect padding
I've narrowed this down to HTML attachments that
include tables. I've attached an mbox that will give
this error when used with arch. This only happens when
base64 is used.
I've also found that even when messages with HTML
attachments are archived, clicking on the attachment
displays the page source instead of the rendered page,
or in some cases garbage. I can supply more info on
this if you wish.
Mbox file. Base64 attachment. HTML table.
Logged In: YES
user_id=594846
Whoops. Attached the html file which was used as attachment
instead of the mbox file....
...which I'm gonna upload now.
The real mbox file.
Logged In: YES
user_id=12800
Thanks for the report and the mbox file. This was indeed a
bug, which will be fixed in beta4.
As for your second question, what you're seeing could be
caused by this bug, but it's more likely just the behavior
specified by ARCHIVE_HTML_SANITIZER in Defaults.py/mm_cfg.py.
To prevent evil stuff like cross-site scripting, web bugs,
and viruses from affecting your archive readers, HTML is
typically sanitized to be harmless.
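For illustration only: escaping the markup is one way HTML can be made harmless, and it would explain seeing the page source instead of a rendered page. The sketch below uses Python's standard html.escape. It is not Mailman's actual sanitizer code, and the function name sanitize_html_for_archive is made up for this example.

```python
import html


def sanitize_html_for_archive(markup):
    """Hypothetical helper: neutralize HTML by escaping it.

    The browser then shows the tags as literal text rather than
    interpreting them, so scripts and remote images cannot run or load.
    """
    return html.escape(markup)


if __name__ == "__main__":
    attachment = "<table><tr><td>x</td></tr></table>"
    # The escaped text displays as source when opened in a browser.
    print(sanitize_html_for_archive(attachment))
```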
Logged In: YES
user_id=27555
When I monitored which strings were causing this to barf, I
found two scenarios. The first is when the Mailman-generated
HTML describing the file is sent to this function (I have no
idea why it is doing this).
Ex. A non-text attachment was scrubbed...
The second scenario, which was truly a padding issue, I got
around by doing the following in the _bdecode function:
value = base64.decodestring(s+'=========')
Let me know if you want a full patch of my trashy hack.
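As a rough sketch of the padding workaround in current Python (where base64.decodestring no longer exists and base64.b64decode is used instead): rather than appending a fixed run of '=' characters, the padding can be recomputed so the input length is a multiple of four. The helper name decode_base64_lenient is hypothetical, and this is not the fix that went into Mailman.

```python
import base64
import binascii


def decode_base64_lenient(payload):
    """Hypothetical helper: decode base64 text whose '=' padding is missing.

    Assumes payload is a str holding only base64 data and whitespace.
    """
    # Remove line breaks and spaces so only data characters are counted.
    data = "".join(payload.split())
    # Drop whatever padding is present, then add back exactly what is needed.
    data = data.rstrip("=")
    data += "=" * (-len(data) % 4)
    return base64.b64decode(data)


if __name__ == "__main__":
    # Build a payload with its padding stripped to mimic the failing input.
    truncated = base64.b64encode(b"<table>").decode("ascii").rstrip("=")
    try:
        base64.b64decode(truncated)
    except binascii.Error as exc:
        print("strict decode failed:", exc)
    print(decode_base64_lenient(truncated))
```

This only repairs missing padding. Input whose data length is one more than a multiple of four is truncated beyond repair and still raises binascii.Error.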