#8 non-7-bit-ascii subject is not decoded

Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2001-10-05
Created: 2001-10-05
Private: No

If a subject contains non-7-bit-ASCII characters (for example,
Thai), Outlook always encodes it using base64 and wraps each
chunk in a prefix and a suffix, as in the example below.

=?windows-874?B?Rnc6ILXpzaeh0sOkucPRurfTIGpvYiC06Me5IOCn1Lm01SCq6Menu9S0?=
 =?windows-874?B?4LfNwSAoSmF2YSkgKDIp?=

windows-874 is the Thai charset. Mail from Europe may use
other charsets.
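
For reference, each chunk in the example follows the RFC 2047 "encoded-word" layout, =?charset?encoding?payload?=, where the encoding is B (base64) or Q (quoted-printable-like). A minimal sketch of pulling those fields apart is below; the names ENCODED_WORD and split_encoded_word are made up for illustration and are not part of any existing module.

```python
import base64
import re

# RFC 2047 encoded-word layout: =?charset?encoding?payload?=
# (ENCODED_WORD and split_encoded_word are hypothetical names.)
ENCODED_WORD = re.compile(r"=\?([^?]+)\?([bBqQ])\?([^?]*)\?=")

def split_encoded_word(word):
    """Return (charset, encoding, payload) for a single encoded-word."""
    m = ENCODED_WORD.fullmatch(word)
    if m is None:
        raise ValueError("not an encoded-word: %r" % word)
    charset, encoding, payload = m.groups()
    return charset.lower(), encoding.upper(), payload

# Second chunk of the subject quoted above.
charset, encoding, payload = split_encoded_word(
    "=?windows-874?B?4LfNwSAoSmF2YSkgKDIp?=")

# The payload decodes to raw bytes that are still in the named charset;
# turning them into text is a separate step.
raw = base64.b64decode(payload)
```

The point of the split is that the charset label travels with every chunk, so a decoder does not have to guess which charset the bytes are in.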

Here is what I tried.

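import base64
import re
from StringIO import StringIO  # Python 2 module

# One match per chunk: (prefix, base64 payload, suffix, trailing space)
subjpat = re.compile(r'(=\?[a-z0-9\-]+\?.\?)([^\?]*)(\?=)(\s*)')

def decode_subject(chunks):
    subj = ''
    for prefix, chunk, suffix, space in chunks:
        # Decode the base64 payload; the charset in the prefix is ignored
        out = StringIO()
        base64.decode(StringIO(chunk), out)
        subj = subj + out.getvalue() + space
    return subj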

<--- snipped --->
m = subjpat.findall(self["Subject"])
if len(m) > 0:
    self.Subject = decode_subject(m)
else:
    self.Subject = self["Subject"]
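
A possible alternative on current Python 3 is to let the standard library's email.header.decode_header do the parsing, since it handles both B and Q encodings and reports the charset of each chunk. The sketch below is only an illustration, not the project's fix. It assumes that Python does not recognise the label "windows-874" on its own, so it maps that label to the stdlib codec "cp874"; the CHARSET_ALIASES table and this version of decode_subject are hypothetical.

```python
from email.header import decode_header

# Assumption: "windows-874" is not a codec name Python knows, so map it
# to the stdlib "cp874" codec. Add other labels here as they turn up.
CHARSET_ALIASES = {"windows-874": "cp874"}

def decode_subject(raw_subject):
    """Decode an RFC 2047 encoded Subject header into a text string."""
    parts = []
    for payload, charset in decode_header(raw_subject):
        if isinstance(payload, bytes):
            name = (charset or "ascii").lower()
            codec = CHARSET_ALIASES.get(name, name)
            try:
                text = payload.decode(codec, errors="replace")
            except LookupError:
                # Unknown charset label: fall back to a lossless byte mapping.
                text = payload.decode("latin-1")
            parts.append(text)
        else:
            # Headers without any encoded-word come back as plain str.
            parts.append(payload)
    return "".join(parts)

# The subject quoted in this report, joined onto one line.
SUBJECT = ("=?windows-874?B?Rnc6ILXpzaeh0sOkucPRurfTIGpvYiC06Me5IOCn1Lm01SCq6Menu9S0?="
           " =?windows-874?B?4LfNwSAoSmF2YSkgKDIp?=")
decoded = decode_subject(SUBJECT)
```

One difference from the attempt above: RFC 2047 says whitespace between two adjacent encoded chunks is not part of the text, and decode_header drops it, whereas the regex version keeps the captured space.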
