[Pyparsing] PayPal IPN message parsing
From: Werner F. B. <wer...@fr...> - 2011-01-06 14:10:18
I am having some problems decoding these messages.

The data comes in as an email message with a defined content type as "Content-Type: text/plain", however it is really Content-Type: text/plain; charset="windows-1252", so I read it in with:

    thisfile = codecs.open(regFile, "r", "windows-1252")

The parsing works fine except on things like:

    address_name = Göran Petterson

Which I parse with:

    alphanums = pyp.Word(pyp.alphanums)

    # address
    str_add_name = pyp.Literal("address_name =").suppress() +\
                   alphanums + pyp.restOfLine
    add_name = str_add_name.setParseAction(self.str_add_nameAction)

But I get in str_add_nameAction:

    ([u'G', u'\xf6ran Petterson\r'], {})

The raw data at this point is "address_name = G\xf6ran Petterson"

What am I doing wrong in all this? I tried using pyp.printables instead of alphanums but with the same result.

A tip would be very much appreciated.

Werner

P.S. Happy New Year to you all.
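
The split after "G" is consistent with pyp.alphanums (and pyp.printables) containing only ASCII characters: a Word built from them stops at "ö", and restOfLine then picks up the remainder, including the trailing "\r". The sketch below reproduces that behaviour with the standard library only; it does not use pyparsing. The helper names leading_ascii_run and parse_ipn_line are hypothetical, and the sample bytes are taken from the raw data quoted above with an assumed "\r\n" line ending.

```python
import string

# Same characters as pyparsing's `alphanums`: ASCII letters and digits only.
ASCII_ALPHANUMS = string.ascii_letters + string.digits


def leading_ascii_run(text):
    """Return the longest prefix of text made only of ASCII alphanumerics.

    This mimics what a Word over an ASCII-only character set would consume.
    """
    end = 0
    while end < len(text) and text[end] in ASCII_ALPHANUMS:
        end += 1
    return text[:end]


def parse_ipn_line(line):
    """Split a 'key = value' line into (key, value).

    strip() also removes the trailing CR/LF left over from the email body.
    """
    key, sep, value = line.partition("=")
    if not sep:
        raise ValueError("not a 'key = value' line: %r" % line)
    return key.strip(), value.strip()


# Sample line as bytes, decoded the same way codecs.open() would decode it.
raw = b"address_name = G\xf6ran Petterson\r\n"
line = raw.decode("windows-1252")

# An ASCII-only match stops right before the o-umlaut.
prefix = leading_ascii_run(line.split("= ", 1)[1])

# Taking everything after the '=' keeps the name in one piece.
key, name = parse_ipn_line(line)
```

Within pyparsing itself, the analogous change would probably be to drop the Word(alphanums) term and let restOfLine capture the whole value, or to widen the Word's character set with pyp.alphas8bit; neither variant was run against the original grammar here.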