[Rest2web-develop] user-value not in ascii
From: Gael V. <gae...@no...> - 2006-09-30 15:51:09
Hi,

r2w so far does a great job of dealing with non-ASCII encodings. But I found a glitch: I have a user-value in a page that uses a non-ASCII character. Python isn't too happy about that. Take the line:

    print node_title + ": " + title

Now if node_title contains, say, "ë", I get this:

Traceback (most recent call last):
[err]   File "./r2w.py", line 185, in ?
[err]     count = main(options, config)
[err]   File "./r2w.py", line 94, in main
[err]     return processor.walk()
[err]   File "/home/varoquau/www/rest2web/rest2web/restprocessor.py", line 385, in walk
[err]     self.buildsection()
[err]   File "/home/varoquau/www/rest2web/rest2web/restprocessor.py", line 1325, in buildsection
[err]     uservalues = enc_uni_dict(page['uservalues'], final_encoding)
[err]   File "/home/varoquau/www/rest2web/rest2web/restutils.py", line 227, in enc_uni_dict
[err]     val = uni_dict[entry].encode(encoding)
[err]   File "/usr/lib/python2.4/encodings/iso8859_1.py", line 18, in encode
[err]     return codecs.charmap_encode(input,errors,encoding_map)
[err] UnicodeDecodeError: 'ascii' codec can't decode byte 0xeb in position 2: ordinal not in range(128)

The encoding of this string is most probably the encoding of the page, therefore Latin1. A fix would probably be to have the user-value translated from Latin1 to unicode when it is read, as I reckon the parser knows what the encoding of the page is at this point.

On a side note, when r2w fails with such an error it still returns a return value of 0, which means success in the Unix world and is relied on a lot in makefiles and build scripts. On my website I check the return value of r2w before propagating the website, but I cannot trap the error, as it is not reported.

Cheers,

Gaël
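
A minimal sketch of the two points above, under stated assumptions: it is written for Python 3 rather than the Python 2.4 shown in the traceback, and `decode_uservalues`, `encode_uservalues` and this `main` are hypothetical names for illustration, not rest2web's actual functions. In Python 2, calling `.encode()` on a byte string first decodes it implicitly as ASCII, which is why an encode call ended in a UnicodeDecodeError. Decoding the user-values with the page encoding as soon as they are read avoids that implicit step, and returning a nonzero status on failure lets a makefile or build script notice the error.

```python
import sys


def decode_uservalues(raw_values, page_encoding):
    """Decode byte-string user-values to text using the page encoding."""
    decoded = {}
    for key, value in raw_values.items():
        if isinstance(value, bytes):
            # Explicit decode with the page encoding (assumed known here),
            # instead of relying on an implicit ASCII decode later on.
            value = value.decode(page_encoding)
        decoded[key] = value
    return decoded


def encode_uservalues(values, final_encoding):
    """Encode text user-values to the output encoding."""
    return {key: value.encode(final_encoding) for key, value in values.items()}


def main(raw_values, page_encoding="latin-1", final_encoding="utf-8"):
    """Hypothetical entry point: return 0 on success, 1 on an encoding error."""
    try:
        values = decode_uservalues(raw_values, page_encoding)
        encode_uservalues(values, final_encoding)
    except UnicodeError as exc:
        # Report the failure and signal it through the return value.
        print("encoding error: %s" % exc, file=sys.stderr)
        return 1
    return 0
```

A command-line script would typically pass the result on with `sys.exit(main(...))`, so that the shell sees the nonzero status.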