Thread: [Rest2web-develop] Encoding & Uservalues
From: Fuzzyman <fuz...@vo...> - 2006-04-05 13:13:11
Hello all,

I've finished the first shot at verbosity levels. It should be fairly straightforward.

The next feature I'm working on is allowing you to set uservalues in the config file. These can be overridden in individual pages by explicit uservalues. This means you can supply values in your config file, and then have conditional logic in your templates/pages that uses them. After this is implemented I will also allow uservalues to be passed at the command line.

I have to work out how to handle the encoding of uservalues passed via the config file. Uservalues in the page are decoded to unicode (guessing the encoding if necessary), and then re-encoded in the output encoding. This is useful because it means that if you have a latin-1 document (for example) you can still specify that all your output HTML should be in UTF-8.

So how should rest2web handle uservalues passed in the config file and at the command line?

First of all, the config file. There are four options:

1) Ignore encodings and assume that the config file values are ascii or in the appropriate encoding
2) Decode to unicode immediately and restrict values to ascii only
3) Guess the encoding of the config file, using the same technique as is used for guessing the encoding of pages (annoying to implement but perfectly possible)
4) Allow a magic value '__encoding__' to specify the encoding in use

My personal preference is 4). I then need to choose between 1), 2) and 3) as a fallback if that value is missing.

I have a similar problem with values passed at the command line; however, in this case I can check ``sys.stdin.encoding``. This works (on Windows) even if stdin has already been closed. I will probably have to force ascii-only values if that information isn't available.

All the best,

Fuzzyman
http://www.voidspace.org.uk/python/index.shtml
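The decode-then-re-encode step and the command-line case described above can be sketched roughly as follows. This is a minimal illustration in Python 3 (where the thread's "unicode" corresponds to ``str``), not rest2web's actual code: the function names ``transcode`` and ``decode_cli_value`` and the ``stdin_encoding`` parameter are hypothetical, chosen only for this example.

```python
import sys


def transcode(raw, source_encoding, output_encoding="utf-8"):
    """Decode bytes to text, then re-encode the text in the output encoding."""
    text = raw.decode(source_encoding)
    return text.encode(output_encoding)


def decode_cli_value(raw, stdin_encoding=None):
    """Decode a command-line value (hypothetical helper).

    Uses the encoding reported for stdin when one is available and
    falls back to strict ascii otherwise.
    """
    if stdin_encoding is None:
        # sys.stdin can be None in some environments, hence getattr.
        stdin_encoding = getattr(sys.stdin, "encoding", None)
    encoding = stdin_encoding or "ascii"
    return raw.decode(encoding)


# A latin-1 value ends up as UTF-8 bytes in the output.
latin1_value = "café".encode("latin-1")
utf8_value = transcode(latin1_value, "latin-1")
```

Because the ascii fallback decodes strictly, a non-ascii value raises ``UnicodeDecodeError`` instead of silently producing garbled output.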
From: Andrew I. <and...@us...> - 2006-04-06 06:48:34
I vote for 4) and then 1) as a fall-back, since you allow the developer to specify exactly what encoding they want. If they do not take advantage of that, just use the lowest-common denominator.

Andrew
From: Fuzzyman <fuz...@vo...> - 2006-04-06 07:16:48
Andrew Ittner wrote:
> I vote for 4) and then 1) as a fall-back, since you allow the developer to
> specify exactly what encoding they want. If they do not take advantage of
> that, just use the lowest-common denominator.

Right, in which case I will use 2) as the fallback and explicitly decode to unicode assuming ascii. Everything else is stored as unicode, and this will raise an error if the programmer uses non-ascii values without specifying an encoding.

Hopefully I'll commit this tonight. Thanks for your input.

Fuzzyman
http://www.voidspace.org.uk/python/index.shtml
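The approach settled on in this message (option 4, with a strict ascii decode when no encoding is given) might look something like the sketch below. It is written in Python 3 and is only an illustration under stated assumptions: the function name ``decode_uservalues`` is hypothetical, and the config values are assumed to arrive as a mapping of names to raw bytes, which may not match how rest2web actually reads its config file.

```python
def decode_uservalues(raw_values):
    """Return uservalues as text, honouring an optional '__encoding__' entry.

    raw_values is assumed to map value names to bytes (an assumption
    made for this sketch only).
    """
    values = dict(raw_values)
    # The '__encoding__' value itself is assumed to be plain ascii.
    encoding = values.pop("__encoding__", b"ascii").decode("ascii")
    # Strict decoding: non-ascii bytes without a declared encoding
    # raise UnicodeDecodeError rather than being guessed at.
    return {name: value.decode(encoding) for name, value in values.items()}


plain = decode_uservalues({"title": b"Home page"})
declared = decode_uservalues({"__encoding__": b"latin-1", "title": b"caf\xe9"})
```

Calling ``decode_uservalues({"title": b"caf\xe9"})`` without an ``__encoding__`` entry raises ``UnicodeDecodeError``, which is the error behaviour described above for non-ascii values with no specified encoding.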