From: Kragen S. <kra...@ai...> - 2004-03-22 21:34:46
On Tue, Mar 23, 2004 at 07:25:35AM +1100, Robert Leftwich wrote:
> At 04:15 AM 23/03/2004, Adam Rifkin wrote:
>
> >pubsublib.py is stubbed out for a lot of features we were
> >one day going to get to, but it was the goal in starting
> >that library to make it a reference implementation for other
> >client libraries.
>
> With that in mind, what are the rules for encoding event element names and
> values?
>
> The protocol document says "A server will always escape the following
> bytes: colon (U+003A), the equals sign (U+003D), and leading and trailing
> spaces (U+0020) in element names and values. Other bytes may be escaped, as
> in the above example (where all control characters are escaped.) ". - with
> the 'Other bytes *may* be escaped' not being very helpful :-)

I guess we could spell it out: "A sender must escape the following
bytes: ... A receiver must be able to unescape any byte, not limited to
the previous list, but must not assume that any byte not listed in the
previous list will be escaped."

> The pubsub server code only encodes "=:\n" for the headers and "=\n" for
> the value using an internal function.

Sounds like a bug! I guess not escaping the : in the value doesn't make
things ambiguous, but it doesn't meet the spec.

> The pubsub client says " Header name and value, for the named headers, may
> contain quoted bytes written quoted-printable style as an equal sign
> followed by 2 hexadecimal digits. Specifically, any colons in the header
> name, any leading horizonal_whitespace in the value, and any
> vertical_whitespace or equal signs in the header or value, must be quoted
> this way."

Probably the pubsub client should just reference the relevant section of
the spec instead of containing its own divergent spec in comments.

> and then uses the Python quopri.decode library function which
> does things a little differently, i.e. it uses the 'Q' encoding which is
> similar to "Quoted-Printable" but is different in significant areas, e.g.
> it may represent spaces as underscores, according to the RFC it is based on.

quopri.decode only decodes underscores as spaces if an optional third
argument is passed, at least in Python 2.2.2. Are there other bugs?
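
To make the proposed wording concrete, here is a minimal sketch of the sender and receiver rules in present-day Python 3 (not the 2.2.2 discussed above). The names escape, unescape and ALWAYS_ESCAPE are hypothetical and are not taken from pubsublib.py or the server. Escaping CR and LF is an assumption based on the server code quoted above, not on the protocol text. The last lines check the underscore behaviour using quopri.decodestring and its header flag.

```python
import quopri
import re

# Bytes the sender always escapes: "=" and ":" come from the protocol text.
# "\r" and "\n" are an assumption, added because the format is line-oriented.
ALWAYS_ESCAPE = b"=:\r\n"

_ESCAPED = re.compile(rb"=([0-9A-Fa-f]{2})")


def escape(data: bytes) -> bytes:
    """Sender side: escape the mandatory bytes as '=' plus two hex digits."""
    lead = len(data) - len(data.lstrip(b" "))  # length of leading space run
    trail = len(data.rstrip(b" "))             # index where trailing run starts
    out = bytearray()
    for i, byte in enumerate(data):
        edge_space = byte == 0x20 and (i < lead or i >= trail)
        if byte in ALWAYS_ESCAPE or edge_space:
            out += b"=%02X" % byte
        else:
            out.append(byte)
    return bytes(out)


def unescape(data: bytes) -> bytes:
    """Receiver side: decode any '=XX' sequence, not only the mandatory ones."""
    return _ESCAPED.sub(lambda m: bytes([int(m.group(1), 16)]), data)


wire = escape(b" a:b=c ")
original = unescape(wire)

# Underscores are left alone unless header decoding is requested.
plain = quopri.decodestring(b"a_b")
as_header = quopri.decodestring(b"a_b", header=True)
```

Here the receiver accepts more than the sender is required to produce, which is the asymmetry the proposed spec text asks for: unescape(b"=41") yields b"A" even though "A" is never on the mandatory list.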