The escaping for URIs is wrong.
From RFC 2396:
"If the data for a URI component would conflict with the reserved
purpose, then the conflicting data must be escaped before forming the
URI."
    reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" | "$" | ","
This implies that if ":" is used for a reserved purpose, as in

    if scheme is defined then
        append scheme to result
        append ":" to result

then it should not be escaped.
I'm not sure what the best solution to this problem is. But I think
"quoting" the query portion of a URI, if it exists, is probably all that can
be reasonably done.
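A rough sketch of what "quoting only the query portion" could look like, written against Python 3's urllib.parse rather than the urllib.quote of the time. The function name quote_query_only and the choice of "=" and "&" as safe characters are assumptions made for illustration, not anything taken from ZSI:

```python
from urllib.parse import urlsplit, urlunsplit, quote

def quote_query_only(uri):
    """Escape only the query component; leave scheme, host and path alone.

    Hypothetical helper, not part of ZSI. "=" and "&" are kept as-is on the
    assumption that they are being used for their reserved purpose.
    """
    parts = urlsplit(uri)
    query = quote(parts.query, safe="=&")
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       query, parts.fragment))

print(quote_query_only("http://host/"))
print(quote_query_only("http://host/path?q=a b&x=1"))
```

With this approach the ":" after the scheme is never touched, because it is never passed to quote in the first place.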
-josh
On Jul 6, 2006, at 1:17 PM, Chris Lambacher wrote:
Hi,
I am having some trouble talking to a server that takes an anyURI argument.
If I use TC.URI the URI gets encoded with urllib.quote, such that
http://host/ ends up as http%3A//host/. The end result is that the
server tells me that it can't find a scheme in the URI.
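The effect is easy to reproduce outside of ZSI. The snippet below is a minimal illustration using Python 3, where the function lives at urllib.parse.quote; it only assumes that TC.URI passes the whole string through quote with the default safe characters, as described above:

```python
from urllib.parse import quote

uri = "http://host/"

# With the default safe="/" the reserved ":" after the scheme is escaped.
escaped = quote(uri)
print(escaped)  # http%3A//host/

# Marking ":" as safe leaves this particular URI untouched.
kept = quote(uri, safe=":/")
print(kept)  # http://host/
```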
I have gotten around this by creating my own URI class that omits the
quote/unquote step in serialization.
Is this behaviour according to the standard, so that the server needs
fixing, or is TC.URI broken?
Thanks,
Chris
Logged In: YES
user_id=122679
You either need to trust that users are providing properly
formatted URIs or you need to provide a data type that allows
all the bits to be set independently and combined with the
correct encoding.
Obviously the first option is a lot easier. All you have to
do is remove the methods overridden from String
(text_to_data and get_formatted_content) and let the
default String versions take over. Some form of validation
could be done on the str as it is processed, raising an
exception if it is not a properly formatted URI.
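As a sketch of the kind of validation meant here: the check below rejects strings containing characters that are not allowed to appear unescaped in a URI, as well as malformed percent escapes. The function name validate_uri and the exact character set are assumptions for illustration; a real check might be stricter or looser:

```python
import re

# Characters that may appear literally in a URI (reserved, unreserved,
# "%" for escapes, "#" for fragments, "[" "]" for IPv6 hosts).
_ALLOWED = re.compile(r"^[A-Za-z0-9\-_.!~*'();/?:@&=+$,%#\[\]]*$")
# A "%" that is not followed by two hex digits.
_BAD_ESCAPE = re.compile(r"%(?![0-9A-Fa-f]{2})")

def validate_uri(text):
    """Return text unchanged, or raise ValueError (hypothetical helper)."""
    if not _ALLOWED.match(text):
        raise ValueError("URI contains characters that must be escaped: %r" % text)
    if _BAD_ESCAPE.search(text):
        raise ValueError("malformed percent escape in %r" % text)
    return text

print(validate_uri("http://host/"))
```

The string is passed through as-is when it is valid, so nothing is quoted or unquoted behind the user's back.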
The other way allows you to ensure that all the bits are
encoded properly. It could be as simple as a dict that has
the required keys scheme, host, path and query, and the
optional keys port, user and password. get_formatted_content
could detect whether it is a dict or a string and act accordingly.
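Something along these lines could work for the dict variant. The key names follow the list above; the function name build_uri, the "//" authority form, and the use of a mapping for the query are assumptions made for this sketch:

```python
from urllib.parse import quote, urlencode

def build_uri(parts):
    """Assemble a URI from components, escaping each one separately.

    Hypothetical helper: requires scheme, host, path and query; port,
    user and password are optional.
    """
    netloc = parts["host"]
    if "port" in parts:
        netloc += ":%d" % parts["port"]
    if "user" in parts:
        userinfo = quote(parts["user"], safe="")
        if "password" in parts:
            userinfo += ":" + quote(parts["password"], safe="")
        netloc = userinfo + "@" + netloc
    # The ":" and "//" delimiters are added here, so they are never escaped.
    uri = parts["scheme"] + "://" + netloc + quote(parts["path"])
    if parts["query"]:
        uri += "?" + urlencode(parts["query"], quote_via=quote)
    return uri

print(build_uri({"scheme": "http", "host": "host",
                 "path": "/a b", "query": {"q": "x:y"}}))
```

Because each component is escaped on its own, a ":" inside a query value is escaped while the ":" after the scheme is not.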
The current behavior is wrong and looks like it leaves you
only able to send URI types to Python servers/clients. It
would be nice if something could be done about this before 2.0.
Logged In: YES
user_id=122679
Originator: NO
Why is this marked as fixed? No change has been applied to the source, and Josh apparently agrees with me that this is broken. I would recommend just removing the text_to_data and get_formatted_content methods and letting the ones from String take over. That is what I have been using as a monkey patch to resolve the problem for the time being.
Do you want to see a patch for that?
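For illustration, the monkey patch amounts to something like the following. The String and URI classes here are simplified stand-ins so the example runs on its own; the real ZSI classes and method signatures differ, and only the two method names are taken from this thread:

```python
from urllib.parse import quote, unquote

class String:
    # Stand-in for the base typecode: passes text through unchanged.
    def text_to_data(self, text):
        return text

    def get_formatted_content(self, pyobj):
        return pyobj

class URI(String):
    # Stand-in for the current behaviour: quote on output, unquote on input.
    def text_to_data(self, text):
        return unquote(text)

    def get_formatted_content(self, pyobj):
        return quote(pyobj)

# The monkey patch: drop the overrides so the String versions take over.
del URI.text_to_data
del URI.get_formatted_content

print(URI().get_formatted_content("http://host/"))
```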