From: Jeroen H. <vex...@gm...> - 2010-02-05 16:48:53
Hi,

On Fri, Feb 5, 2010 at 17:17, Nelson, Erik - 2 <eri...@ba...> wrote:
> In the http server example, the request.uri is still escaped, e.g., has
> characters like '%2f' in it... Is there a 'best' way to unescape it
> within cpp-netlib? Or should I use another source of un-escaping?
>
> Thanks
>
> Erik

The short answer: no, cpp-netlib currently doesn't have any features to unescape a URI, apart from an implementation of unescape itself.

Now for the longer answer, as unescaping a URI isn't as easy as it might look at first glance. It is part of the normalisation process normally applied to URIs, and is described in section 6.2.2.2 of RFC 3986, which can be found at http://www.ietf.org/rfc/rfc3986.txt.

Each section of the URI has its own set of reserved characters. Those should be left escaped, whilst any other character can be unescaped safely. Unescaping a reserved character would change the meaning of the URI, and might even make it invalid. An example would be http://user%40n...@do...d versus http://us...@na...@domain.tld, where the escaped character is in the user_info section, and unescaping it invalidates the URI.

The whole normalisation process is something I hope to tackle after I've completed the parser, but I'm unsure when this might be. For the time being you could have a look at google-url, a URI parser written for the Chrome browser, which can be found at http://code.google.com/p/google-url/.

Jeroen
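
P.S. To make the rule above a bit more concrete, here is a minimal sketch of the conservative end of it: decode an escape only when it stands for one of the "unreserved" characters from RFC 3986 (letters, digits, '-', '.', '_', '~'), and leave everything else escaped. This is not part of cpp-netlib; the names is_unreserved, hex_value and unescape_unreserved are made up for illustration, and the sketch ignores the per-section reserved sets, so it will leave some escapes alone that a full normaliser could decode.

```cpp
#include <string>

// Hypothetical helper, not part of cpp-netlib.
// True for the RFC 3986 "unreserved" set: ALPHA / DIGIT / "-" / "." / "_" / "~".
inline bool is_unreserved(unsigned char c) {
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9') ||
           c == '-' || c == '.' || c == '_' || c == '~';
}

// Hypothetical helper: value of a hex digit, or -1 if c is not one.
inline int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper: decode %XX only when it encodes an unreserved
// character; all other escapes (and malformed ones) are copied unchanged.
inline std::string unescape_unreserved(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (std::string::size_type i = 0; i < in.size(); ++i) {
        if (in[i] == '%' && i + 2 < in.size()) {
            const int hi = hex_value(in[i + 1]);
            const int lo = hex_value(in[i + 2]);
            if (hi >= 0 && lo >= 0) {
                const unsigned char decoded =
                    static_cast<unsigned char>(hi * 16 + lo);
                if (is_unreserved(decoded)) {
                    out += static_cast<char>(decoded);
                    i += 2;  // skip the two hex digits just consumed
                    continue;
                }
            }
        }
        out += in[i];  // literal character, or an escape left as-is
    }
    return out;
}
```

With this, "/%7Euser/a%2fb" becomes "/~user/a%2fb": the tilde is decoded, while the escaped slash stays escaped because decoding it would turn one path segment into two.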