From: Jeroen H. <vex...@gm...> - 2009-12-14 19:27:28
On Mon, Dec 14, 2009 at 17:45, Dean Michael Berris <mik...@gm...> wrote:
> Hi Jeroen,
>
> On Mon, Dec 14, 2009 at 9:12 PM, Jeroen Habraken <vex...@gm...> wrote:
>> On Mon, Dec 14, 2009 at 13:03, Dean Michael Berris
>> <mik...@gm...> wrote:
>>>
>>> Cool, please either fork the library on GitHub or send a git patch
>>> later on. I will be freezing the Subversion repository tomorrow.
>>>
>>> Have a good day!
>>>
>>
>> I've decided to roll an initial patch, please find it attached. It
>> fixes the following:
>> - Stricter RFC-compliant parsing of the scheme in the generic URI
>> - It converts the scheme to lower case, as the RFC states the
>> following: "For resiliency, programs interpreting URI should treat upper
>> case letters as equivalent to lower case in scheme names"
>> - I've changed the parser of the port to use ushort_ and uint16_t. The
>> RFC specifies the port as *digit, but I think it should be limited to
>> the valid network ports, thus 0 <= port < 2**16
>> - The query and fragment are now parsed in conformance with the RFC, I
>> believe,
>
> Thanks very much for this!
>
>> I'd like to change this later to parse the query into a
>> std::list<std::pair<string_type, string_type> >
>>
>
> I'd say a function that interprets the query part doesn't have to be
> part of the uri interface. I'd accept a function which would parse out
> the .query() part of the uri into a
> list<pair<string_type,string_type> > that looks like this:
>
> list<pair<string,string> > query_list_ = query_list(uri);
>
> This should be available via ADL and only for http uri's. The
> simplicity of the HttpUri concept should be preserved. Also, I'd
> imagine the query_list(...) function to dispatch to a specific
> implementation based on the tag associated with the uri. This is so
> that we can change the result type based on the tag associated with
> the uri too, so for example instead of a list<pair<string,string> > we
> can return a generator function, an input iterator, a
> multimap<string,string>, or a tuple.

This is indeed a better option: it keeps the URI parser relatively
simple and avoids unnecessary work while still providing this
functionality. Consider it on my TODO list.

>> Note that the way the current parser works, it guarantees that if the
>> URI is valid, the URI decoding can do with far fewer checks. I don't
>> know whether this is a good idea though.
>>
>
> I don't understand what the concern is. Is there a problem you're
> seeing with the current approach?

I should have been clearer here. Normally, quite a few things can go
wrong when URI-decoding a string: think of invalid characters, as in
'%2S', or an escape that is simply too short, as in '%2'. A URI
containing such sequences will not parse as valid, so decoding an
already-parsed URI becomes relatively trivial because those cases never
have to be handled. We should still provide a fully functional
uri_decode function, though, capable of handling any input. There turns
out to be a fine implementation in the Boost.Asio examples already:
it's in libs/asio/example/http/server/request_handler.cpp, as
url_decode.

> Also, please modify the patch to include your copyright information at
> the top of the files you modify. :)
>
> Thanks again!

You're most welcome. I've just created a GitHub account, and I think
forking master and working from there is the easiest option.

Jeroen 'VeXocide' Habraken

> --
> Dean Michael Berris
> blog.cplusplus-soup.com | twitter.com/mikhailberis
> linkedin.com/in/mikhailberis | facebook.com/dean.berris | deanberris.com
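To make the two failure cases mentioned above ('%2S' and '%2') concrete, here is a minimal sketch of what a standalone decoder that accepts arbitrary input might look like. It is not the code from the patch and not the Boost.Asio url_decode; the name uri_decode is taken from the message, while the signature and the hex_value helper are assumptions made for illustration. Unlike a form decoder, this sketch leaves '+' untouched.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: numeric value of one hex digit, or -1 if `c` is not
// a hex digit. Assumes an ASCII-compatible execution character set.
inline int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Sketch of a decoder that tolerates any input: it returns false instead of
// decoding when an escape is truncated or contains a non-hex character.
// The signature is an assumption, not the library's actual interface.
inline bool uri_decode(const std::string& in, std::string& out) {
    out.clear();
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] != '%') {            // ordinary character, copy as-is
            out += in[i];
            continue;
        }
        if (in.size() - i < 3)         // truncated escape, e.g. "%2"
            return false;
        const int hi = hex_value(in[i + 1]);
        const int lo = hex_value(in[i + 2]);
        if (hi < 0 || lo < 0)          // invalid digit, e.g. "%2S"
            return false;
        out += static_cast<char>(hi * 16 + lo);
        i += 2;                        // skip the two digits just consumed
    }
    return true;
}
```

When the input is already known to come from a successfully parsed URI, both early returns are unreachable, which is the sense in which decoding "can do with far fewer checks".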