From: Jose <jm...@gm...> - 2007-06-04 06:52:36
Hi Peter,

Thanks for the detailed answer. I think a hand-written parser may be a
good idea, maybe just for HTTP URIs:

    user:password@host:port/file

so that the typical (host, file) pair needed for HTTP requests can be
obtained as quickly as possible. Additional parsing of the file segment
can be provided when needed (e.g. for server apps).

What do you think?

Regards,
Jose

On 03 Jun 2007 23:17:55 +0200, Peter Simons <si...@cr...> wrote:
>
> Hi Jose,
>
> you are right, there is a Spirit-based URI parser in mini-httpd.
> Unfortunately, it is incomplete insofar as it understands
> HTTP URLs only and doesn't recognize literal IPv6 host names. In
> addition, the code is a bit messy because it was written several
> years ago, at a time when Spirit didn't have the sophisticated
> actor infrastructure it has these days.
>
> In my experience, the greatest challenge when parsing a URI is
> not the parser, it is the resulting data structure. The URI class
> mini-httpd uses is fine for mini-httpd, but it certainly is far
> from generic.
>
> RFC 2396 comes with a regular expression for parsing URIs
> (Appendix B), by the way, and it's pretty wild:
>
> | ^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
> |  12            3  4          5       6  7        8 9
>
> The relevant sub-matches are:
>
> | scheme    = $2
> | authority = $4
> | path      = $5
> | query     = $7
> | fragment  = $9
>
> In terms of performance, I doubt that there is much of a
> difference between Spirit and Boost.Regex in this context. The
> main difference is that Boost.Regex must be linked whereas Spirit
> is a header-only library. That may or may not matter to our
> users; it's hard to tell.
>
> One disadvantage of Spirit is that compile-time goes through the
> roof even for trivial grammars. Another problem is that Spirit
> relies on rather sophisticated magic to be thread-safe. Compiled
> regular expressions, however, are immutable and can be used by
> any number of threads concurrently without synchronization.
>
> A hand-written parser might be slightly faster than either Spirit
> or Boost.Regex. It's definitely harder to get right, though. :-)
>
> Best regards,
> Peter
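
For reference, here is a minimal sketch of how the RFC 2396 regular expression quoted in Peter's message could be applied. It uses std::regex from the C++11 standard library as a stand-in for Boost.Regex, which is an assumption made only to keep the example self-contained. The names UriParts and split_uri are hypothetical and are not part of mini-httpd or cpp-netlib.

```cpp
#include <regex>
#include <string>

// Hypothetical result type: the five components named by the RFC 2396 regex.
struct UriParts {
    std::string scheme, authority, path, query, fragment;
};

// Splits a URI reference into its components using the RFC 2396 Appendix B
// regular expression. Returns false if the expression does not match.
// Assumption: std::regex (ECMAScript syntax) stands in for Boost.Regex.
inline bool split_uri(const std::string& uri, UriParts& out) {
    // The expression is compiled once, on first use, and only read afterwards.
    static const std::regex re(
        R"(^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?)");

    std::smatch m;
    if (!std::regex_match(uri, m, re)) {
        return false;
    }
    // Sub-match numbers follow the table in the message: $2, $4, $5, $7, $9.
    // A sub-match that did not participate yields an empty string.
    out.scheme    = m[2].str();
    out.authority = m[4].str();
    out.path      = m[5].str();
    out.query     = m[7].str();
    out.fragment  = m[9].str();
    return true;
}
```

Calling split_uri on "http://www.ics.uci.edu/pub/ietf/uri/#Related" (the example URI used in the RFC) fills in scheme "http", authority "www.ics.uci.edu", path "/pub/ietf/uri/", an empty query, and fragment "Related". The authority is left unsplit here; separating user, password, host, and port is a further step.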
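
The hand-written alternative proposed at the top of this message, limited to the user:password@host:port/file form, might look roughly like the sketch below. It is an illustration under stated assumptions, not a proposed implementation: HttpTarget and parse_http_target are hypothetical names, an optional leading "http://" is skipped, nothing is validated, literal IPv6 hosts are not handled, and the file segment (including any query or fragment) is returned unparsed.

```cpp
#include <string>

// Hypothetical result type for the "user:password@host:port/file" form.
struct HttpTarget {
    std::string user, password, host, port, file;
};

// Hand-written split of "[http://][user[:password]@]host[:port][/file]".
// Assumptions: no validation, no IPv6 literals, and the file segment starts
// at the first '/' after the optional scheme prefix.
inline HttpTarget parse_http_target(const std::string& input) {
    const std::string::size_type npos = std::string::npos;
    HttpTarget t;

    // Skip an optional "http://" prefix.
    const std::string prefix = "http://";
    std::string::size_type pos =
        input.compare(0, prefix.size(), prefix) == 0 ? prefix.size() : 0;

    // Everything from the first '/' onwards is the file; default to "/".
    std::string::size_type slash = input.find('/', pos);
    std::string authority =
        input.substr(pos, slash == npos ? npos : slash - pos);
    t.file = (slash == npos) ? "/" : input.substr(slash);

    // Optional "user[:password]@" part; the last '@' ends the user info.
    std::string hostport = authority;
    std::string::size_type at = authority.rfind('@');
    if (at != npos) {
        std::string userinfo = authority.substr(0, at);
        hostport = authority.substr(at + 1);
        std::string::size_type colon = userinfo.find(':');
        t.user = userinfo.substr(0, colon);
        if (colon != npos) t.password = userinfo.substr(colon + 1);
    }

    // Optional ":port" suffix; the port is kept as text, not converted.
    std::string::size_type colon = hostport.find(':');
    t.host = hostport.substr(0, colon);
    if (colon != npos) t.port = hostport.substr(colon + 1);
    return t;
}
```

With "user:password@host:8080/file" as input, this yields the (host, file) pair ("host", "/file") together with user "user", password "password", and port "8080". Keeping the file segment as a single string matches the idea of deferring any further parsing until a caller, such as a server application, actually needs it.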