From: Dean M. B. <mik...@gm...> - 2009-08-21 09:10:46
Hi Guys,

I see that John has done quite a lot of work already in his branch(es), and I'd like to get a release out (0.4) that will have some support for persistent connections, a limited (partially correct) HTTP URL parser, and along with it support for HTTPS. Having said this, I have two major issues to deal with, and they touch on some major parts of the library moving forward.

URI/URL Parsing:

I see that there have already been three attempts at a URI/URL parsing library (one from Kim, one from John, and one from me). Kim's attempt was more of an OO approach (please correct me if I'm wrong, Kim), something that I felt was too "simple" and could also be done with just static polymorphism. John's approach has been in progress for a while now, uses Spirit.Classic, and adheres to the RFCs almost to the letter from what I've seen (given the EBNF). My approach differs from any other I've seen taken as far as URL parsing is concerned -- it uses template functions, template classes, and a generic programming approach to scheme-specific URL parsing. However, mine is not as close to the RFC as I'd like, and is not as well tested as I'd like either.

Can we three gentlemen work together towards:

  1) Adding better test coverage,
  2) Implementing the details of the RFC, and
  3) Merging what we can towards something that works and is release-ready?

My criteria for a release-ready URI/URL parsing library are:

  * Something that can stand on its own and is able to handle HTTP(S) URLs, with encoded characters left alone
  * A URL encoding function/library that will turn a string into a URL-encoded string
  * A well-documented library (concepts used, internal implementation, and nice readable user examples)

HTTPS handling:

John has started implementing persistent connections using a policy-based design that will actually change the innards of the current HTTP client.
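(A quick aside on the second URL-parsing criterion above: a URL-encoding function could be as small as the sketch below. The name url_encode and the choice to leave only RFC 3986 "unreserved" characters unescaped are my assumptions, not anything already in the library.)

```cpp
#include <cctype>
#include <string>

// Minimal sketch of a url_encode function (hypothetical name).
// Percent-encodes every octet outside the RFC 3986 "unreserved" set:
// ALPHA / DIGIT / "-" / "." / "_" / "~".
std::string url_encode(std::string const & input) {
    static char const hex[] = "0123456789ABCDEF";
    std::string output;
    for (std::string::size_type i = 0; i < input.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(input[i]);
        if (std::isalnum(c) || c == '-' || c == '.' || c == '_' || c == '~') {
            output += static_cast<char>(c);
        } else {
            // Emit '%' followed by the two uppercase hex digits of the octet.
            output += '%';
            output += hex[c >> 4];
            output += hex[c & 0x0F];
        }
    }
    return output;
}
```

A real library version would also need the inverse (decoding) and a way to encode only specific URL components, but this is the shape of it.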
I can see that his approach (although a bit verbose for my taste) actually works, semantically separating connection management from the client implementation. I do have some issues with this approach breaking the simplicity of the client interface for users.

Part of the HTTPS effort includes delegating connection management to a subsystem or component that determines what kind of connection, or which persisting connection, is used to service a request -- if it sees an https scheme, then it's a matter of creating an SSL socket to the specified port on the destination host and piping the already-crafted HTTP request through it.

Now, the way I imagine doing this is through some connection_manager type which, based on runtime variables, will be able to create the appropriate connection type -- and, if HTTP 1.1 is to be supported, also maintain connections that support the default persistent connections of HTTP 1.1. This connection_manager can be encapsulated in the http::client so that users don't have to worry about it. However, the choice of which specific *kind* of manager to instantiate is a client initialization setting. Something like this in client code:

  http::client c(http::follow_redirects, http::persistent, http::pipelined);
  http::response r = c.get("https://cpp-netlib.sourceforge.net/yaddayadda");

Inside the http::client class would be something like this:

  template <...>
  struct client {
  private:
      shared_ptr<connection_manager> manager;
  public:
      client(...)
          : manager(
              connection_manager_factory::create_manager(
                  /* client constructor parameters? */
              )
          )
      {}
      // ...
  };

Then the connection_manager interface would be something like this:

  struct connection_manager {
      virtual shared_ptr<connection> get_connection(host, port);
      virtual void put_connection(shared_ptr<connection>);
  };

A connection type would then be the one handling the wire protocol implementation (supporting gzip for example, handling mime types, etc.).
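To make the factory dispatch concrete, here is a compilable sketch of how create_manager might pick a manager from the initialization settings. Only connection_manager and connection_manager_factory::create_manager appear in the text above; the two concrete manager types, the name() member, and the use of std::shared_ptr are my assumptions for illustration.

```cpp
#include <memory>
#include <string>

// Interface from the sketch above, trimmed down so it compiles standalone.
struct connection_manager {
    virtual ~connection_manager() {}
    // Illustrative hook so the example has observable behavior;
    // the real interface would expose get_connection/put_connection.
    virtual std::string name() const = 0;
};

// Hypothetical: one throw-away connection per request (HTTP 1.0 style).
struct simple_connection_manager : connection_manager {
    std::string name() const { return "simple"; }
};

// Hypothetical: caches connections per (host, port) for HTTP 1.1 reuse.
struct persistent_connection_manager : connection_manager {
    std::string name() const { return "persistent"; }
};

struct connection_manager_factory {
    // In the real client this would take the constructor parameters
    // (follow_redirects, persistent, pipelined, ...); a single flag
    // stands in for them here.
    static std::shared_ptr<connection_manager> create_manager(bool persistent) {
        if (persistent)
            return std::make_shared<persistent_connection_manager>();
        return std::make_shared<simple_connection_manager>();
    }
};
```

The point of the factory is exactly what the text describes: the client holds only a shared_ptr<connection_manager> and never needs to know which concrete manager it got.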
Maybe with this design we may even be able to implement stream-like response objects which support chunked reading of data. With C++0x we may be able to get away with code like this on the client side:

  http::client c(/* init options */);
  auto stream = c.get("http://some.site.com/streamed", http::streamed);
  while (true) {
      auto chunk = read(stream, 1024); // blocking read of 1KB of the streamed body
      if (size(chunk) > 0) { /* deal with the chunk */ }
      else break;
  }

I'd even like to support Iterator/Range semantics too:

  http::client c(/* init options */);
  auto stream = c.get("http://some.site.com/streamed", http::streamed);
  copy(begin(stream), end(stream), ostream_iterator<char>(cout, ""));

At this stage of the game though, it's going to take some work to get to this level of expressiveness and simplicity on the client side. Internally we can go as complex and powerful as we want, but the premium has to be put on the client code being easy to read and write.

Sorry for the long post, but do you guys think we can get something done on this front that we can merge to trunk and then have released as version 0.4? Hope to hear from you soon!

(BTW, please feel free to respond and change the subject line to indicate which part of this post you're responding to -- I didn't feel like sending a lot of emails with disparate topics being discussed in different threads, because I felt this is all related in some manner. Thanks for understanding. :D)

--
Dean Michael Berris
blog.cplusplus-soup.com | twitter.com/mikhailberis
linkedin.com/in/mikhailberis | facebook.com/dean.berris | deanberris.com
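P.S. The Iterator/Range usage above can be approximated today by giving the streamed response free begin()/end() functions that hand back std::istreambuf_iterator<char>. This is only a sketch under the assumption that the body is exposed as a std::istream; streamed_response and its members are hypothetical names, and a real implementation would pull chunks off the socket lazily instead of buffering a string.

```cpp
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical streamed response; an istringstream stands in for the
// network stream that a real chunked-reading implementation would wrap.
struct streamed_response {
    std::istringstream body;
    explicit streamed_response(std::string const & data) : body(data) {}
};

// Free begin()/end() found by argument-dependent lookup, so generic
// algorithms like std::copy(begin(r), end(r), out) just work.
std::istreambuf_iterator<char> begin(streamed_response & r) {
    return std::istreambuf_iterator<char>(r.body);
}

std::istreambuf_iterator<char> end(streamed_response &) {
    return std::istreambuf_iterator<char>();  // end-of-stream sentinel
}
```

With that in place, the copy-to-cout one-liner from the example reads the body straight through without the explicit while loop.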