From: Dean M. B. <mik...@gm...> - 2007-05-21 19:42:25

Hi Peter!

On 21 May 2007 20:02:02 +0200, Peter Simons <si...@cr...> wrote:
>
> No, I think rfc2822 is a perfect name for that hierarchy. My
> problem is with the "network" part, because the code is about
> text parsing, not about networks. The location I'd prefer
> would be more along the lines of "boost/spirit/rfc2822",
> "boost/parsers/rfc2822", or even "/boost/ietf/rfc/2822/".
> Whatever. The point is that putting this code into a network
> hierarchy doesn't feel right, which is why I suggested a
> different organization that offers more granularity.
>

Ah, now it makes more sense. :) I agree, this would be a good
collection which I think is worthy of inclusion into Boost, either
as part of Spirit or as a package in itself. :)

> I guess there would be code inside of "boost/network" that
> would depend on those parsers, but the parsers don't depend on
> "boost/network". So I feel they shouldn't necessarily be a
> part of it.
>

I agree completely. :)

>
> > [scope of boost.network]
>
> Alright, thank you for sharing your ideas. From what I can
> tell, the things we ought to develop in this project are
> general purpose HTTP and SMTP drivers that can be re-used in
> clients and servers. I have implemented a (very rough) HTTP
> server here:
>
>   http://git.cryp.to/?p=mini-httpd
>

Yes, you're right. :) I'll go take a look in a while, but basically
the idea I'm playing around with in my head is something to the
effect of having a class like:

  namespace http {
    template <
      class Base,
      class ProcessingPolicy = policies::async_processing,
      class ThreadingPolicy = policies::default_threading
    >
    struct server : Base, ProcessingPolicy, ThreadingPolicy {
      // all boilerplate, and basic HTTP server specific code
    private:
      // ... require a handle function from the parent class
      // which processes a boost::network::message
      void _call_handle(boost::network::message message_) {
        typename ThreadingPolicy::scoped_lock _(*this);
        Base::handle(message_);
      };
    };
  }; // namespace http

Which would be used like so:

  class my_handler {
  public:
    void handle(boost::network::message const & msg) {
      // manipulate the message however i want to
    };
  };

  {
    http::server<my_handler> handler_instance;
    boost::thread http_thread(handler_instance);
    http_thread.join();
  }

So the goal would be to allow people to make "Custom HTTP Services"
without having to worry too much about the details of an HTTP Server
implementation (which of course under the hood requires Boost.Asio).

> I would very much appreciate feedback concerning that daemon,
> by the way. The latest "snapshot" comes with a
> Autoconf/Automake build system that should -- in theory --
> work on most Unix machines. My understanding is, however, that
> there is at least one more project (libhttpd?) that tries to
> do the same thing.
>

I'd take a look when I get the time... For the week I'm being taken
over by the day job... So I doubt I can do that within this week.

Maybe others can give Peter a hand at evaluating/profiling/testing
the daemon? :-)

> Concerning SMTP, I have no working implementation, but I do
> have a strong interest in the subject:
>
>   http://postmaster.cryp.to/
>
> That daemon is written in Haskell. That was great fun, but I'd
> like to turn that design into working C++ code, eventually.
>

Nice! Pretty much the above sample is what I'd use to do something
like this in C++ (using the Parametric Base Class Pattern or PBCP, as
Joel de Guzman puts it in the documentation of Phoenix actors).

>
> > Components we anticipate are (aside from the message class):
> >
> > - asynchronous name resolver
>
> Yes, an asynchronous DNS resolver is badly missed.
> Implementing the DNS protocol from scratch is, however, a
> significant effort. It might be wise to base our efforts on
> one of the existing libraries, e.g.:
>
>   http://www.chiark.greenend.org.uk/~ian/adns/
>   http://daniel.haxx.se/projects/c-ares/
>

Chris Kohlhoff has proposed a way of using future<> and promise<>
objects to implement asynchronous DNS lookups using a Singleton Active
Object DNS resolver -- of course, using Boost.Asio still under the
hood. So I intend to use that as the (simple) implementation of the
asynchronous DNS resolver.

>
> > - "synchronous timeout reader/writer"
>
> I'm not quite sure what your description refers to. I've found
> that convenience suggests another layer on top of Asio in order
> to further separation of application code and i/o code. My
> best bet into that direction so far is visible here:
>
>   http://git.cryp.to/?p=mini-httpd;a=blob;f=io.hpp;hb=master
>   http://git.cryp.to/?p=mini-httpd;a=blob;f=io-driver.cpp;hb=master
>

The idea is to be able to do a:

  try {
    timeout_read(socket_ptr, buffer, boost::posix_time::seconds(30));
  } catch (boost::network::read::timeout & e) {
    // the read has timed out.
  };

Which is sorely missed in Boost.Asio. I've implemented something
similar based on Chris Kohlhoff's advice as well, which I intend to
make part of the library. :)

>
> Concerning the message class, I have to admit to being
> reserved about the use of operator<<() to construct headers.
> In my humble opinion, the statement
>
>   msg << header("SOME_HEADER", "SOME_CONTENT") ;
>
> should rather read:
>
>   msg["SOME_HEADER"] = "SOME_CONTENT";
>   msg["SOME_HEADER"] += "SOME_MORE_CONTENT";
>

It's pretty trivial to make your suggestion above happen, and the
only reason I use operator<<() is that it allows me to "pass a
directive to a message". In the case of assignment of values as you
have described above, this does make sense and I agree we should
support that usage as well. However, the particular use case below
makes it possible to do pretty funky things with message
construction:

  msg << header("SOME_HEADER", "SOME_CONTENT")
      << transform(match_header("SOME_HEADER"),
                   boost::bind(boost::to_lower(_1))) // using bind
      << mutate(body(), arg1 = "Hello, World!") // using phoenix as well
      ;

So yes, though

  msg["SOME_HEADER"] = "SOME_CONTENT";

might work if the internal representation of a message is a
std::map<...>, I'm not entirely sure this makes sense for a multimap
(like what we're currently using). The current recommended method of
accessing the headers of a message is as described in the test case:

  headers_range<message>::type range = headers(msg)["SOME_HEADER"];

which is why I'm apprehensive about allowing msg["HEADER"] -- because
it doesn't convey that you're manipulating/accessing the headers of
the message, rather attributes of the message. Besides, msg["HEADER"]
doesn't make sense for a multimap as described above. :)

> Use of operator<<() for constructing bodys is fine though. One
> thing I'm not quite sure about yet is whether that message API
> holds up to incremental construction and consumption, e.g. in
> a non-blocking I/O model.
>

Right... But this is all keeping in mind that we'll be using the
construction routines in a spirit action:

  body_parser_p[
    var(msg) << body(construct<std::string>(arg1, arg2))
  ]; // using phoenix

I'm not sure about how asynchronous construction will be affected
though... In the traditional Boost.Asio asynchronous classes, you
typically have handlers which are scheduled/called asynchronously.
That means if you have a method "handle_1" and "handle_2",
constructing a message asynchronously based on network protocols
would be trivial:

  void handle_1(std::string some_string, message & m) const {
    m << header("1", some_string);
  }

  void handle_2(std::string some_string, message & m) const {
    m << header("2", some_string);
  }

Now locking would have to be involved somewhere here, which can be
taken care of by boost::asio::strands for handlers that perform
message manipulations.

> I also feel that usage of the type
>
>   std::basic_string<char, std::char_traits<char>, std::allocator<char> >
>
> shouldn't be hard-coded in messages. I guess it's fine to
> commit to 'char' as a character type (although unnecessary),
> but people will definitely want to choose their own
> allocators. Specifying different trait classes per message
> type might also be useful. RFC headers typically are
> case-insensitive, for example. The trait class attached below
> might be useful in that context.
>

Yup, the next stages of refactoring should address this. I guess
these should be template parameters to the basic_message class, where
reasonable defaults are set. That being said, the other
types/directives will have to support the type change to the message
class. :D

> Best regards,
> Peter
>

Thank you very much for the insights! They definitely open my eyes
wide to other issues I tend to overlook. :)

--
Dean Michael C. Berris
http://cplusplus-soup.blogspot.com/
mikhailberis AT gmail DOT com
+63 928 7291459
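
The policy-based server idea in the message above can be tried out without Boost. What follows is only a minimal sketch under stated assumptions: `message`, `locked_threading`, `single_threaded`, `dispatch` and `counting_handler` are hypothetical names invented for illustration, `std::mutex` (C++11) stands in for whatever locking the real library would use, and the I/O layer is left out entirely.

```cpp
#include <cassert>
#include <mutex>
#include <string>

// Hypothetical stand-in for boost::network::message.
struct message {
    std::string body;
};

namespace policies {

// Serializes handler calls with a mutex (assumed multi-threaded default).
class locked_threading {
public:
    class scoped_lock {
    public:
        explicit scoped_lock(locked_threading& policy) : guard_(policy.mutex_) {}
    private:
        std::lock_guard<std::mutex> guard_;
    };
private:
    std::mutex mutex_;
};

// Single-threaded builds: the "lock" is an empty object.
struct single_threaded {
    struct scoped_lock {
        explicit scoped_lock(single_threaded&) {}
    };
};

} // namespace policies

namespace http {

// The handler is the parametric base class; threading is a second policy.
template <class Base, class ThreadingPolicy = policies::locked_threading>
struct server : Base, ThreadingPolicy {
    // Would be called by the (omitted) I/O layer once a message is parsed.
    void dispatch(message const& msg) {
        typename ThreadingPolicy::scoped_lock lock(*this);
        Base::handle(msg);
    }
};

} // namespace http

// Example handler: counts messages and remembers the last body.
struct counting_handler {
    int handled = 0;
    std::string last_body;
    void handle(message const& msg) {
        ++handled;
        last_body = msg.body;
    }
};

// Usage: the same handler works with either threading policy.
inline void policy_server_example() {
    http::server<counting_handler> threaded;
    http::server<counting_handler, policies::single_threaded> single;
    message m;
    m.body = "hello";
    threaded.dispatch(m);
    single.dispatch(m);
    assert(threaded.handled == 1 && single.handled == 1);
}
```

Because the threading policy is a template parameter, a single-threaded target pays nothing for locking, which is the trade-off raised later in the thread.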
From: Glyn M. <gly...@gm...> - 2007-05-21 20:02:57

Guys,

On 21/05/07, Dean Michael Berris <mik...@gm...> wrote:
>
> Hi Peter!
>
> On 21 May 2007 20:02:02 +0200, Peter Simons <si...@cr...> wrote:
> >
> > No, I think rfc2822 is a perfect name for that hierarchy.
>
> Ah, now it makes more sense. :) I agree, this would be a good
> collection which I think is worthy of inclusion into Boost, either
> as part of Spirit or as a package in itself. :)

FWIW I'm inclined to agree.

> > > [scope of boost.network]
> >
> > Alright, thank you for sharing your ideas. From what I can
> > tell, the things we ought to develop in this project are
> > general purpose HTTP and SMTP drivers that can be re-used in
> > clients and servers. I have implemented a (very rough) HTTP
> > server here:
> >
> >   http://git.cryp.to/?p=mini-httpd
> >

I've taken a brief look at the code, but I'll have to look at it in
more depth to make more comments. But I do notice you're using John
Torjo's logging library. I may have missed something, but will this
be present in a future version of Boost? Is it your intention to use
this for boost.network too?

<snip code>

> So the goal would be to allow people to make "Custom HTTP Services"
> without having to worry too much about the details of an HTTP Server
> implementation (which of course under the hood requires Boost.Asio).

That's an admirable goal.

> I'd take a look when I get the time... For the week I'm being taken
> over by the day job... So I doubt I can do that within this week.
>
> Maybe others can give Peter a hand at evaluating/profiling/testing the
> daemon? :-)

Yes, I intend to, but I will be rather busy this week and I'll be
abroad over the weekend, so I can't promise too much at the moment.

Regards,
Glyn
From: Peter S. <si...@cr...> - 2007-05-22 22:32:33

Hi guys!

 > [The idea for a message class is that it can be used like so:]
 >
 >   class my_handler {
 >     void handle(boost::network::message & msg) {
 >       // manipulate the message however i want to
 >     };
 >   };

I wonder at what times during the HTTP transmission this handler
function would be invoked? From the server's perspective, the
relevant state changes occur here:

 1) Complete HTTP header was received. Callback has a chance to
    say "error response" without waiting for the body to be
    transmitted. (Errors generally have "Connection: close" and
    can be written early in the HTTP dialogue.)

 2) Body data is received.

    a) Body still incomplete? Then the handler has the chance to
       do something with the data we already have. If possible,
       the handler should take ownership of the memory because
       bodies may be very large ("upload") and the HTTP doesn't
       want to buffer all that in its entirety.

    b) Body complete? Invoke the handler so that it has the
       chance to put a response into the output buffer.

 3) Output buffer is empty.

    a) Does the handler have more data that needs to be written?
       Then go back to (3).

    b) The transaction is complete.

Does that make any sense?

What is the situation regarding protocols other than HTTP? SMTP,
for instance, has no real use for the boost::network::message
class because the payload is opaque for the server. There even is
an ESMTP extension CHUNKING (a.k.a "Binary MIME") that treats the
payload as an 8-bit octet stream without further structure. An
e-mail can be represented as a boost::network::message, but an
SMTP server doesn't really need it.

Applications that do need 'message', however, would be mail
readers, news readers, and all kinds of mail-to-something-else
converters. That is a fairly large market, given what kind of
devices can deal with e-mail these days.

 >   http::server<my_handler> handler_instance;
 >   boost::thread http_thread(handler_instance);
 >   http_thread.join();

How would the http::server<T> class mentioned here be used in a
single-threaded environment? Not every device that knows how to
speak IP also knows how to do multi-threading. In my humble
opinion, multi-threading should be an option, but no requirement.

How do others feel about this subject?

 >> An asynchronous DNS resolver is badly missed. Implementing
 >> the DNS protocol from scratch is, however, a significant
 >> effort. It might be wise to base our efforts on one of the
 >> existing libraries, e.g.:
 >>
 >>   http://www.chiark.greenend.org.uk/~ian/adns/
 >>   http://daniel.haxx.se/projects/c-ares/
 >
 > Chris Kohlhoff has proposed a way of using future<> and promise<>
 > objects to implement asynchronous DNS lookups using a Singleton Active
 > Object DNS resolver.

Exactly, a DNS server invokes the proper handler function every
time a response is received, and because of the stateless nature
of the DNS protocol, there is no need for having more than one
resolver per process. What I don't understand is how these
insights get anyone any closer to implementing the DNS protocol.

Neither Adns nor C-Ares care what I/O main loop drives them; both
libraries are concerned with interpreting DNS datagrams. My
personal opinion is that it would be an incredible waste of time
to re-invent that wheel. I noticed, though, that Chris has a
knack for re-inventing perfectly good wheels every now and then.
I guess it's actually quite likely that he'll end up
re-implementing DNS. I'll let myself be surprised.

 > try {
 >   timeout_read(socket_ptr, buffer, boost::posix_time::seconds(30));
 > } catch (boost::network::read::timeout & e) {
 >   // the read has timed out.
 > };

I don't think that function can exist in a non-blocking I/O
model. Unless, of course, setjmp() / longjmp() counts as a
solution. ;-)

 > msg["SOME_HEADER"] = "SOME_CONTENT" might work if the internal
 > representation of a message is a std::map<...>, I'm not
 > entirely sure this makes sense for a multimap (like what we're
 > currently using).

It's difficult to say whether a map or a multi-map is the best
choice. The problem is that some protocols require a multi-map
and some protocols don't. An e-mail may have any number of
"Received:" lines. In HTTP, however, headers are generally
unique.

Anyway, this is a stimulating conversation. It's fun to think and
talk about these topics.

Best regards,
Peter
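
The map versus multi-map question, together with the earlier remark that RFC header names are case-insensitive, can be illustrated with the standard library alone. This is a sketch under assumptions, not the library's actual representation: `iless`, `header_map`, `values_of` and `set_unique` are hypothetical names, and the comparison only handles ASCII case folding.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Orders header names without regard to ASCII case.
struct iless {
    bool operator()(std::string const& a, std::string const& b) const {
        return std::lexicographical_compare(
            a.begin(), a.end(), b.begin(), b.end(),
            [](unsigned char x, unsigned char y) {
                return std::tolower(x) < std::tolower(y);
            });
    }
};

// One container for both kinds of protocol: repeated names are allowed.
typedef std::multimap<std::string, std::string, iless> header_map;

// Returns every value stored under `name`, in insertion order.
inline std::vector<std::string> values_of(header_map const& headers,
                                          std::string const& name) {
    std::vector<std::string> out;
    auto range = headers.equal_range(name);
    for (auto it = range.first; it != range.second; ++it)
        out.push_back(it->second);
    return out;
}

// Map-like assignment on top of the multimap: replaces existing values.
inline void set_unique(header_map& headers, std::string const& name,
                       std::string const& value) {
    headers.erase(name);
    headers.insert(std::make_pair(name, value));
}

// Usage: e-mail style repeated headers and HTTP style unique headers.
inline void header_map_example() {
    header_map h;
    h.insert(std::make_pair("Received", "from a"));
    h.insert(std::make_pair("received", "from b"));
    set_unique(h, "Content-Type", "text/plain");
    set_unique(h, "content-type", "text/html");
    assert(values_of(h, "RECEIVED").size() == 2);
    assert(values_of(h, "Content-Type").size() == 1);
}
```

A protocol with unique headers would call only `set_unique`, while one with repeated headers inserts directly, so the multimap covers both cases at the cost of a slightly less obvious assignment syntax.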
From: Glyn M. <gly...@gm...> - 2007-05-23 08:48:07

Guys,

On 23 May 2007 00:32:20 +0200, Peter Simons <si...@cr...> wrote:
>
> How would the http::server<T> class mentioned here be used in a
> single-threaded environment? Not every device that knows how to
> speak IP also knows how to do multi-threading. In my humble
> opinion, multi-threading should be an option, but no requirement.
>
> How do others feel about this subject?

I think it could be possible to implement multi-threading as a
policy. I certainly agree with Peter that this is optional.

> Anyway, this is a stimulating conversation. It's fun to think and
> talk about these topics.

It certainly is. I'm getting new insights into network programming.
It will be interesting to see what we can produce.

G
From: Peter S. <si...@cr...> - 2007-05-22 22:49:32

Hi Glyn,

you wondered:

 > I do notice [mini-httpd uses] John Torjo's logging library. I
 > may have missed something but will this be present in a future
 > version of boost?

As far as I know, the original Boost.Log library was rejected
during the review. A new proposal recently showed up on the
developer's mailing list. I don't know much about it, though.
Personally, I access logging functions only through macros in my
programs to make sure the source code doesn't depend on any
concrete library. I hardly ever need sophisticated features. On
Unix systems, all I really need is syslog(3). :-)

 > Is it your intention to use this for boost.network too?

No. My preference would be that logging is the responsibility of
the users of this library. With the exception of debugging, I
wouldn't know what kind of log messages a networking library
would need to write. Providing functions that pretty-print data
structures into strings, say an Apache-style access log entry, is
one thing, but actually writing these strings to some output
device is another thing. How to do that is a problem ::main()
should solve.

Does that sound reasonable?

Best regards,
Peter
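
The split described above, where the library only formats and the application decides where the text goes, might look roughly like this. It is a sketch under assumptions: `access_record` and `format_access_log` are hypothetical names, and the layout follows the Apache Common Log Format with the identity and user fields left as "-".

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical record of one finished HTTP transaction.
struct access_record {
    std::string remote_host;
    std::string timestamp;     // already formatted by the caller
    std::string request_line;
    int status;
    unsigned long bytes_sent;
};

// Library side: pretty-print into a string, never touch an output device.
inline std::string format_access_log(access_record const& r) {
    std::ostringstream os;
    os << r.remote_host << " - - [" << r.timestamp << "] \""
       << r.request_line << "\" " << r.status << ' ' << r.bytes_sent;
    return os.str();
}

// Usage: the application picks the sink (syslog, a file, std::clog, ...).
inline void access_log_example() {
    access_record r = {"127.0.0.1", "23/May/2007:00:49:24 +0200",
                       "GET / HTTP/1.1", 200, 512};
    std::string line = format_access_log(r);
    assert(line.find("\"GET / HTTP/1.1\" 200 512") != std::string::npos);
}
```

Keeping the formatter free of I/O means the library needs no logging dependency at all, and a caller that wants syslog(3) or a logging framework simply passes the returned string along.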
From: Glyn M. <gly...@gm...> - 2007-05-23 08:48:48

Peter,

On 23 May 2007 00:49:24 +0200, Peter Simons <si...@cr...> wrote:
> > Is it your intention to use this for boost.network too?
>
> No. My preference would be that logging is the responsibility of
> the users of this library. With the exception of debugging, I
> wouldn't know what kind of log messages a networking library
> would need to write. Providing functions that pretty-print data
> structures into strings, say an Apache-style access log entry, is
> one thing, but actually writing these strings to some output
> device is another thing. How to do that is a problem ::main()
> should solve.
>
> Does that sound reasonable?

Yes. I think more generally it's important at this stage to determine
clearly our policies w.r.t. things like logging, thread-safety and
security, because we have to ensure that we don't hinder
boost.network users from creating what Dean described as "Custom Web
Services". Someone is always going to have, for example, complex
logging requirements. Do we just provide a minimal system and say
"X, Y and Z is up to you", or is it possible to anticipate as far as
possible what library users might need? I'd prefer, if possible, the
former and I think everyone agrees with this, but it does require
that we be careful.

Forgive the sometimes basic questions, I am not as experienced with
C++ networking as other people on this list and I'm trying to
understand how this will work conceptually.

G
From: Dean M. B. <mik...@gm...> - 2007-05-23 14:28:47

Hi Glyn!

On 5/23/07, Glyn Matthews <gly...@gm...> wrote:
>
> On 23 May 2007 00:49:24 +0200, Peter Simons <si...@cr...> wrote:
> > > Is it your intention to use this for boost.network too?
> >
> > No. My preference would be that logging is the responsibility of
> > the users of this library. With the exception of debugging, I
> > wouldn't know what kind of log messages a networking library
> > would need to write. Providing functions that pretty-print data
> > structures into strings, say an Apache-style access log entry, is
> > one thing, but actually writing these strings to some output
> > device is another thing. How to do that is a problem ::main()
> > should solve.
> >
> > Does that sound reasonable?
>
> Yes. I think more generally its important at this stage to determine
> clearly our policies w.r.t. things like logging, thread-safety, security
> because we have to ensure that we don't hinder boost.network users from
> creating what Dean described as "Custom Web Services". Someone is always
> going to have, for example, complex logging requirements. Do we just
> provide a minimal system and say "X, Y and Z is up to you" or is it possible
> to anticipate as far as possible what library users might need? I'd prefer,
> if possible, the former and I think everyone agrees with this, but it does
> require that we be careful.
>

I agree with providing a minimal yet comprehensive network library
completely. Logging is not within the realm of network operations --
and I have absolutely no intention of getting into (say, even near)
the Boost.Log discussion, which I honestly think is going nowhere.

Let's let the users choose how they want to log things and just throw
meaningful exceptions when there are exceptional conditions met in
the code. :)

> Forgive the sometimes basic questions, I am not as experienced with C++
> networking as other people on this list and I'm trying to understand how
> this will work conceptually.
>

I don't know about the others, but I certainly appreciate well asked
questions. :) I don't necessarily love answering, but I do love
questions nonetheless. :D

So please, if you have questions, don't hesitate to ask away. :-)

--
Dean Michael C. Berris
http://cplusplus-soup.blogspot.com/
mikhailberis AT gmail DOT com
+63 928 7291459
From: Dean M. B. <mik...@gm...> - 2007-05-23 14:20:25

Hi Peter!

On 23 May 2007 00:32:20 +0200, Peter Simons <si...@cr...> wrote:
>
> > [The idea for a message class is that it can be used like so:]
> >
> >   class my_handler {
> >     void handle(boost::network::message & msg) {
> >       // manipulate the message however i want to
> >     };
> >   };
>
> I wonder at what times during the HTTP transmission this handler
> function would be invoked? From the server's perspective, the
> relevant state changes occur here:
>

It's too early to tell when, but I'll try and put down my thoughts on
the suggestions you have below. I must admit, I was just thinking of
the simplest case in the example I gave in my other message's
implementation. :)

> 1) Complete HTTP header was received. Callback has a chance to
>    say "error response" without waiting for the body to be
>    transmitted. (Errors generally have "Connection: close" and
>    can be written early in the HTTP dialogue.)
>

This can be done, and implemented as a strategy part of the
http::server<>'s template parameters. A
"policies::header_complete_processor<...>" can be used as such:

  namespace http {
    template <class HandlerBase,
      class HeaderCompleteProcessor = policies::header_complete_ignore<void>,
      class BodyCompleteProcessor = policies::body_complete_processor<HandlerBase>,
      class StreamingProcessor = policies::streaming_processor<void>
    >
    struct server : forward_to_real_impl<HandlerBase,
        HeaderCompleteProcessor,
        BodyCompleteProcessor,
        StreamingProcessor>::type
    { };
  }; // namespace http

> 2) Body data is received.
>
>    a) Body still incomplete? Then the handler has the chance to
>       do something with the data we already have. If possible,
>       the handler should take ownership of the memory because
>       bodies may be very large ("upload") and the HTTP doesn't
>       want to buffer all that in its entirety.
>
>    b) Body complete? Invoke the handler so that it has the
>       chance to put a response into the output buffer.
>

This behavior can be enabled in as much as the policies above
provide. It should be noted that a different handler can be provided
for when the headers are complete, and when the body is already
complete -- and yet another when the body is being processed as the
data is streaming in. These different handlers could be
differentiated by providing wrapper types (for metaprogramming) like
suggested above:

  policies::header_complete_processor<my_header_processor>
  policies::body_complete_processor<my_body_processor>
  policies::streaming_processor<my_stream_validator>

These policy classes can either be inherited from (which isn't the
way I'd personally like it used) or encapsulated and used as instance
variables in the http::server type. This now all depends on how the
forwarding type behaves -- it could also be noted that this is
currently still entirely up to us. ;-)

> 3) Output buffer is empty.
>
>    a) Does the handler have more data that needs to be written?
>       Then go back to (3).
>
>    b) The transaction is complete.
>
> Does that make any sense?
>

#3 I don't quite understand (yet). Maybe a couple more hours sleep
and a good re-read would do me good. :)

But I think the callback mechanism is missing a
handle/reference/smart_ptr to the actual socket from which the
message came. So I guess the handler's signature should look
something like:

  struct my_handler {
    void operator() (boost::network::message const & message,
                     http::socket_ptr_type socket) {
      // do whatever needs to be done
    }
  };

> What is the situation regarding protocols other than HTTP? SMTP,
> for instance, has no real use for the boost::network::message
> class because the payload is opaque for the server. There even is
> an ESMTP extension CHUNKING (a.k.a "Binary MIME") that treats the
> payload as an 8-bit octet stream without further structure. An
> e-mail can be represented as a boost::network::message, but an
> SMTP server doesn't really need it.
>

SMTP servers, I am still a bit hazy on the implementation. But what I
think is important (as well as in the case of HTTP) is to look at
SMTP processing from the client perspective. The SMTP server can, and
as you suggest (and I think should), create a
boost::network::message which is a representation of the SMTP session
-- and have the body contain a serialized boost::network::message
still encapsulating the email message and/or the attachments or
whatnot.

In the same light, an SMTP client implementation would just require
one boost::network::message which represents the SMTP session, where
the body is a serialized boost::network::message. If we intend to
send many messages over a single link, we could enable that with the
following technique:

  { // open a scope for the smtp connection
    smtp::session my_persistent_session(
        smtp::server("mail.google.com", 575, boost::network::tls),
        smtp::credentials("username", "password"));
    boost::network::message m;
    m << header("To", "mik...@gm...")
      << header("From", "so...@th...mewhere.nw")
      << destination("mik...@gm...")
      << source("so...@th...mewhere.nw")
      << header("Subject", "This is a test message.")
      << body("The quick brown fox jumps over the lazy dog")
      ;
    for (unsigned int i = 0; i < 100; ++i) // send 100 times
      smtp::send_message(my_persistent_session, m);
  } // close the scope for the smtp connection

> Applications that do need 'message', however, would be mail
> readers, news readers, and all kinds of mail-to-something-else
> converters. That is a fairly large market, given what kind of
> devices can deal with e-mail these days.
>

I agree completely, and the library should be able to work on these
devices. :)

> >   http::server<my_handler> handler_instance;
> >   boost::thread http_thread(handler_instance);
> >   http_thread.join();
>
> How would the http::server<T> class mentioned here be used in a
> single-threaded environment? Not every device that knows how to
> speak IP also knows how to do multi-threading. In my humble
> opinion, multi-threading should be an option, but no requirement.
>

I agree. Threading can be "yet another policy", and I tend to err on
the side of caution more than anything else. I'd like to explicitly
say that the library is multi-threading -- so that we support the
larger (I think) more dominant target which are
desktop/infrastructure developers -- then provide mechanisms for
single threaded operations.

> How do others feel about this subject?

I'm no expert on this, and would defer this to the community at large
-- those evaluating and using the library. If there's a significant
size of the user base looking for single threaded modes for the
servers/clients, I would be inclined to consider providing single
threaded implementations.

However, I would jump the gun and say that it shouldn't be impossible
to provide single-threaded implementations of the servers/client. :-)

>
> > Chris Kohlhoff has proposed a way of using future<> and promise<>
> > objects to implement asynchronous DNS lookups using a Singleton Active
> > Object DNS resolver.
>
> Exactly, a DNS server invokes the proper handler function every
> time a response is received, and because of the stateless nature
> of the DNS protocol, there is no need for having more than one
> resolver per process. What I don't understand is how these
> insights get anyone any closer to implementing the DNS protocol.
>

Remember though that Chris already has a resolver in Boost.Asio... So
I don't think we need to implement the DNS protocol -- unless you're
suggesting that we actually have a dns::server<> template, which I
think really is beyond the scope of the library ;-).

> Neither Adns nor C-Ares care what I/O main loop drives them; both
> libraries are concerned with interpreting DNS datagrams. My
> personal opinion is that it would be an incredible waste of time
> to re-invent that wheel. I noticed, though, that Chris has a
> knack for re-inventing perfectly good wheels every now and then.
> I guess it's actually quite likely that he'll end up
> re-implementing DNS. I'll let myself be surprised.
>

I think you'll like to look at Boost.Asio's name resolver... :D

>
> > try {
> >   timeout_read(socket_ptr, buffer, boost::posix_time::seconds(30));
> > } catch (boost::network::read::timeout & e) {
> >   // the read has timed out.
> > };
>
> I don't think that function can exist in a non-blocking I/O
> model. Unless, of course, setjmp() / longjmp() counts as a
> solution. ;-)
>

It's meant to be a "blocking read" (referring to timeout_read(...) ).
For non-blocking I/O, I don't think we need to deviate from how
Boost.Asio does the asynchronous read stuff. :D

>
> > msg["SOME_HEADER"] = "SOME_CONTENT" might work if the internal
> > representation of a message is a std::map<...>, I'm not
> > entirely sure this makes sense for a multimap (like what we're
> > currently using).
>
> It's difficult to say whether a map or a multi-map is the best
> choice. The problem is that some protocols require a multi-map
> and some protocols don't. An e-mail may have any number of
> "Received:" lines. In HTTP, however, headers are generally
> unique.
>

Actually, the reason I chose a multimap is precisely the reason you
cite: because some protocols need non-unique headers, while some
don't -- and since a multimap can work for either case (where the
transformations/validations will just deal with the details), I chose
to use a multimap for all cases. :-)

One of my partners at http://software.orangeandbronze.com/ and I were
talking about this in another project, where I chose to use a map
instead of a multi-map. Needless to say, his case for using a
multimap makes more sense, because it can do pretty much anything a
map can and then some. :)

> Anyway, this is a stimulating conversation. It's fun to think and
> talk about these topics.
>

Definitely. :)

> Best regards,
> Peter
>

Thanks for the insights! It definitely helps me make things clearer
in my head, and write down the thoughts (somewhere) albeit in an
"informal" manner. :)

--
Dean Michael C. Berris
http://cplusplus-soup.blogspot.com/
mikhailberis AT gmail DOT com
+63 928 7291459
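
The receive-side callbacks discussed in this exchange (header complete, body data arriving, body complete) can be sketched as a small driver that is fed whatever bytes the socket delivers. Everything here is an assumption made for illustration: `request_feeder`, `recording_handler` and the three callback names are hypothetical, the body length comes from a naive case-sensitive search for `Content-Length: `, and output buffering (point 3) is not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical handler that records which callbacks fired.
struct recording_handler {
    std::string header, body;
    int chunks = 0;
    bool complete = false;
    // Return false to reject the request before any body is consumed.
    bool on_header(std::string const& h) { header = h; return true; }
    void on_body_chunk(std::string const& data) { body += data; ++chunks; }
    void on_body_complete() { complete = true; }
};

template <class Handler>
class request_feeder {
public:
    explicit request_feeder(Handler& handler) : handler_(handler) {}

    // Feed the next chunk of bytes read from the socket.
    void feed(std::string const& chunk) {
        if (state_ == done || state_ == rejected) return;
        if (state_ == reading_body) { consume(chunk); return; }
        buffer_ += chunk;
        std::size_t end = buffer_.find("\r\n\r\n");
        if (end == std::string::npos) return;      // header still incomplete
        std::string rest = buffer_.substr(end + 4);
        buffer_.erase(end + 4);                    // keep only the header
        remaining_ = content_length(buffer_);
        if (!handler_.on_header(buffer_)) { state_ = rejected; return; }
        state_ = reading_body;
        consume(rest);
    }

    bool finished() const { return state_ == done; }

private:
    enum state { reading_header, reading_body, done, rejected };

    // Hands body bytes to the handler without buffering them here.
    void consume(std::string const& data) {
        std::string part = data.substr(0, remaining_);
        if (!part.empty()) {
            handler_.on_body_chunk(part);
            remaining_ -= part.size();
        }
        if (remaining_ == 0) { handler_.on_body_complete(); state_ = done; }
    }

    // Simplification: exact-case match only, no validation.
    static std::size_t content_length(std::string const& header) {
        std::string const key = "Content-Length: ";
        std::size_t pos = header.find(key);
        if (pos == std::string::npos) return 0;
        return static_cast<std::size_t>(
            std::strtoul(header.c_str() + pos + key.size(), nullptr, 10));
    }

    Handler& handler_;
    state state_ = reading_header;
    std::size_t remaining_ = 0;
    std::string buffer_;
};
```

Since the feeder forwards body data as it arrives, a handler that streams an upload to disk never forces the whole body into memory, which is the concern raised in point 2a.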
From: Peter S. <si...@cr...> - 2007-05-23 17:42:35
|
Hi Dean,

it feels like we have different perspectives on the subject of networking. I mostly care about the server side. You seem to have more experience with the client side. Consequently, our views disagree on several occasions. This is a good thing, because by finding a solution that we can agree on, we will come up with a solution that is better than any one either of us could have come up with alone.

The difficult bit is to handle disagreements constructively.

After thinking some ideas through, I feel that I might have been aiming too high. An HTTP implementation is beyond our scope. I see it like this: when I want to write a sophisticated web application, I don't bother with an HTTP server. I write a CGI or FastCGI that works with _any_ HTTP server. Even assuming that I would have to commit to one particular HTTP server, frankly, I'd probably write an Apache module instead of a policy-driven handler functor for Boost.Network.

I wonder what our users hope to find in this library when it's finished. Please, everyone feel encouraged to post a list of things you would personally like to have; things you miss when writing network-oriented C++ code. Here is what I'd like to have:

(1) A class to represent any kind of URI. I want to be able to read a URI by saying "Uri uri; cin >> uri;", and there it is. I would also like to be able to write open(uri) to obtain a stream socket that is connected to the address the URI represents. (libcurl does something like this in C.)

(2) URIs know "mail:user@domain", so clearly (1) requires a class to represent an e-mail address. I frequently need lists of addresses too.

(3) Parsers for all kinds of HTTP and E-Mail headers. I have "Date: Wed May 23 19:05:46 CEST 2007\r\n", and now I want to know what time that is. The same goes for "If-Modified-Since:" and a lot of other headers.

(4) A class to represent multi-part MIME messages. MIME is used in e-mail and in HTTP, yet in practice it's notoriously hard to handle. A good MIME Message class should support registering content-type-specific data handlers. Issuing a GPG certificate -- or verifying one -- is something such a user-supplied function could do.

If Boost.Network offers high-quality solutions for those problems, I'll be a user. I also feel that getting those classes implemented and documented up to the standards Boost expects will be a major effort that's measured in months, not weeks.

Anyway, we have also talked about DNS:

> Remember though that Chris already has a resolver in
> Boost.Asio...

Well, Asio has a wrapper for getnameinfo(). That is a portable system call that gives synchronous access to the resolver the OS uses. The system resolver is very convenient and great to have, but it is a synchronous resolver. A great many applications can't use it. Furthermore, the system resolver knows only about A records and PTR records. You can resolve names to addresses or addresses to names, but you cannot look up a mail exchanger or a TXT record. For all but the simplest applications, the system resolver is insufficient.

I feel an asynchronous DNS resolver is beyond the scope of Asio or Boost.Network. Providing convenient and type-safe access to the asynchronous resolvers other people have written, however, might be worthwhile.

Take care,
Peter |
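For illustration, the "Uri uri; cin >> uri;" idea from point (1) could look roughly like the sketch below. This is only a minimal sketch under stated assumptions: the `Uri` struct and its members `scheme` and `rest` are hypothetical names, not an existing Boost class, and the parser only splits at the first colon rather than implementing the full URI grammar.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical Uri type for illustration; not part of Boost.
struct Uri {
    std::string scheme;  // text before the first ':', e.g. "http"
    std::string rest;    // everything after the first ':'
};

// Reads one whitespace-delimited token and splits it at the first ':'.
// Sets failbit on the stream if the token has no scheme.
inline std::istream& operator>>(std::istream& in, Uri& uri) {
    std::string token;
    if (!(in >> token)) {
        return in;  // stream already in a failed state
    }
    const std::string::size_type colon = token.find(':');
    if (colon == std::string::npos || colon == 0) {
        in.setstate(std::ios::failbit);
        return in;
    }
    uri.scheme = token.substr(0, colon);
    uri.rest = token.substr(colon + 1);
    return in;
}
```

Because extraction reports errors through the stream state, a caller can test `if (std::cin >> uri)` in the usual iostream way. A real class would need to split `rest` further into authority, path, and query.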
From: Alex O. <al...@gm...> - 2007-05-23 18:40:23
|
Hello all

On 23 May 2007 19:42:26 +0200, Peter Simons <si...@cr...> wrote:
> Hi Dean,
>
> it feels like we have different perspectives on the subject of
> networking. I mostly care about the server side. You seem to have
> more experience with the client side. Consequently, our views
> disagree on several occasions. This is a good thing, because by
> finding a solution that we can agree on, we will come up with a
> solution that is better than any one either of us could have come
> up with alone.
>
> The difficult bit is to handle disagreements constructively.
>
> After thinking some ideas through, I feel that I might have been
> aiming too high. An HTTP implementation is beyond our scope. I
> see it like this: when I want to write a sophisticated web
> application, I don't bother with an HTTP server. I write a CGI or
> FastCGI that works with _any_ HTTP server. Even assuming that I
> would have to commit to one particular HTTP server, frankly, I'd
> probably write an Apache module instead of a policy-driven
> handler functor for Boost.Network.
>
> I wonder what our users hope to find in this library when it's
> finished. Please, everyone feel encouraged to post a list of
> things you would personally like to have; things you miss when
> writing network-oriented C++ code. Here is what I'd like to have:
>
> (1) A class to represent any kind of URI. I want to be able to
> read a URI by saying "Uri uri; cin >> uri;", and there it
> is. I would also like to be able to write open(uri) to
> obtain a stream socket that is connected to the address the
> URI represents. (libcurl does something like this in C.)
>
> (2) URIs know "mail:user@domain", so clearly (1) requires a
> class to represent an e-mail address. I frequently need
> lists of addresses too.

I'm not sure that mailto: should be treated as a URI in SMTP messages.

> (3) Parsers for all kinds of HTTP and E-Mail headers. I have
> "Date: Wed May 23 19:05:46 CEST 2007\r\n", and now I want to
> know what time that is. The same goes for
> "If-Modified-Since:" and a lot of other headers.
>
> (4) A class to represent multi-part MIME messages. MIME is used
> in e-mail and in HTTP, yet in practice it's notoriously hard
> to handle. A good MIME Message class should support
> registering content-type-specific data handlers. Issuing a
> GPG certificate -- or verifying one -- is something such a
> user-supplied function could do.

Yes, MIME support is important for all protocols nowadays.

In the parsers that we implemented at my previous job (currently closed source), we represented the message as a "lazy" structure that parses data only when we try to access the underlying elements (usually by name). So, for example, an HTTP request consists of:

    Method URL Version
    Headers
    empty line
    Body

Each of these elements had a parser associated with it, and it was possible to change parsers at runtime (especially for POSTs). For headers, there was also a list of Name -> parser mappings, which allowed us to transparently parse the data in headers and represent it either as a string or as a corresponding structure (whose members were also available via a named map). When members of the structure were changed, the text representation was rebuilt on request.

Here is a small example:

    node_type uri("uri", request["uri"].value(), new mimer::http::uri_nparser<std::string>);
    cout << "query=" << uri["query"].value() << std::endl;
    // .... process data ...
    request["uri"]=newUri; // modify it

--
With best wishes,
Alex Ott, MBA
http://alexott.blogspot.com/
http://alexott-ru.blogspot.com/
http://content-filtering.blogspot.com/
http://xtalk.msk.su/~ott/ |
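The "lazy" structure described above might be sketched as follows. This is not the closed-source code Alex refers to: `LazyField`, `parse_request_uri`, and the member names "path" and "query" are all hypothetical, chosen only to show a field that keeps its raw text and runs its associated parser on first access by name. Rebuilding the text after a modification is left out.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical lazily parsed field; all names here are illustrative.
class LazyField {
public:
    typedef std::map<std::string, std::string> Members;
    typedef std::function<Members(const std::string&)> Parser;

    LazyField(std::string raw, Parser parser)
        : raw_(std::move(raw)), parser_(std::move(parser)), parsed_(false) {}

    // Runs the parser on first access only, then looks up a named member.
    const std::string& operator[](const std::string& name) {
        if (!parsed_) {
            members_ = parser_(raw_);
            parsed_ = true;
        }
        return members_[name];
    }

    const std::string& value() const { return raw_; }  // unparsed text
    bool parsed() const { return parsed_; }

private:
    std::string raw_;
    Parser parser_;
    Members members_;
    bool parsed_;
};

// Toy parser (an assumption): splits "path?query" at the first '?'.
inline LazyField::Members parse_request_uri(const std::string& raw) {
    LazyField::Members out;
    const std::string::size_type q = raw.find('?');
    out["path"] = raw.substr(0, q);
    out["query"] = (q == std::string::npos) ? "" : raw.substr(q + 1);
    return out;
}
```

Storing the parser as a `std::function` is one way to allow swapping parsers at runtime, as described in the message; a per-header registry would simply map header names to such parser objects.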
From: Glyn M. <gly...@gm...> - 2007-05-24 07:14:36
|
Peter,

On 23 May 2007 19:42:26 +0200, Peter Simons <si...@cr...> wrote:
>
> I wonder what our users hope to find in this library when it's
> finished. Please, everyone feel encouraged to post a list of
> things you would personally like to have; things you miss when
> writing network-oriented C++ code.

I'll put my user's hat on for the moment:

1) A set of easy-to-use common protocols. When using C++ I find myself rewriting a lot of this kind of code.

2) The ability to create and use custom protocols within the network framework. In my current position, I find myself communicating with exotic hardware that uses its own protocols. I'd love to have a way to just use these within a C++ network framework.

3) Support for encryption and authentication.

Peter, I'd go along with all four points in your list too.

I use C++ a lot in my work and I look with envy at other languages which have much better standard library support for network programming. To my mind, this is a major flaw with the current C++ libraries, that there is no standard support for something as important as network programming. Of course, the recent discussions on this list demonstrate that it is far from simple even to decide what kind of support C++ programmers require, with approaches from both server and client perspectives.

Another difficulty is that programmers have quite different points of access (for want of a better term) to programming networking applications. What I mean by that is that different applications need to be written on different network layers, and it is important that a library expose an interface to each layer.

Anyway, we have also talked about DNS:

> I feel an asynchronous DNS resolver is beyond the scope of Asio
> or Boost.Network. Providing convenient and type-safe access to
> the asynchronous resolvers other people have written, however,
> might be worthwhile.

I agree with this. More generally, I think it's important that we provide simple, type-safe means for interfacing with other existing network C++ libraries and applications.

G |
From: Dean M. B. <mik...@gm...> - 2007-05-25 01:51:34
|
Hi Peter!

On 23 May 2007 19:42:26 +0200, Peter Simons <si...@cr...> wrote:
>
> it feels like we have different perspectives on the subject of
> networking. I mostly care about the server side. You seem to have
> more experience with the client side. Consequently, our views
> disagree on several occasions. This is a good thing, because by
> finding a solution that we can agree on, we will come up with a
> solution that is better than any one either of us could have come
> up with alone.
>

I agree. When I proposed this networking library, I was more concerned about implementing common client protocols, and have been pretty hazy about (or not very dedicated to) writing implementations for the server side. Your insights have opened my eyes to some issues I've failed to consider in the server-side implementation details, and now I'm doing a lot of re-thinking about what the most productive approach would be.

> The difficult bit is to handle disagreements constructively.
>

Indeed. I, however, am very open to suggestions and comments, so please keep them coming. I certainly appreciate the input at any level. :-)

> After thinking some ideas through, I feel that I might have been
> aiming too high. An HTTP implementation is beyond our scope. I
> see it like this: when I want to write a sophisticated web
> application, I don't bother with an HTTP server. I write a CGI or
> FastCGI that works with _any_ HTTP server. Even assuming that I
> would have to commit to one particular HTTP server, frankly, I'd
> probably write an Apache module instead of a policy-driven
> handler functor for Boost.Network.
>

On the server side, I agree that it might be too ambitious to implement the HTTP protocol and provide a "hook-into-skeleton" implementation. But I'm not afraid to try and fail in this regard. :-)

So we can definitely forgo the server side HTTP implementation. While I do agree that it would make sense to just write an application that talked FastCGI or whatnot instead of using an http::server<...>, in my work at Friendster I've seen that this does not scale well for thousands of connections at a time.

I am, however, looking more at a very flexible HTTP client library. One which any C++ programmer with STL experience will be able to use right off the bat with minimal mental cartwheels.

> I wonder what our users hope to find in this library when it's
> finished. Please, everyone feel encouraged to post a list of
> things you would personally like to have; things you miss when
> writing network-oriented C++ code. Here is what I'd like to have:
>
[snipped great list]

I'd like to add the following to that:

o A set of primitive networking routines like a "blocking read that times out" or a "self-reconnecting socket connection" and "a raw packet builder/sender".

o A minimal but comprehensive HTTP client implementation

o A minimal but comprehensive SMTP client implementation

o A flexible interface/framework where I can add more protocol implementations easily, or with very little effort (for example, if I want to implement a new binary protocol, I just need to add a few classes which talk a certain "language" and then leverage whatever Boost.Network already has)

So indeed, a URI class and a MIME handler would be good to have. I'm still not sure though how we could make that work without dynamic runtime programming (virtual, all that inefficient stuff).

>
> If Boost.Network offers high-quality solutions for those
> problems, I'll be a user. I also feel that getting those classes
> implemented and documented up to the standards Boost expects will
> be a major effort that's measured in months, not weeks.
>

Yes, but I am positive we can get a minimal HTTP/SMTP client going in a couple of weeks' time. ;-) So the sooner I'm able to commit some more stuff sitting on my machine, the better it would look as the days go by. :-)

If anybody also has stuff you'd like to put into the Subversion repository, please feel free to do so. The sooner we can evaluate code from other people, the faster this process could go. :-) (Take this as a call for help, if you can pick up from where the code currently is.)

> Anyway, we have also talked about DNS:
>
> > Remember though that Chris already has a resolver in
> > Boost.Asio...
>
> Well, Asio has a wrapper for getnameinfo(). That is a portable
> system call that gives synchronous access to the resolver the OS
> uses. The system resolver is very convenient and great to have,
> but it is a synchronous resolver. A great many applications can't
> use it. Furthermore, the system resolver knows only about A
> records and PTR records. You can resolve names to addresses or
> addresses to names, but you cannot look up a mail exchanger or a
> TXT record. For all but the simplest applications, the system
> resolver is insufficient.
>

Ah, okay. Then that would be a good thing to implement: if the DNS client that works on most platforms is inadequate, having one in Boost.Network would make sense. Unless there are already better implementations that do the job...

> I feel an asynchronous DNS resolver is beyond the scope of Asio
> or Boost.Network. Providing convenient and type-safe access to
> the asynchronous resolvers other people have written, however,
> might be worthwhile.
>

I think implementing a server would be out of bounds of the Boost.Network scope, but a header-only client can be very attractive. :-)

> Take care,
> Peter
>

Thanks very much Peter for the insights! :-) They definitely help. :-)

--
Dean Michael C. Berris
http://cplusplus-soup.blogspot.com/
mikhailberis AT gmail DOT com
+63 928 7291459 |
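On the question of adding protocols without virtual functions, one possible direction is to bind the protocol at compile time as a template parameter. The sketch below is only an assumption about how that could look: `basic_client`, `http_protocol`, `smtp_protocol`, and the static `request` function are hypothetical names, and the request strings are deliberately simplified.

```cpp
#include <cassert>
#include <string>

// Hypothetical protocol policies; each supplies the same static interface.
struct http_protocol {
    // Builds a simplified HTTP/1.0 GET request for the given target.
    static std::string request(const std::string& target) {
        return "GET " + target + " HTTP/1.0\r\n\r\n";
    }
};

struct smtp_protocol {
    // Builds a simplified SMTP MAIL FROM command for the given address.
    static std::string request(const std::string& target) {
        return "MAIL FROM:<" + target + ">\r\n";
    }
};

// The protocol is chosen at compile time, so no virtual dispatch is involved.
template <class Protocol>
class basic_client {
public:
    std::string build(const std::string& target) const {
        return Protocol::request(target);
    }
};
```

Adding a new protocol then means writing one more policy struct with the same static interface. The trade-off is that the protocol cannot be switched at runtime without an extra layer of type erasure.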