Re: [cpp-netlib-devel] Insights to the Message Class and Network Routines


Hi Peter!

On 23 May 2007 00:32:20 +0200, Peter Simons <si...@cr...> wrote:
>
>  > [The idea for a message class is that it can be used like so:]
>  >
>  > class my_handler  {
>  >   void handle(boost::network::message & msg) {
>  >     // manipulate the message however i want to
>  >   };
>  > };
>
> I wonder at what times during the HTTP transmission this handler
> function would be invoked? From the server's perspective, the
> relevant state changes occur here:
>

It's too early to tell when, but I'll try and put down my thoughts on
the suggestions you have below. I must admit, I was just thinking of
the simplest case in the example I gave in my other message's
implementation. :)

>  1) Complete HTTP header was received. Callback has a chance to
>     say "error response" without waiting for the body to be
>     transmitted. (Errors generally have "Connection: close" and
>     can be written early in the HTTP dialogue.)
>

This can be done by implementing it as a strategy that is part of
http::server<>'s template parameters. A
"policies::header_complete_processor<...>" could be used like so:

namespace http {
  template <class HandlerBase,
    class HeaderCompleteProcessor = policies::header_complete_ignore<void>,
    class BodyCompleteProcessor = policies::body_complete_processor<HandlerBase>,
    class StreamingProcessor = policies::streaming_processor<void>
  > struct server
  : forward_to_real_impl<HandlerBase,
    HeaderCompleteProcessor,
    BodyCompleteProcessor,
    StreamingProcessor>::type { };
} // namespace http
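
To make the policy idea a bit more concrete, here is a minimal
standalone sketch. It is not the library's design: everything in
namespace "sketch" (request, counting_handler, reject_empty_headers,
the on_header_complete/on_body_complete member names) is hypothetical,
the streaming policy is left out for brevity, and the policies are
held as members rather than going through forward_to_real_impl.

```cpp
#include <cassert>
#include <string>

namespace sketch {

// Hypothetical stand-in for a parsed HTTP request.
struct request {
  std::string headers;
  std::string body;
};

namespace policies {

// Default header policy: accept every request and wait for the body.
struct header_complete_ignore {
  bool on_header_complete(request const &) { return true; }
};

// Holds the user's handler and calls it once the body is complete.
template <class Handler>
struct body_complete_processor {
  Handler handler;
  void on_body_complete(request const & r) { handler.handle(r); }
};

} // namespace policies

template <class HandlerBase,
  class HeaderCompleteProcessor = policies::header_complete_ignore,
  class BodyCompleteProcessor = policies::body_complete_processor<HandlerBase> >
struct server {
  HeaderCompleteProcessor header_policy;  // policies are held as members,
  BodyCompleteProcessor body_policy;      // not inherited from

  // Returns false when the header policy rejects the request early.
  bool process(request const & r) {
    if (!header_policy.on_header_complete(r)) return false;
    body_policy.on_body_complete(r);
    return true;
  }
};

// A user handler that only counts how often it was invoked.
struct counting_handler {
  int calls;
  counting_handler() : calls(0) {}
  void handle(request const &) { ++calls; }
};

// A user-supplied header policy: refuse requests that carry no headers.
struct reject_empty_headers {
  bool on_header_complete(request const & r) { return !r.headers.empty(); }
};

// Small usage example; call it from main().
inline void example() {
  request ok = {"Host: example.org", "hello"};
  request bad = {"", "hello"};

  server<counting_handler> lenient;
  bool accepted = lenient.process(bad);   // default policy accepts anything
  assert(accepted);
  assert(lenient.body_policy.handler.calls == 1);

  server<counting_handler, reject_empty_headers> strict;
  bool rejected = !strict.process(bad);   // rejected before the body handler
  assert(rejected);
  assert(strict.body_policy.handler.calls == 0);
  (void)ok; (void)accepted; (void)rejected;
}

} // namespace sketch
```

The point of the default template arguments is that
server<counting_handler> stays a one-liner for the simple case, while
an early-reject policy can be swapped in without touching the handler.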

>  2) Body data is received.
>
>     a) Body still incomplete? Then the handler has the chance to
>        do something with the data we already have. If possible,
>        the handler should take ownership of the memory because
>        bodies may be very large ("upload") and the HTTP doesn't
>        want to buffer all that in its entirety.
>
>     b) Body complete? Invoke the handler so that it has the
>        chance to put a response into the output buffer.
>

This behavior can be enabled to the extent that the policies above
allow. Note that a different handler can be provided for when the
headers are complete, another for when the body is complete -- and yet
another for processing the body while the data is still streaming in.

These different handlers could be differentiated by providing wrapper
types (for metaprogramming), as suggested above:

policies::header_complete_processor<my_header_processor>
policies::body_complete_processor<my_body_processor>
policies::streaming_processor<my_stream_validator>

These policy classes can either be inherited from (which isn't the way
I'd personally like it used) or encapsulated and used as instance
variables in the http::server type. This now all depends on how the
forwarding type behaves -- it could also be noted that this is
currently still entirely up to us. ;-)

>  3) Output buffer is empty.
>
>     a) Does the handler have more data that needs to be written?
>        Then go back to (3).
>
>     b) The transaction is complete.
>
> Does that make any sense?
>

#3 I don't quite understand (yet). Maybe a couple more hours sleep and
a good re-read would do me good. :)
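
For what it's worth, one possible reading of point 3 is a write loop
driven by the output buffer becoming empty. The sketch below is only
that reading, under the assumption that the handler can hand out its
response in pieces; chunked_response and run_transaction are
hypothetical names, and a std::string stands in for the socket.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical handler that produces its response piece by piece.
struct chunked_response {
  std::vector<std::string> chunks;
  std::size_t next;

  explicit chunked_response(std::vector<std::string> const & c)
    : chunks(c), next(0) {}

  bool has_more() const { return next < chunks.size(); }

  std::string next_chunk() {
    assert(has_more());  // precondition: only call while data remains
    return chunks[next++];
  }
};

// Every time the output buffer is empty, ask the handler for more data
// (3a); when there is none left, the transaction is complete (3b).
// `wire` collects what a real server would have written to the socket.
inline std::string run_transaction(chunked_response & handler) {
  std::string wire;
  std::string output_buffer;
  while (handler.has_more()) {            // 3a: handler has more to write
    output_buffer = handler.next_chunk();
    wire += output_buffer;                // "send" the buffer ...
    output_buffer.clear();                // ... which leaves it empty again
  }
  return wire;                            // 3b: transaction complete
}

} // namespace sketch
```

In a non-blocking server the loop body would instead run from the
"write completed" callback, but the control flow is the same.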

But I think the callback mechanism is missing a
handle/reference/smart_ptr to the actual socket the message came
from. So I guess the handler's signature should look something
like:

struct my_handler {
  void operator() (boost::network::message const & message,
    http::socket_ptr_type socket) {
    // do whatever needs to be done
  }
};
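
A self-contained sketch of how such a handler could be driven, just to
show the shape of the call. The message and socket types here are
hypothetical stand-ins (the real ones would come from the library and
Boost.Asio), socket_ptr_type is assumed to be a shared pointer, and
dispatch() is an invented name for whatever the server does once a
message has been parsed.

```cpp
#include <cassert>
#include <memory>
#include <string>

namespace sketch {

// Stand-in for boost::network::message.
struct message {
  std::string body;
};

// Stand-in for a connected socket; it just records what was written.
struct socket {
  std::string written;
  void write(std::string const & data) { written += data; }
};

typedef std::shared_ptr<socket> socket_ptr_type;

// Handler with the proposed signature: the message plus the socket
// it arrived on, so the handler can answer on the same connection.
struct echo_handler {
  void operator()(message const & msg, socket_ptr_type sock) const {
    assert(sock != nullptr);  // assumed: the server passes a live socket
    sock->write("echo: " + msg.body);
  }
};

// What a server might do once a complete message has been parsed.
template <class Handler>
void dispatch(Handler handler, message const & msg, socket_ptr_type sock) {
  handler(msg, sock);
}

} // namespace sketch
```

Passing the socket as a shared pointer means the handler can keep the
connection alive past the call if it needs to reply later.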

> What is the situation regarding protocols other than HTTP? SMTP,
> for instance, has no real use for the boost::network::message
> class because the payload is opaque for the server. There even is
> an ESMTP extension CHUNKING (a.k.a "Binary MIME") that treats the
> payload as an 8-bit octet stream without further structure. An
> e-mail can be represented as a boost::network::message, but an
> SMTP server doesn't really need it.
>

As for SMTP servers, I am still a bit hazy on the implementation. But
what I think is important (as in the case of HTTP) is to look at SMTP
processing from the client perspective.

The SMTP server can, as you suggest (and I think it should),
create a boost::network::message which is a representation of the SMTP
session -- and have the body contain a serialized
boost::network::message that encapsulates the email message and/or
the attachments or whatnot.

In the same light, an SMTP client implementation would just require
one boost::network::message which represents the SMTP session, where
the body is a serialized boost::network::message. If we intend to send
many messages over a single link, we could enable that with the
following technique:

{ // open a scope for the smtp connection
  smtp::session my_persistent_session(
    smtp::server("mail.google.com", 575, boost::network::tls),
    smtp::credentials("username", "password"));
  boost::network::message m;
  m << header("To", "mik...@gm...")
    << header("From", "so...@th...mewhere.nw")
    << destination("mik...@gm...")
    << source("so...@th...mewhere.nw")
    << header("Subject", "This is a test message.")
    << body("The quick brown fox jumps over the lazy dog")
  ;
  for (unsigned int i = 0; i < 100; ++i) // send 100 times
    smtp::send_message(my_persistent_session, m);
} // close the scope for the smtp connection
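
The two ideas in that snippet (directives streamed into a message, and
a session whose lifetime is tied to a scope) can be sketched without
any networking. Everything below is hypothetical and lives in
namespace "sketch": the session takes a log instead of a server and
credentials, and "CONNECT", "DATA" and "QUIT" are just strings pushed
into that log, not real SMTP traffic.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

struct message {
  std::multimap<std::string, std::string> headers;
  std::string body;
};

// Directives: small values that know how to modify a message.
struct header_directive { std::string name, value; };
struct body_directive { std::string text; };

inline header_directive header(std::string const & n, std::string const & v) {
  header_directive d = {n, v};
  return d;
}
inline body_directive body(std::string const & t) {
  body_directive d = {t};
  return d;
}

inline message & operator<<(message & m, header_directive const & h) {
  assert(!h.name.empty());  // a header needs a name
  m.headers.insert(std::make_pair(h.name, h.value));
  return m;
}
inline message & operator<<(message & m, body_directive const & b) {
  m.body = b.text;
  return m;
}

// RAII session: "connects" on construction, "disconnects" on destruction.
class session {
public:
  explicit session(std::vector<std::string> & log) : log_(log) {
    log_.push_back("CONNECT");
  }
  ~session() { log_.push_back("QUIT"); }
  void send(message const & m) { log_.push_back("DATA " + m.body); }
private:
  session(session const &);              // non-copyable
  session & operator=(session const &);
  std::vector<std::string> & log_;
};

inline void send_message(session & s, message const & m) { s.send(m); }

} // namespace sketch
```

Sending the same message three times inside one scope would then log
one "CONNECT", three "DATA ..." entries and one "QUIT", which is the
persistent-connection behaviour the scope is meant to express.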

> Applications that do need 'message', however, would be mail
> readers, news readers, and all kinds of mail-to-something-else
> converters. That is a fairly large market, given what kind of
> devices can deal with e-mail these days.
>

I agree completely, and the library should be able to work on these devices. :)

>  >   http::server<my_handler> handler_instance;
>  >   boost::thread http_thread(handler_instance);
>  >   http_thread.join();
>
> How would the http::server<T> class mentioned here be used in a
> single-threaded environment? Not every device that knows how to
> speak IP also knows how to do multi-threading. In my humble
> opinion, multi-threading should be an option, but no requirement.
>

I agree. Threading can be "yet another policy", and I tend to err on
the side of caution more than anything else.

I'd like to explicitly say that the library is multi-threaded -- so
that we support the larger and (I think) more dominant target, which is
desktop/infrastructure developers -- and then provide mechanisms for
single-threaded operation.
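
As a rough illustration of threading as "yet another policy", here is
a sketch with multi-threading as the default and a single-threaded
alternative. The names (single_threaded, multi_threaded, serve_once)
are made up for the example, it uses std::thread rather than
boost::thread so that it stands alone, and the worker is joined
immediately only to keep the example deterministic.

```cpp
#include <cassert>
#include <functional>
#include <thread>

namespace sketch {

// Runs the task on the caller's thread; no threading support needed.
struct single_threaded {
  void run(std::function<void()> task) {
    assert(task != nullptr);  // precondition: a callable task
    task();
  }
};

// Runs the task on a separate worker thread.
struct multi_threaded {
  void run(std::function<void()> task) {
    assert(task != nullptr);  // precondition: a callable task
    std::thread worker(task);
    worker.join();  // joined right away to keep the sketch deterministic
  }
};

// Multi-threaded by default; single-threaded is opt-in via the policy.
template <class ThreadingPolicy = multi_threaded>
struct server {
  ThreadingPolicy threading;
  int handled;

  server() : handled(0) {}

  // Stand-in for "accept one request and handle it".
  void serve_once() {
    threading.run([this] { ++handled; });
  }
};

} // namespace sketch
```

A device without thread support would then instantiate
server<single_threaded> and never pull in the threaded code path.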

> How do others feel about this subject?

I'm no expert on this, and would defer this to the community at large
-- those evaluating and using the library. If a significant part of
the user base is looking for single-threaded modes for the
servers/clients, I would be inclined to consider providing
single-threaded implementations.

However, I would jump the gun and say that it shouldn't be impossible
to provide single-threaded implementations of the servers/client. :-)

>  >
>  > Chris Kohlhoff has proposed a way of using future<> and promise<>
>  > objects to implement asynchronous DNS lookups using a Singleton Active
>  > Object DNS resolver.
>
> Exactly, a DNS server invokes the proper handler function every
> time a response is received, and because of the stateless nature
> of the DNS protocol, there is no need for having more than one
> resolver per process. What I don't understand is how these
> insights get anyone any closer to implementing the DNS protocol.
>

Remember though that Chris already has a resolver in Boost.Asio... So
I don't think we need to implement the DNS protocol -- unless you're
suggesting that we actually have a dns::server<> template, which I
think really is beyond the scope of the library ;-).

> Neither Adns nor C-Ares care what I/O main loop drives them; both
> libraries are concerned with interpreting DNS datagrams. My
> personal opinion is that it would be an incredible waste of time
> to re-invent that wheel. I noticed, though, that Chris has a
> knack for re-inventing perfectly good wheels every now and then.
> I guess it's actually quite likely that he'll end up
> re-implementing DNS. I'll let myself be surprised.
>

I think you'll like to look at Boost.Asio's name resolver... :D

>
>  >   try {
>  >     timeout_read(socket_ptr, buffer, boost::posix_time::seconds(30));
>  >   } catch (boost::network::read::timeout & e) {
>  >     // the read has timed out.
>  >   };
>
> I don't think that function can exist in a non-blocking I/O
> model. Unless, of course, setjmp() / longjmp() counts as a
> solution. ;-)
>

It's meant to be a "blocking read" (referring to timeout_read(...) ).
For non-blocking I/O, I don't think we need to deviate from how
Boost.Asio does the asynchronous read stuff. :D
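
For the blocking case, one way such a function could be built is on top
of POSIX poll(). This is only a sketch under that assumption: it works
on a plain file descriptor instead of a socket_ptr, takes the timeout
in milliseconds instead of a boost::posix_time duration, and
read_timeout is a hypothetical exception type standing in for
boost::network::read::timeout.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <poll.h>
#include <unistd.h>

namespace sketch {

struct read_timeout : std::runtime_error {
  read_timeout() : std::runtime_error("read timed out") {}
};

// Blocks until fd is readable or timeout_ms has elapsed, then reads
// up to max_bytes. Throws read_timeout if nothing arrived in time.
inline std::string timeout_read(int fd, std::size_t max_bytes, int timeout_ms) {
  assert(max_bytes > 0);

  pollfd p;
  p.fd = fd;
  p.events = POLLIN;
  p.revents = 0;

  int ready = ::poll(&p, 1, timeout_ms);
  if (ready == 0) throw read_timeout();
  // Note: an interrupted poll (EINTR) is also reported as a failure here.
  if (ready < 0) throw std::runtime_error("poll failed");

  std::string buffer(max_bytes, '\0');
  ssize_t n = ::read(fd, &buffer[0], buffer.size());
  if (n < 0) throw std::runtime_error("read failed");
  buffer.resize(static_cast<std::size_t>(n));
  return buffer;
}

} // namespace sketch
```

The caller's try/catch looks the same as in the quoted snippet; only
the calling thread blocks, so no setjmp()/longjmp() is involved.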

>
>  > msg["SOME_HEADER"] = "SOME_CONTENT" might work if the internal
>  > representation of a message is a std::map<...>, I'm not
>  > entirely sure this makes sense for a multimap (like what we're
>  > currently using).
>
> It's difficult to say whether a map or a multi-map is the best
> choice. The problem is that some protocols require a multi-map
> and some protocols don't. An e-mail may have any number of
> "Received:" lines. In HTTP, however, headers are generally
> unique.
>

Actually, the reason I chose a multimap is precisely the reason you
cite: because some protocols need non-unique headers, while some don't
-- and since a multimap can work for either case (where the
transformations/validations will just deal with the details), I chose
to use a multimap for all cases. :-) One of my partners at
http://software.orangeandbronze.com/ and I were talking about this in
another project, where I chose to use a map instead of a multi-map.
Needless to say, his case for using a multimap makes more sense, because
it can do pretty much anything a map can and then some. :)
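
To illustrate the "anything a map can and then some" point, here is a
small sketch on top of std::multimap. The helper names values_of and
set_unique are made up for the example; they are not part of the
message class.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

typedef std::multimap<std::string, std::string> headers_type;

// All values stored under a name (e.g. every "Received" line of an
// e-mail), in the order they were inserted.
inline std::vector<std::string> values_of(headers_type const & h,
                                          std::string const & name) {
  std::vector<std::string> out;
  std::pair<headers_type::const_iterator,
            headers_type::const_iterator> range = h.equal_range(name);
  for (headers_type::const_iterator it = range.first;
       it != range.second; ++it) {
    out.push_back(it->second);
  }
  return out;
}

// Map-like behaviour for protocols with unique headers: replace
// whatever was stored under the name with exactly one value.
inline void set_unique(headers_type & h, std::string const & name,
                       std::string const & value) {
  h.erase(name);
  h.insert(std::make_pair(name, value));
  assert(h.count(name) == 1);  // postcondition: exactly one entry
}

} // namespace sketch
```

So an e-mail can keep two "Received" entries side by side, while an
HTTP layer would go through set_unique and get map semantics; the
msg["SOME_HEADER"] = "SOME_CONTENT" syntax could be a thin wrapper
over the latter.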

> Anyway, this is a stimulating conversation. It's fun to think and
> talk about these topics.
>

Definitely. :)

> Best regards,
> Peter
>

Thanks for the insights! It definitely helps me make things clearer in
my head, and write down the thoughts (somewhere) albeit in an
"informal" manner. :)

-- 
Dean Michael C. Berris
http://cplusplus-soup.blogspot.com/
mikhailberis AT gmail DOT com
+63 928 7291459