Thread: [cpp-netlib-devel] Insights to the Message Class and Network Routines (was Re: Some code already up

Hi Peter!

On 21 May 2007 20:02:02 +0200, Peter Simons <si...@cr...> wrote:
>
> No, I think rfc2822 is a perfect name for that hierarchy. My
> problem is with the "network" part, because the code is about
> text parsing, not about networks. The location I'd prefer
> would be more along the lines of "boost/spirit/rfc2822",
> "boost/parsers/rfc2822", or even "/boost/ietf/rfc/2822/".
> Whatever. The point is that putting this code into a network
> hierarchy doesn't feel right, which is why I suggested a
> different organization that offers more granularity.
>

Ah, now it makes more sense. :) I agree: this would be a good
collection, and I think it merits inclusion in Boost, either as
part of Spirit or as a package of its own. :)

> I guess there would be code inside of "boost/network" that
> would depend on those parsers, but the parsers don't depend on
> "boost/network". So I feel they shouldn't necessarily be a
> part of it.
>

I agree completely. :)

>
>  > [scope of boost.network]
>
> Alright, thank you for sharing your ideas. From what I can
> tell, the things we ought to develop in this project are
> general purpose HTTP and SMTP drivers that can be re-used in
> clients and servers. I have implemented a (very rough) HTTP
> server here:
>
>   http://git.cryp.to/?p=mini-httpd
>

Yes, you're right. :)

I'll go take a look in a while, but basically the idea I'm playing
around with in my head is something to the effect of having a class
like:

namespace http {
template <
  class Base,
  class ProcessingPolicy=policies::async_processing,
  class ThreadingPolicy=policies::default_threading
>
struct server : Base, ProcessingPolicy, ThreadingPolicy {
  // all boilerplate and basic HTTP server-specific code
  private:

  // ... requires a handle() function from the Base class
  // which processes a boost::network::message
  void _call_handle(boost::network::message const & message_) {
    // scoped_lock is a dependent type, hence the typename
    typename ThreadingPolicy::scoped_lock _(*this);
    Base::handle(message_);
  }

};

} // namespace http

Which would be used like so:

struct my_handler {
  // handle() must be accessible to http::server<my_handler>
  void handle(boost::network::message const & msg) {
    // manipulate the message however I want to
  }
};

{
  http::server<my_handler> handler_instance;
  boost::thread http_thread(handler_instance);
  http_thread.join();
}

So the goal would be to allow people to make "Custom HTTP Services"
without having to worry too much about the details of an HTTP Server
implementation (which of course under the hood requires Boost.Asio).
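
To make the policy idea above concrete, here is a minimal, self-contained
sketch. Everything in it is an assumption made for illustration: `message`
stands in for boost::network::message, `dispatch` and `no_threading` are
made-up names, the ProcessingPolicy parameter is left out, and no actual
networking takes place.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for boost::network::message.
struct message {
  std::string body;
};

namespace policies {

// Serializes handler calls with a mutex.
struct default_threading {
  struct scoped_lock {
    explicit scoped_lock(default_threading & policy)
      : guard_(policy.mutex_) {}
   private:
    std::lock_guard<std::mutex> guard_;
  };
 private:
  std::mutex mutex_;
};

// For single-threaded use: the lock does nothing.
struct no_threading {
  struct scoped_lock {
    explicit scoped_lock(no_threading &) {}
  };
};

} // namespace policies

namespace http {

template <class Base, class ThreadingPolicy = policies::default_threading>
struct server : Base, ThreadingPolicy {
  // Hypothetical entry point; a real server would call this after
  // reading and parsing a request.
  void dispatch(message const & request) {
    typename ThreadingPolicy::scoped_lock lock(*this);
    Base::handle(request);
  }
};

} // namespace http

// User code only supplies handle(); it must be accessible to server<>.
struct recording_handler {
  std::vector<std::string> seen;
  void handle(message const & request) { seen.push_back(request.body); }
};

void usage_example() {
  http::server<recording_handler> locked;
  locked.dispatch(message{"first"});

  http::server<recording_handler, policies::no_threading> unlocked;
  unlocked.dispatch(message{"second"});

  assert(locked.seen.size() == 1 && locked.seen[0] == "first");
  assert(unlocked.seen.size() == 1 && unlocked.seen[0] == "second");
}
```

Swapping the threading policy changes the locking behaviour without
touching the handler, which is the main attraction of this layout.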

> I would very much appreciate feedback concerning that daemon,
> by the way. The latest "snapshot" comes with a
> Autoconf/Automake build system that should -- in theory --
> work on most Unix machines. My understanding is, however, that
> there is at least one more project (libhttpd?) that tries to
> do the same thing.
>

I'll take a look when I get the time. The day job is taking over this
week, though, so I doubt I can get to it before the week is out.

Maybe others can give Peter a hand at evaluating/profiling/testing the
daemon? :-)

> Concerning SMTP, I have no working implementation, but I do
> have a strong interest in the subject:
>
>   http://postmaster.cryp.to/
>
> That daemon is written in Haskell. That was great fun, but I'd
> like to turn that design into working C++ code, eventually.
>

Nice! The sample above is pretty much what I'd use to do something
like this in C++ (using the Parametric Base Class Pattern, or PBCP, as
Joel de Guzman calls it in the documentation of Phoenix actors).

>
>  > Components we anticipate are (aside from the message class):
>  >
>  >   - asynchronous name resolver
>
> Yes, an asynchronous DNS resolver is badly missed.
> Implementing the DNS protocol from scratch is, however, a
> significant effort. It might be wise to base our efforts on
> one of the existing libraries, e.g.:
>
>   http://www.chiark.greenend.org.uk/~ian/adns/
>   http://daniel.haxx.se/projects/c-ares/
>

Chris Kohlhoff has proposed a way of using future<> and promise<>
objects to implement asynchronous DNS lookups using a Singleton Active
Object DNS resolver -- of course, using Boost.Asio still under the
hood. So I intend to use that as the (simple) implementation of the
asynchronous DNS resolver.
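
As a rough illustration of the future/promise approach, the sketch below
runs lookups on a single worker thread and hands back a future for each
request. It is not Chris Kohlhoff's proposal and does not use Boost.Asio:
it relies on the standard library only, `active_resolver` is a made-up
name, the singleton accessor is omitted, and `lookup` is a stub with one
hard-coded entry instead of a real DNS query.

```cpp
#include <cassert>
#include <condition_variable>
#include <exception>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <stdexcept>
#include <string>
#include <thread>
#include <utility>

class active_resolver {
 public:
  active_resolver() : done_(false), worker_([this] { run(); }) {}

  ~active_resolver() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      done_ = true;
    }
    ready_.notify_one();
    worker_.join();  // pending jobs are finished before the thread exits
  }

  // Queues a lookup and returns immediately.
  std::future<std::string> resolve(std::string const & host) {
    auto result = std::make_shared<std::promise<std::string>>();
    std::future<std::string> pending = result->get_future();
    {
      std::lock_guard<std::mutex> lock(mutex_);
      jobs_.push([result, host] {
        try {
          result->set_value(lookup(host));
        } catch (...) {
          result->set_exception(std::current_exception());
        }
      });
    }
    ready_.notify_one();
    return pending;
  }

 private:
  // Stub standing in for a real DNS query (assumption).
  static std::string lookup(std::string const & host) {
    if (host == "localhost") return "127.0.0.1";
    throw std::runtime_error("unknown host: " + host);
  }

  void run() {
    for (;;) {
      std::function<void()> job;
      {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return done_ || !jobs_.empty(); });
        if (jobs_.empty()) return;  // shutting down, nothing left to do
        job = std::move(jobs_.front());
        jobs_.pop();
      }
      job();  // run outside the lock
    }
  }

  std::mutex mutex_;
  std::condition_variable ready_;
  std::queue<std::function<void()>> jobs_;
  bool done_;
  std::thread worker_;  // declared last so it starts after the rest
};

void usage_example() {
  active_resolver resolver;
  std::future<std::string> address = resolver.resolve("localhost");
  // ... other work could happen here ...
  assert(address.get() == "127.0.0.1");
}
```

A failed lookup surfaces as an exception from `get()`, so callers handle
errors at the point where they actually need the address.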

>
>  >   - "synchronous timeout reader/writer"
>
> I'm not quite sure what your description refers to. I've found
> that convenience suggests another layer on top of Asio in order
> to further separate application code from i/o code. My
> best bet in that direction so far is visible here:
>
>   http://git.cryp.to/?p=mini-httpd;a=blob;f=io.hpp;hb=master
>   http://git.cryp.to/?p=mini-httpd;a=blob;f=io-driver.cpp;hb=master
>

The idea is to be able to do a:

  try {
    timeout_read(socket_ptr, buffer, boost::posix_time::seconds(30));
  } catch (boost::network::read::timeout const & e) {
    // the read has timed out.
  }

This is sorely missed in Boost.Asio. I've implemented something
similar based on Chris Kohlhoff's advice as well, which I intend to
make part of the library. :)
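
The calling convention can be sketched without any sockets. The example
below is an assumption-laden stand-in, not the implementation mentioned
above: it waits on a std::future instead of a socket read, and
`read_timeout` and `wait_or_timeout` are hypothetical names. With
Boost.Asio the same effect is usually obtained by pairing an asynchronous
read with a timer and cancelling whichever loses.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <stdexcept>
#include <string>

// Thrown when the result does not arrive within the limit.
struct read_timeout : std::runtime_error {
  read_timeout() : std::runtime_error("read timed out") {}
};

// Blocks for at most `limit`, then either returns the value or throws.
template <class T, class Rep, class Period>
T wait_or_timeout(std::future<T> & pending,
                  std::chrono::duration<Rep, Period> const & limit) {
  if (pending.wait_for(limit) != std::future_status::ready)
    throw read_timeout();
  return pending.get();
}

void usage_example() {
  std::promise<std::string> never_fulfilled;
  std::future<std::string> pending = never_fulfilled.get_future();

  bool timed_out = false;
  try {
    wait_or_timeout(pending, std::chrono::milliseconds(10));
  } catch (read_timeout const &) {
    timed_out = true;  // the "read" has timed out
  }
  assert(timed_out);
}
```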

>
> Concerning the message class, I have to admit to being
> reserved about the use of operator<<() to construct headers.
> In my humble opinion, the statement
>
>     msg << header("SOME_HEADER", "SOME_CONTENT") ;
>
> should rather read:
>
>     msg["SOME_HEADER"]  = "SOME_CONTENT";
>     msg["SOME_HEADER"] += "SOME_MORE_CONTENT";
>

It's pretty trivial to make your suggestion above happen, and the only
reason I use operator<<() is that it allows me to "pass a directive to
a message". In the case of assignment of values as you have described
above, this does make sense and I agree we should support that usage
as well. However, the particular use case below makes it possible to
do pretty funky things with message construction:

  msg << header("SOME_HEADER", "SOME_CONTENT")
    << transform(match_header("SOME_HEADER"),
            boost::bind(&boost::to_lower<std::string>, _1,
                        std::locale())) // using bind
    << mutate(body(), arg1 = "Hello, World!") // using phoenix as well
    ;

So yes, though msg["SOME_HEADER"] = "SOME_CONTENT"; might work if the
internal representation of a message is a std::map<...>, I'm not
entirely sure this makes sense for a multimap (like what we're
currently using).
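
For readers who have not seen the directive style, here is a minimal
sketch of how operator<<() can apply directives to a message. It is not
the library's code: the `message` struct, the `lowercase` directive, and
the struct names are assumptions for illustration, and it uses plain
function objects where the real thing would use bind or Phoenix.

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <string>
#include <utility>

// Hypothetical message: a multimap of headers plus a body.
struct message {
  std::multimap<std::string, std::string> headers;
  std::string body;
};

// A directive is any object callable with a message&.
template <class Directive>
message & operator<<(message & m, Directive const & directive) {
  directive(m);
  return m;  // returning the message allows chaining
}

struct header_directive {
  std::string name, value;
  void operator()(message & m) const {
    m.headers.insert(std::make_pair(name, value));
  }
};

struct body_directive {
  std::string text;
  void operator()(message & m) const { m.body = text; }
};

// Lowercases the value of every header with the given name.
struct lowercase_directive {
  std::string name;
  void operator()(message & m) const {
    auto range = m.headers.equal_range(name);
    for (auto it = range.first; it != range.second; ++it)
      for (char & c : it->second)
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
  }
};

header_directive header(std::string name, std::string value) {
  return header_directive{std::move(name), std::move(value)};
}
body_directive body(std::string text) {
  return body_directive{std::move(text)};
}
lowercase_directive lowercase(std::string name) {
  return lowercase_directive{std::move(name)};
}

void usage_example() {
  message msg;
  msg << header("SOME_HEADER", "SOME_CONTENT")
      << lowercase("SOME_HEADER")
      << body("Hello, World!");

  assert(msg.headers.find("SOME_HEADER")->second == "some_content");
  assert(msg.body == "Hello, World!");
}
```

New operations can be added as new directive types without changing the
message class, which is what plain assignment through operator[] cannot
offer.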

The currently recommended method of accessing the headers of a message
is the one shown in the test case:

  headers_range<message>::type range = headers(msg)["SOME_HEADER"];

which is why I'm apprehensive about allowing msg["HEADER"] -- it
doesn't convey that you're manipulating/accessing the headers of the
message; it reads as if you were accessing attributes of the message
itself. Besides, msg["HEADER"] doesn't make sense for a multimap, as
described above. :)
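
The multimap point can be shown with the standard library alone. In the
sketch below, `header_map` and `header_values` are hypothetical names and
not part of the library. std::multimap has no operator[], because one
name can map to several values, so a lookup naturally yields a range.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

typedef std::multimap<std::string, std::string> header_map;

// Collects every value stored under `name`, in insertion order.
std::vector<std::string> header_values(header_map const & headers,
                                       std::string const & name) {
  std::vector<std::string> values;
  auto range = headers.equal_range(name);
  for (auto it = range.first; it != range.second; ++it)
    values.push_back(it->second);
  return values;
}

void usage_example() {
  header_map headers;
  headers.insert(std::make_pair("Received", "from a.example"));
  headers.insert(std::make_pair("Received", "from b.example"));
  headers.insert(std::make_pair("Subject", "hello"));

  // A single-valued msg["Received"] could not represent both entries.
  std::vector<std::string> received = header_values(headers, "Received");
  assert(received.size() == 2);
  assert(received[0] == "from a.example");
  assert(received[1] == "from b.example");
}
```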

> Use of operator<<() for constructing bodies is fine though. One
> thing I'm not quite sure about yet is whether that message API
> holds up to incremental construction and consumption, e.g. in
> a non-blocking I/O model.
>

Right... But this is all keeping in mind that we'll be using the
construction routines in a spirit action:

  body_parser_p[
    var(msg) << body(construct<std::string>(arg1, arg2))
  ]; // using phoenix

I'm not sure yet how asynchronous construction will be affected. In
the traditional Boost.Asio asynchronous classes, you typically have
handlers which are scheduled/called asynchronously. That means if you
have two methods, "handle_1" and "handle_2", constructing a message
asynchronously based on network protocols would be trivial:

  void handle_1(std::string some_string, message & m) const {
    m << header("1", some_string);
  }

  void handle_2(std::string some_string, message & m) const {
    m << header("2", some_string);
  }

Now locking would have to be involved somewhere here, which can be
taken care of by running the handlers that manipulate the message
through a boost::asio::strand.

> I also feel that usage of the type
>
>   std::basic_string<char, std::char_traits<char>, std::allocator<char> >
>
> shouldn't be hard-coded in messages. I guess it's fine to
> commit to 'char' as a character type (although unnecessary),
> but people will definitely want to choose their own
> allocators. Specifying  different trait classes per message
> type might also be useful. RFC headers typically are
> case-insensitive, for example. The trait class attached below
> might be useful in that context.
>

Yup, the next stages of refactoring should address this. I guess these
should be template parameters of the basic_message class, with
reasonable defaults. That being said, the other types/directives will
have to support the type change to the message class. :D
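
One possible shape for such a parameterized message is sketched below.
This is neither the attached trait class nor the library's basic_message:
`ci_char_traits`, the parameter names, and the member layout are all
assumptions. It follows the well-known technique of deriving from
std::char_traits<char> and overriding the comparison functions so that
header names compare case-insensitively.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Case-insensitive traits: only the comparisons are replaced.
struct ci_char_traits : std::char_traits<char> {
  static char fold(char c) {
    return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
  }
  static bool eq(char a, char b) { return fold(a) == fold(b); }
  static bool lt(char a, char b) { return fold(a) < fold(b); }
  static int compare(char const * a, char const * b, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
      if (lt(a[i], b[i])) return -1;
      if (lt(b[i], a[i])) return 1;
    }
    return 0;
  }
  static char const * find(char const * s, std::size_t n, char c) {
    for (std::size_t i = 0; i < n; ++i)
      if (eq(s[i], c)) return s + i;
    return nullptr;
  }
};

// Traits and allocator are parameters with defaults. For brevity the
// multimap itself still uses its default allocator.
template <class NameTraits = ci_char_traits,
          class Allocator = std::allocator<char> >
struct basic_message {
  typedef std::basic_string<char, NameTraits, Allocator> name_type;
  typedef std::basic_string<char, std::char_traits<char>, Allocator>
    value_type;

  std::multimap<name_type, value_type> headers;
  value_type body;
};

typedef basic_message<> message;

void usage_example() {
  message msg;
  msg.headers.insert(std::make_pair(message::name_type("Content-Type"),
                                    message::value_type("text/plain")));

  // Lookups ignore the case of the header name, but not of the value.
  assert(msg.headers.count("content-type") == 1);
  assert(msg.headers.find("CONTENT-TYPE")->second == "text/plain");
}
```

As noted above, directives such as header() would then have to be
templates as well, or convert to the message's string types.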

> Best regards,
> Peter
>

Thank you very much for the insights! They definitely open my eyes
to issues I tend to overlook. :)

-- 
Dean Michael C. Berris
http://cplusplus-soup.blogspot.com/
mikhailberis AT gmail DOT com
+63 928 7291459