Re: [cpp-netlib-devel] HTTP Code Conversion (WAS RE: A few questions)


> -----Original Message-----
> From: cpp...@li... 
> [mailto:cpp...@li...] On 
> Behalf Of Michael Dickey
> Sent: Tuesday, March 25, 2008 12:46 AM
> To: C++ Networking Library Developers Mailing List
> Subject: Re: [cpp-netlib-devel] HTTP Code Conversion (WAS RE: A few questions)
> 
> On Mar 23, 2008, at 7:52 PM, Dean Michael C. Berris wrote:
> 
[snip]
> >
> > Good question. There's really no straight answer yet, but I was in the
> > middle of making an http::request object contain a basic_message inside,
> > and somehow make it convertible to a basic_message instead of deriving
> > it from a basic_message. The reasons are:
> >
[snip]
> 
> I would like to do this hierarchy conversion right away (in the
> branch).  There are many common aspects for HTTP requests and
> responses, i.e. they both have name/value headers, they both have
> content bodies, and the parsing semantics are almost identical
> (unfortunately "almost", because things like the response status code
> influence response parsing....).   A lot of the parser code is also
> simplified by these two concepts sharing a common base.
> 

I agree that the parser gets simpler when requests and responses share a
common base. I personally am not too worried about request parsing,
because cpp-netlib wasn't initially supposed to implement an HTTP
server. That is why the request object was designed separately, for
consumption by the HTTP client only. The original intention was for it
to be a 'packaged request': an object that merely contained information
about what the client should do.

In hindsight, it would have been more consistent with the HTTP spec and
messaging nature of the protocol to model the request as a
basic_message<> extension.

> This does raise the question though of whether or not an  
> http_message<> class would make sense?  i.e.
> 
> http::message<> : public basic_message<>;   (or http_message<>?)
> http::request<> : public http::message<>;
> http::response<> : public http::message<>;
> 
> Perhaps this makes more sense as we add more members to the http  
> message objects...?
> 

The hierarchy depicted above makes logical sense to me -- unless there's
opposition to the inheritance hierarchy above, I say go for it.
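
To make the hierarchy concrete, here is a rough sketch of how it could
look. This is not the code in the branch: `default_tag`, the data
members, and the `header_count` helper are all made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical tag type; the real library's tags may look different.
struct default_tag {};

// Common base: name/value headers plus a body, shared by every protocol.
template <typename Tag = default_tag>
struct basic_message {
    std::multimap<std::string, std::string> headers;
    std::string body;
};

namespace http {

// HTTP-specific state that requests and responses have in common.
template <typename Tag = default_tag>
struct message : basic_message<Tag> {
    unsigned version_major;
    unsigned version_minor;
    message() : version_major(1), version_minor(1) {}
};

template <typename Tag = default_tag>
struct request : message<Tag> {
    std::string method;
    std::string uri;
};

template <typename Tag = default_tag>
struct response : message<Tag> {
    unsigned status_code;
    response() : status_code(200) {}
};

// Code written against the common base accepts both derived types.
template <typename Tag>
std::size_t header_count(basic_message<Tag> const& m) {
    return m.headers.size();
}

} // namespace http

// Small usage example.
inline void hierarchy_example() {
    http::request<> req;
    req.headers.insert(
        std::make_pair(std::string("Host"), std::string("example.org")));
    assert(http::header_count(req) == 1);
}
```

Anything written in terms of basic_message<> (like the helper above)
then works unchanged for both requests and responses.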

On the issue of 'http_message', it would be perfectly fine for
http::message<> to contain structural information that's unique to that
type without it having its own base separate from the basic_message<>.
So I personally don't see the need for a different message type that's
not derived from basic_message<> because after all, the intention of the
basic_message<> is to provide a common base for all messages to be
defined from.

On the surface this intention does not make a lot of sense (yet), but
consider being able to convert from one message type to another using
conversion functions: for example, transforming an HTTP response
instance into an SMTP message, or into a serialized basic_message for
caching and later retrieval. Building on basic_message<> allows us to
extend functionality _from the base_ and have all types that derive from
it get that functionality for free -- even where a particular
combination isn't directly useful. For example, an IRC message may be
convertible into an XMPP message, but converting it into an HTTP message
doesn't make sense (in my limited imagination) even if it were possible.

Note that convertibility should be explicitly defined using the
transformation function(s) in the transformation layer, which all
protocols should be implementing much like how the basic_message<>
metafunctions are built.
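
A rough sketch of what "explicitly defined" convertibility could mean in
practice. The transformation layer isn't written yet, so the names here
(`http_tag`, `smtp_tag`, `converter`, `convert`) are assumptions, and
copying headers straight across is a simplification:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical protocol tags, for illustration only.
struct http_tag {};
struct smtp_tag {};

template <typename Tag>
struct basic_message {
    std::multimap<std::string, std::string> headers;
    std::string body;
};

// The primary template is declared but never defined, so a conversion
// nobody has explicitly provided fails to compile.
template <typename To, typename From>
struct converter;

// One explicitly defined conversion: HTTP message to SMTP message.
template <>
struct converter<smtp_tag, http_tag> {
    static basic_message<smtp_tag> apply(basic_message<http_tag> const& in) {
        basic_message<smtp_tag> out;
        out.headers = in.headers;  // simplification: copy headers as-is
        out.body = in.body;
        return out;
    }
};

// Entry point: 'To' is given explicitly, 'From' is deduced.
template <typename To, typename From>
basic_message<To> convert(basic_message<From> const& in) {
    return converter<To, From>::apply(in);
}

// Small usage example.
inline void convert_example() {
    basic_message<http_tag> in;
    in.body = "hello";
    basic_message<smtp_tag> out = convert<smtp_tag>(in);
    assert(out.body == "hello");
}
```

With this shape, something like convert<http_tag>() on an IRC message
simply would not compile until someone decides that conversion is
meaningful and writes the specialization.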

But I digress. ;)

> >> 2) I'm new to the art of making a "header only" library and am not
> >> familiar with the techniques and best practices required for this
> >> yet.  This became an issue for me in converting over the "types"
> >> code.  What is the best way to define static constant variables that
> >> will never change (strings, numbers, etc.)?  How do I avoid
> >> "duplicate symbol" errors in my compiler?
> >
> > The best way is to do something like this:
> >
> > template <typename tag=tags::default_tag>
> > struct constants {
> >  static typename tag::int_type const A_CONST = 1;
> > };
> >
> > And in places where you'd need A_CONST, you'd do this instead:
> >
> > 	constants<>::A_CONST
> >
> > I hope this helps. ;)
> 
> Interesting... so does this work around the issue with compilers that
> don't normally let you define constant values in headers?  i.e.
> 
> static const unsigned int A_CONST = 25;
> 
> would work in some compilers, but not in others.  Does this "type-trick"
> work around that problem?
> 
> 

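It should. Consider the 'const' after the type. The problem with static
data members is that by default they have external linkage, so defining
one in a header that several translation units include produces
duplicate symbols -- but the magic with static members of class
templates is that they can be defined in the header, be const static,
and still be accessible from the outside. But don't take my word for it,
you can try it out and see if it works. ;)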
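
A minimal sketch of the pattern, covering both an integral constant and
a string. `DEFAULT_PORT` and this `default_tag` are made-up names for
illustration:

```cpp
#include <cassert>
#include <string>

// Hypothetical tag type standing in for tags::default_tag.
struct default_tag {};

template <typename Tag = default_tag>
struct constants {
    // Integral constants may be initialized inside the class.
    static unsigned int const DEFAULT_PORT = 80;
    // Other types are only declared here and defined below.
    static std::string const CRLF;
};

// Definitions of static members of a class template may appear in a
// header included by many translation units without causing
// duplicate-symbol errors.
template <typename Tag>
unsigned int const constants<Tag>::DEFAULT_PORT;

template <typename Tag>
std::string const constants<Tag>::CRLF = "\r\n";

// Small usage example.
inline void constants_example() {
    assert(constants<>::DEFAULT_PORT == 80);
    assert(constants<>::CRLF.size() == 2);
}
```

The out-of-class definition is what makes the non-integral case work;
for a non-template class the same definition would have to live in
exactly one .cpp file.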

> 
> On a totally unrelated note, I noticed that you use "typename tag" in
> the templates in some places.. shouldn't this be "typename Tag" (based
> on the Boost template parameter naming guidelines)?
> 

Ah, yes. I think we should start converting template argument names to
conform to the Boost template parameter naming guidelines too -- but
that's like the lowest priority at the moment if you ask me. ;)

> >
> > 2. We can specialize on basic_message<http::message_tag> and define
> > different storage mechanisms for the headers. It can be as simple as a
> > case-insensitive std::multimap or as complex as an
> > unordered_case_insensitive_multi_map -- which is (fortunately) entirely
> > up to us. ;)
> 
> We chose choice #2 in pion-net for a couple reasons:
> 
> a) it's much simpler & easier than having to write specialized  
> accessor functions whenever you interface with the container (which  
> performs the string conversion)
> 

I agree. :)

> b) it's probably also much faster
> 

I don't think this is too much of a concern (yet), but definitely is a
good plus to have.

> c) it allows for constants to work properly (see the http::types
> string constants).  i.e.
> 
> namespace http { namespace types { namespace headers {
> 	static const std::string CONTENT_LENGTH("Content-Length");	// definition uses "recommended" case
> } } }
> 
> std::cout << types::headers::CONTENT_LENGTH << ": " <<
> request.header(types::headers::CONTENT_LENGTH) << types::CRLF;
> 
> As for using an "unordered" container, that is my personal preference
> and what we are using in pion-net.  The challenge with this is with
> compiler support; not everyone seems to support tr1 yet, but those
> which do not generally offer alternatives (e.g. __gnu_cxx::hash_map in
> gcc, stdext::hash_map in MSVC).  We have a separate "PionHashMap.hpp"
> header in pion that uses #define's to work around the compiler
> differences.
> 
> I left it as a "multimap" in my code for the sole purpose of starting
> out keeping everything as simple as possible.  I do like the idea of
> hashing for these headers, though, since this would (in theory) help
> increase the performance of lookups (of course, in practical use it
> might make no difference..)  If you prefer that route too, I can
> convert over the #define stuff we have in PionHashMap, maybe by adding
> an "unordered_map.hpp" details file?
> 

I like unordered too, but I usually have not found a compelling case for
something like that except in performance critical code. Though in this
case, it might be a good decision to make early in the game. ;)
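
For reference, a rough sketch of a case-insensitive hashed header
container. It assumes a compiler that ships std::unordered_multimap
(the standardized form of the tr1 container); on older compilers the
same functors would plug into whatever the #define layer selects. The
names `ascii_lower`, `ci_hash`, `ci_equal`, and `header_map` are made
up:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>

// Lower-case one character; std::tolower needs an unsigned char value.
inline char ascii_lower(char c) {
    return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
}

// Hash that ignores case, so "Host" and "HOST" land in the same bucket.
struct ci_hash {
    std::size_t operator()(std::string const& s) const {
        std::size_t h = 0;
        for (std::string::size_type i = 0; i < s.size(); ++i)
            h = h * 31 + static_cast<unsigned char>(ascii_lower(s[i]));
        return h;
    }
};

// Equality that ignores case; must agree with ci_hash.
struct ci_equal {
    bool operator()(std::string const& a, std::string const& b) const {
        if (a.size() != b.size()) return false;
        for (std::string::size_type i = 0; i < a.size(); ++i)
            if (ascii_lower(a[i]) != ascii_lower(b[i])) return false;
        return true;
    }
};

typedef std::unordered_multimap<std::string, std::string, ci_hash, ci_equal>
    header_map;

// Small usage example.
inline void header_map_example() {
    header_map h;
    h.insert(std::make_pair(std::string("Content-Length"), std::string("42")));
    assert(h.find("content-length") != h.end());
}
```

The stored key keeps whatever case it was inserted with, so constants
defined in the "recommended" case still print that way.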

I have a comment though on the 'types' namespace -- do you think it
might be unnecessary? I don't see a problem with having an
'http::headers' struct which contains all the necessary strings.
An example would be something like:

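namespace http {
  template <typename Tag=tags::default_tag>
    struct headers_impl {
      // Only integral constants can be initialized in-class, so the
      // string is declared here and defined below.
      static char const * const CONTENT_LENGTH;
      ...
    };

  // Static members of a class template can be defined in the header.
  template <typename Tag>
    char const * const headers_impl<Tag>::CONTENT_LENGTH = "Content-Length";

  typedef headers_impl<> headers;
}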

If it wouldn't be too much of a pain to use unordered containers inside
the basic_message<http::message_tag> and maintain a consistent interface
that the basic_message<> exposes (and some more unique to the
basic_message<http::message_tag>) then I'd say go ahead with that
approach instead.

> >> 4) I have a few other container types to add to the http message
> >> implementations, namely for query string and cookie parameters (see
> >> the "query_params" and "cookie_params" typedefs in types.hpp).  What
> >> would be the best approach to incorporate these into http::request?
> >>
> >
> > Ah, good question.
> >
> > If you notice, http::request is a template -- you can hijack the tag
> > parameter and require these types to be passed to the tag, or use a
> > traits (meta)function (read: type) to determine the correct type for
> > these extra containers in an http::request. Example would be something
> > like this (roughly from memory):
> >
> > namespace http {
> >  template <typename tag, ... >
> >  struct request {
> >    typename query_params<tag>::type query_params;
> >    typename cookie_params<tag>::type cookie_params;
> >    ...
> >  };
> > }
> 
> Interesting.. so we could leave it up to the user to define the
> container type for these (i.e. multimap or unordered_multimap)?
> 
> Sort of makes me wonder though if we'd be giving people too much rope
> (to hang themselves with)?  Versus making it a specific type, that's
> protected and only allows accessors?
> 

Not really the users -- more like people who want to extend the library.
Remember that we will be implementing 'query_params<>' and
'cookie_params<>' for the tags we define. If for some reason people
want to extend or modify the functionality of the request objects,
then that should be entirely possible in the least intrusive manner.
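
A rough sketch of how that extension point could look. The tags and the
`vector_params_tag` specialization are made up for illustration. The
data members are named `query` and `cookies` here rather than reusing
the metafunction names, because some compilers reject a member that has
the same name as the template its own declaration refers to.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

namespace http {

struct default_tag {};        // hypothetical library-provided tag
struct vector_params_tag {};  // hypothetical tag added by an extender

// Traits metafunction: maps a tag to the query-parameter container type.
template <typename Tag>
struct query_params {
    typedef std::multimap<std::string, std::string> type;
};

// An extender picks a different container by specializing for their tag.
template <>
struct query_params<vector_params_tag> {
    typedef std::vector<std::pair<std::string, std::string> > type;
};

// Same idea for cookies; only the default is provided here.
template <typename Tag>
struct cookie_params {
    typedef std::multimap<std::string, std::string> type;
};

template <typename Tag = default_tag>
struct request {
    typename query_params<Tag>::type query;
    typename cookie_params<Tag>::type cookies;
};

} // namespace http

// Small usage example.
inline void params_example() {
    http::request<> r;
    r.query.insert(std::make_pair(std::string("q"), std::string("cpp")));
    assert(r.query.count("q") == 1);
}
```

Ordinary users only ever see http::request<>; the choice of container is
visible only to someone who defines a new tag.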

> > That would be the simplest and most flexible way to do it. Another way
> > is to make that part of the fusion sequence encapsulated inside of an
> > http::request instead of being part of the actual request object. This
> > is how we currently package the data within an http::request, which
> > would make it consistent and easily iterated upon at a template
> > metaprogramming level -- for example, we might want to do compile-time
> > processing for some operations instead of doing it at runtime, like
> > enabling normalizations for elements in a header, requirements checking,
> > etc.
> >> 5) In general, what is the best way to add additional member
> >> variables (like status code, for example) to the http::request and
> >> http::response classes?  It doesn't look like they were intended to be
> >> extended, but maybe that is b/c the implementation is just missing...
> >>
> >
> > The best way would be to add it in the fusion map used inside the
> > http::request and http::response types. Since we're doing this 'header
> > only' style, we best use the header only packaging that fusion allows us
> > to have. :)
> >
> > I can go into more detail if you need more information. :)
> 
> Guess I need to read up on the fusion library =)
> 
> Would this just be for container types or does it apply to simple  
> types (like ints) as well?
> 

A fusion map works like this:

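#include <iostream>
#include <string>
#include <vector>
#include <boost/fusion/include/map.hpp>
#include <boost/fusion/include/at_key.hpp>

// Empty types used purely as compile-time keys.
struct tags {
  struct index1 { };
  struct index2 { };
  struct index3 { };
};

namespace fusion = boost::fusion;

typedef
  fusion::map<
    fusion::pair<tags::index1, int>,
    fusion::pair<tags::index2, std::string>,
    fusion::pair<tags::index3, std::vector<int> >
  >
  my_map;

int main() {
  my_map instance;
  fusion::at_key<tags::index1>(instance) = 1;
  fusion::at_key<tags::index2>(instance) = "Hello, World!";
  // 100 elements, each 0 (the count comes first, then the value)
  fusion::at_key<tags::index3>(instance) = std::vector<int>(100, 0);

  std::cout << fusion::at_key<tags::index2>(instance); // "Hello, World!"
}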

So what the fusion map allows you to do is to map values to types, so
that you can index the values using types instead of relying on some
form of polymorphism at runtime. Of course, you could do it with member
variables but that takes away the capability to perform compile-time
munging/processing on the encapsulated data in the fusion container (in
this case, a fusion map).

> > I hope this helps Mike!
> >
> 
> Absolutely, thanks!
> 

Glad to be of help. :)

--
Dean Michael Berris
Software Engineer, Friendster, Inc.
<dmb...@fr...>
+639287291459