Re: [cpp-netlib-devel] HTTP Synchronous server reply methods?

> Dean Michael Berris wrote on Tuesday, February 01, 2011 1:20 PM
> 
> Well, the problem with sending binary data as 7-bit clear over the
> network has been documented extensively over the Internet. The spec
> clearly states that you have to be transferring data safe to transfer
> over 7-bit transfers encoding -- meaning that's technically ASCII
> text. It is an accident that images are being sent in the clear as
> binary data, and if you notice in history this is largely why people
> (browser developers and server developers) had to agree that they
> would just take whatever was sent over the wire in the body and just
> have the MIME identifiers there.
> 

Either way you're sending binary in these examples... the only issue was whether that binary should be copied into a std::string.

> The design of the library has actually nothing to do with whether I
> think text should be the only way transmitted over HTTP. If you also
> notice the type of the string is parametric to the tag type used. It
> makes it *easy* to just use std::string. I could very well be
> implementing a chained-block-data-structure for the underlying message
> storage and manipulate those directly and expose ranges for the
> accessor/wrappers (like how ACE does it) but that's too much work to
> do at the moment -- patches to implement this would be most welcome.
> ;)
> 
> At any rate, the reason why it's technically better to send things via
> HTTP using Base64 encoding is really just so that you're OK as far as
> the spec goes. This avoids all the endianness issues you might
> encounter on the other end (although Boost.Asio should be dealing with
> that issue for us). It's also largely a matter of convenience -- it's
> perfectly *fine* to put binary data in an std::string or an
> std::vector.
> 
> > Having an overload that makes std::string usage natural (like it is
> > now) is a good thing.  Forcing someone to copy memory regions into a
> > std::string is a bad thing.
> >
> 
> The reason the copy is forced is for simplicity of the implementation.
> Again if you wanted to use a no-copy or single-copy interface, use the
> asynchronous server implementation *today*. ;)
> 
> >>
> >> Yet another way is to put the burden of managing the memory of the
> >> data to be written out by the server, by providing an optional
> >> callback function when providing the content. So it would then be a
> >> variant between a string and a tuple<void*, size_t,
> >> function<void(void*)> >. All of these options makes the synchronous
> >> handler's implementation needlessly complex.
> >
> > My original point was only about the inelegance of using std::string.
> > The point you're raising here is a different one, and all you're
> > saying is that the lifetime of the payload needs to be roughly the same
> > as the lifetime of the result object.  That's not a terribly
> > complicated problem, and there are lots of solutions to it.  It seems
> > that one that's easy for the user is to just hand off ownership to the
> > response, something like
> >
> 
> See, the fact that I'm even using std::string is already something I
> detest (read the thread about the [string] proposal on the Boost ML ;) ) --

I've been reading the (extensive) [string] proposal on boost. :)

> but at the moment it is the most sane and simple thing to do lacking a
> proper efficient segmented data storage mechanism around (no,
> std::deque has its own issues, and ptr_list<array<T,N> > is too
> "esoteric"). Forcing people to deal with std::vector<char> is just
> unnecessary when std::string is much more familiar however ugly.

I'm not proposing vector<char>. The foundation of my point is that the library *should not* make the container choice for the user.  There is no choice that works well for everyone.

> 
> Of course nothing's stopping you or anyone from creating a different tag
> type that defines string<Tag>::type as std::vector<char> or anything
> for that matter. ;)
> 
> The idea is really, to make the hard thing simple to do -- imagine
> having to implement your own HTTP server, and you might think whether
> you're making a copy of the data one extra time is *not* that big of a
> deal. Of course I'd like to improve the library, it's just that, well,
> if only more people submitted pull requests and actually addressed it,
> we might have a better library to use sooner than later. :P
> 

> Erik wrote:
>>
>> auto_ptr<MyObject> obj(new MyObject);
>> response << body(obj);
>>

> Uh oh, this is dangerous because the user can use the obj right after
> the data is passed to the response.

Nope: the auto_ptr is reset to null when ownership transfers to the response, right?
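
To make the hand-off concrete, here is a minimal sketch. It uses std::unique_ptr, which plays the role of auto_ptr in current C++ (auto_ptr was removed in C++17). The sketch_response type and its set_body member are made up for illustration and are not cpp-netlib's API.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a response type; NOT cpp-netlib's API.
struct sketch_response {
    // The response owns the payload once it has been handed over.
    std::unique_ptr<std::vector<char>> owned_body;

    // Taking the smart pointer by value forces the caller to give up ownership.
    void set_body(std::unique_ptr<std::vector<char>> payload) {
        owned_body = std::move(payload);
    }
};

// Returns true if the caller's pointer is null after the hand-off
// and the response now holds the four-byte payload.
bool caller_pointer_is_cleared() {
    std::unique_ptr<std::vector<char>> obj(new std::vector<char>(4, 'x'));
    sketch_response response;
    response.set_body(std::move(obj));
    // A moved-from unique_ptr is guaranteed to compare equal to nullptr.
    return obj == nullptr && response.owned_body->size() == 4;
}
```

With unique_ptr the transfer has to be spelled out with std::move, so the caller can see at the call site that obj is no longer theirs to use.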

> Erik wrote:
>> should be sufficient for POD types, and for non-POD types, maybe
>> you'd need something like
>>
>> auto_ptr<vector<char> > obj(new vector<char>());
>> response << body(&(*obj)[0], obj->size(), obj);
>>
> 
> And this is just ugly. ;)

A little... but think of it from a user's standpoint: how would you send an arbitrary object in the response body?  Can you write anything cleaner or more intuitive than

response << body(void*, length) ?  The way you're describing makes you write

response.content.resize(length);
memcpy(&response.content[0], &object, length);

That looks less ugly?  More intuitive?

> Erik wrote:
>> The response can delete it whenever it's done with doing whatever it
>> does.
>>
> 
> That's bad design. :D
>

That's what happens in the current implementation... the only difference is that it is deleting a copy (that's in the body) instead of the original.  The issue here is that the lifetime of the response object is longer than the lifetime of the handler function.  The solution is going to be to have the response destroy the body memory, either in std::string form (like now) or in original-object form.  Either way the response is taking ownership of the memory that it is sending down the wire.
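
Here is a rough sketch of the "original-object form", along the lines of the tuple<void*, size_t, function<void(void*)> > idea quoted earlier. The owned_payload class, the point struct, and releases_after_scope are all invented for illustration; none of them exist in cpp-netlib.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical payload holder: pointer, length, and a release callback.
// Whoever holds this object (e.g. a response) decides when the memory goes away.
class owned_payload {
public:
    owned_payload(void* data, std::size_t size, std::function<void(void*)> release)
        : data_(data), size_(size), release_(std::move(release)) {}

    // The release callback runs exactly once, when the holder is destroyed.
    ~owned_payload() { if (release_) release_(data_); }

    // Non-copyable so the callback cannot run twice on the same memory.
    owned_payload(const owned_payload&) = delete;
    owned_payload& operator=(const owned_payload&) = delete;

    const void* data() const { return data_; }
    std::size_t size() const { return size_; }

private:
    void* data_;
    std::size_t size_;
    std::function<void(void*)> release_;
};

struct point { int x; int y; };  // an arbitrary POD to send

// Returns how many times the release callback ran once the payload was destroyed,
// or -1 if it ran too early.
int releases_after_scope() {
    int released = 0;
    point* p = new point{1, 2};
    {
        owned_payload body(p, sizeof(point), [&released](void* raw) {
            delete static_cast<point*>(raw);
            ++released;
        });
        // The handler could return here; the bytes stay valid while body lives.
        if (released != 0) return -1;
    }
    return released;
}
```

No copy into a std::string is needed, and the memory is released by the callback the user supplied rather than by a delete hard-coded in the library.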

> 
> Unfortunately, that's not much better than saying:
> 
>   std::ifstream f(...);
>   response.content.reserve(file_size);
>   f.read(response.content.data(), file_size);
> 
> Am I missing something?

I think this example illustrates my point about the trickiness of the interface.  There are two reasons this won't work.

1.  response.content.data() returns const char*, right?  You can't write into that.
2.  reserve() only changes capacity, not size, so after this runs response.content.size() will still be zero, right?  And nothing will be sent?

You probably wrote these bugs because you're up too late. :)  Still, they show the brittleness of using a string like this.
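
For what it's worth, here is a small sketch of both points, using a plain std::string and a std::istream rather than the real response type. The function names are made up, and writing through &content[0] assumes C++11 or later, where string storage is guaranteed contiguous.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Mirrors the quoted snippet: reserve() changes capacity only,
// so the string's size is still zero afterwards.
std::size_t size_after_reserve(std::size_t n) {
    std::string content;
    content.reserve(n);
    return content.size();
}

// One way to fill a string from a stream: size it first, write through
// &content[0], then trim to the number of bytes actually read.
std::string read_into_string(std::istream& in, std::size_t n) {
    std::string content(n, '\0');
    if (n == 0) return content;
    in.read(&content[0], static_cast<std::streamsize>(n));
    content.resize(static_cast<std::size_t>(in.gcount()));
    return content;
}
```

The second function handles a short read by shrinking the string to gcount(), so the body never carries trailing zero bytes.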

> 
> For data that's already in memory I get the utility of being able to
> refer to the bytes directly. Unfortunately making the library
> deallocate memory that the user allocated is, quite bluntly, bad form
> and error prone.
> 
> Makes sense?
> 

I'd say that passing ownership of an object to a library is neither bad form nor error prone.  Sutter has a good writeup of using auto_ptr effectively in 'Exceptional C++' that seems applicable.  Transferring object ownership with auto_ptr is safe, efficient, and idiomatic.

Anyway, you're putting the sweat equity into cpp-netlib, and are closer to the project.  My only original observation was 'The same question will come up again and again because it's a surprising interface'.  It clearly can work as it is, but it is surprising.  That's why he didn't figure it out by himself. 

Erik
