From: Dean M. B. <mik...@gm...> - 2011-03-22 14:57:34
On Tue, Mar 22, 2011 at 2:59 AM, Rick Richardson <ric...@gm...> wrote:
>
> On Mon, Mar 21, 2011 at 10:08 AM, Dean Michael Berris
> <mik...@gm...> wrote:
>>
>> Hi Rick!
>>
>> ...
>>
>> Okay, but how do you deal with the parsing? Do you reset the parser
>> state per connection?
>>
> Yes. That is my current approach. Oddly enough, I have it partially
> working. I have a test which pipelines two requests. Currently it gets
> the first request and responds correctly. The 2nd request is read and
> processed by handle_read_data but is not correctly parsed, and I'm not
> sure why yet. I just need to spend a bit more time with it.
>

There is a parser type that does the incremental parsing of incoming
data. Maybe all that needs to be done is to check whether the parser's
state is reset properly every time a GET request is parsed. There may be
other complications, but I was thinking of a way to make it happen by
leveraging object lifetimes, which may or may not work -- I still have
to explore that possibility.

That said, I'll wait to see your pull request(s) soon. :)

>>
>> Asynchronous server handlers can call connection.write(...) multiple
>> times per request. The calls to connection.read(...) in the case of a
>> handler that wants to read up data from the body of a POST/PUT request
>> will also have to deal with the boundary issues -- granted that the
>> HTTP/1.1 spec doesn't allow pipelining for POST/PUT requests, we'd
>> want to be able to let the handler recover or at least deal with
>> invalid data appropriately.
>
> Yea, this is a tricky point because we're basically relying on the top
> layer of the stack to complete parsing for us.
> It is good that the spec disallows pipelining POSTs and PUTs, but
> either way I would like to actually process the entire body in the
> Connection object, because there is *tons* of complicated parsing
> logic I would like to put in for POSTs:
> There is a new type of socket-exhaustion attack out which correctly
> (according to the HTTP spec) slowly dribbles in data in a multi-part
> POST. I would like to put in counter-measures to defend against such
> things as well as other DoS attacks
> (see http://www.ecl-labs.org/2011/03/17/roboo-http-mitigator.html).

Right. Implementing the POST/PUT body parser was something I wasn't
thinking of incorporating in the connection, opting instead to delegate
those to helper/utility types/functions. However, I see that having
those in the connection's implementation would do a lot of good for a
lot of people, so I definitely hope to see how you (and others) would
take this task on. :)

>>
>> Have you handled this? (Granted, I haven't looked at your patch just
>> yet.)
>
> I don't handle this well enough yet, apparently.

That's fine. :)

>>
>> That sounds alright, although we might have to guard for a race
>> somehow there when the handler is actually reading from the same
>> connection (in the case of a POST/PUT handler). Basically we have to
>> know whether the timeout happened while the handler is waiting for a
>> POST/PUT body or whether the timeout happened while at the boundaries
>> of the GET requests.
>
> If we process the multipart POST body in the Connection object, we can
> re-set the keepalive timer for every packet of data we receive (if the
> client waits more than TIMEOUT seconds between each packet, I would
> like to kick them off anyway).

Yes, I agree. Although that would mean making the Connection object's
implementation a lot hairier than it already is.
Just thinking about incremental parsing support for multi-part MIME on
the *server* side is making me want to block off a good week and just
hole up with a powerful machine to get that figured out. :D

>> > I haven't bothered to fork cpp-netlib on github so I don't have a
>> > checkin to make, but I did create a 'git diff' from boost/. I have
>> > attached it here for your perusal.
>> > Please let me know your preferred approach for reviewing and
>> > accepting patches (I am assuming a pull request, which I can get
>> > around to).
>>
>> You assume right that I would much rather prefer looking at diffs
>> over at Github, so a pull request is going to be most appreciated.
>
> I will do that. That patch is already outdated anyway. I made a bunch
> of changes.

Cool, I'll look forward to the pull requests then. :)

> Flush is being called. I did find one of the problems. Most clients
> expect the header to arrive in a single packet. The problem in
> AsyncConnection is that it was sending the first line as a single
> packet, then following up with the rest of the header. I altered the
> code (in a later patch than what I sent) to add the first line
> straight into the header buffer. That fixed things.

Interesting. So how does it change the interface, though? And how do
you intend to support sending '100 Continue' responses from the handler
with this change?

>> As far as the spec goes, the server may send a "Connection: close" at
>> any time to signal to the client that it will initiate a close to
>> complete the transfer. Also, if you're using Transfer-Encoding:
>> Chunked, I think there's a special sequence to indicate the end of
>> the response -- I need to cite that part, although an "empty" chunk
>> usually signals the end of the message.
>>
>> If you just blindly keep the connection alive though, then the answer
>> is the last request from the client side *should* have a "Connection:
>> close" to let the server know that this is the last request in the
>> pipeline. If it doesn't, the client may have to wait indefinitely for
>> the server to close the connection while the server is stuck in a
>> state where it would assume that there might be more requests that
>> may come into the connection.
>
> Ah. That last bit is a really useful piece of information. If all
> clients conform to that, then it will make this job much easier.

It's recommended by the spec that this be the mode of operation, but if
you think about it, just having a 'Content-Length' field in the HTTP
response should allow the client to avoid waiting for the connection to
be closed, regardless of whether requests were pipelined or not. Does
that make sense?

>>
>> Thank you too and I definitely hope to hear from you soon!
>
> Thanks. I would like to experiment with adding request body
> processing; even if it doesn't end up in the AsyncConnection, there is
> a lot of work to be done here. I am not looking forward to CHUNK
> processing yet, but I will get to that this weekend (my day job and 2
> kids keep me rather occupied during the week).

Cool. :) I know how that goes -- with family and having a day job. In
my case, though, the lack of a powerful enough machine to do
template-heavy development on is killing my productivity. Hopefully I
can remedy that soon now that the wife has agreed to invest in a more
modern machine. :D

Have a good one, Rick, and I look forward to your pull requests soon!

--
Dean Michael Berris
http://about.me/deanberris