From: Rick R. <ric...@gm...> - 2011-03-21 18:59:50
On Mon, Mar 21, 2011 at 10:08 AM, Dean Michael Berris <mik...@gm...> wrote:

> Hi Rick!
>
> ...
>
> Okay, but how do you deal with the parsing? Do you reset the parser
> state per connection?
>

Yes, that is my current approach. Oddly enough, I have it partially
working. I have a test which pipelines two requests. Currently it gets the
first request and responds correctly. The second request is read and
processed by handle_read_data but is not correctly parsed, and I'm not sure
why yet. I just need to spend a bit more time with it.

> Asynchronous server handlers can call connection.write(...) multiple
> times per request. The calls to connection.read(...) in the case of a
> handler that wants to read up data from the body of a POST/PUT request
> will also have to deal with the boundary issues -- granted that the
> HTTP/1.1 spec doesn't allow pipelining for POST/PUT requests, we'd
> want to be able to let the handler recover or at least deal with
> invalid data appropriately.
>

Yeah, this is a tricky point, because we're basically relying on the top
layer of the stack to complete parsing for us. It is good that the spec
disallows pipelining POSTs and PUTs, but either way I would like to process
the entire body in the Connection object, because there is a *ton* of
complicated parsing logic I would like to put in for POSTs: there is a new
type of socket-exhaustion attack which, while correct according to the HTTP
spec, slowly dribbles data into a multipart POST. I would like to add
countermeasures against that as well as other DoS attacks (see
http://www.ecl-labs.org/2011/03/17/roboo-http-mitigator.html).

> Have you handled this? (Granted, I haven't looked at your patch just yet.)
>

I don't handle this well enough yet, apparently.

> > If there are no pending additional requests, then the async_read_some
> > will block indefinitely.
> > The second phase of the plan, then, is to register a deadline timer at
> > the start of the new connection. Every new request that gets read will
> > kick the deadline timer back up to its max timeout time. When the timer
> > does go off, it cancels any operations that are pending on the
> > Connection's socket. This does two things: first and foremost, it keeps
> > a hold of the Connection's shared_ptr, so that it is guaranteed not to
> > be destroyed while the keepalive is in effect. Secondly, it quite
> > obviously will free the async_read_some from its wait.
>
> That sounds alright, although we might have to guard for a race
> somehow there when the handler is actually reading from the same
> connection (in the case of a POST/PUT handler). Basically we have to
> know whether the timeout happened while the handler is waiting for a
> POST/PUT body or whether the timeout happened while at the boundaries
> of the GET requests.
>

If we process the multipart POST body in the Connection object, we can
reset the keepalive timer for every packet of data we receive; if the
client waits more than TIMEOUT seconds between packets, I would like to
kick them off anyway. (A rough sketch of this timer pattern is further
down in this message.)

> > I haven't bothered to fork cpp-netlib on github so I don't have a
> > checkin to make, but I did create a 'git diff' from boost/. I have
> > attached it here for your perusal.
> > Please let me know your preferred approach for reviewing and accepting
> > patches (I am assuming a pull request, which I can get around to).
>
> You assume right that I would much rather prefer looking at diffs over
> at Github, so a pull request is going to be most appreciated.
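Going back to the deadline-timer plan quoted above, here is a rough
Boost.Asio sketch of that kind of keepalive timeout. It is only an
illustration under names of my own choosing (connection, keepalive_timer_,
kick_keepalive, a hard-coded 30-second timeout), not code from the attached
patch or from cpp-netlib's AsyncConnection:

#include <boost/asio.hpp>
#include <boost/bind.hpp>
#include <boost/enable_shared_from_this.hpp>

struct connection : boost::enable_shared_from_this<connection> {
    explicit connection(boost::asio::io_service & io)
        : socket_(io), keepalive_timer_(io) {}

    void start() {
        kick_keepalive();   // arm the keepalive timer as soon as the connection starts
        start_read();       // issue the usual async_read_some for the next request
    }

    // Call this after every parsed request (or, if the body is parsed in the
    // connection, after every packet of body data): it pushes the deadline
    // back out to the full timeout.
    void kick_keepalive() {
        keepalive_timer_.expires_from_now(boost::posix_time::seconds(30));
        // Binding shared_from_this() keeps the connection alive while the
        // timer is pending.
        keepalive_timer_.async_wait(
            boost::bind(&connection::handle_keepalive_timeout,
                        shared_from_this(),
                        boost::asio::placeholders::error));
    }

    void handle_keepalive_timeout(boost::system::error_code const & ec) {
        if (ec == boost::asio::error::operation_aborted)
            return;   // the timer was kicked or cancelled, not expired
        // The client went quiet for too long: cancel the pending
        // async_read_some and close the socket.
        boost::system::error_code ignored;
        socket_.cancel(ignored);
        socket_.close(ignored);
    }

    void start_read() { /* socket_.async_read_some(...) as usual */ }

    boost::asio::ip::tcp::socket socket_;
    boost::asio::deadline_timer keepalive_timer_;
};

The object has to be owned by a boost::shared_ptr for shared_from_this() to
work; because the bound handler holds that shared_ptr, the connection can't
be destroyed while the keepalive wait is pending, and each call to
expires_from_now() cancels the previous async_wait with operation_aborted,
which is why the timeout handler ignores that error code.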
I will do that. That patch is already outdated anyway; I made a bunch of
changes.

> > 3. In attempting to benchmark my changes, I did note that yes, keepalive
> > works, and also that it breaks httperf. My problem is that httperf
> > doesn't seem to recognize a completed response unless the server closes
> > the connection. It claims to be a 1.1-compliant tester, and when it
> > sends requests it doesn't mention a Connection: close in its headers,
> > so I'm assuming it should expect the connection to remain open.
>
> I think you need to flush the data from the server end. I forget if
> TCP_NODELAY is turned on by default and whether httperf (I really
> don't know what that is, you have to pardon me, I use Apache's ab for
> performance testing) actually has a short timeout setting for data
> transfer or whether the buffers are small enough to avoid buffer
> starvation. How much data are you sending from the server side during
> the benchmarks?
>

Flush is being called. I did find one of the problems: most clients expect
the header to arrive in a single packet. The problem in AsyncConnection was
that it sent the first line as its own packet and then followed up with the
rest of the header. I altered the code (in a later patch than the one I
sent) to add the first line straight into the header buffer, and that fixed
things.

> > Does anyone know the correct procedure here, or why it might be thinking
> > that the responses have not completed? My concern here is that there is
> > some nuance of the HTTP 1.1 spec that I may have missed which allows
> > servers to delineate between responses since, with pipelining, multiple
> > responses may come all at once over the same connection.
>
> As far as the spec goes, the server may send a "Connection: close" at
> any time to signal to the client that it will initiate a close to
> complete the transfer. Also if you're using Transfer-Encoding:
> Chunked, I think there's a special sequence to indicate the end of the
> response -- I need to cite that part, although an "empty" chunk
> usually signals the end of the message.
>
> If you just blindly keep the connection alive though, then the answer
> is the last request from the client side *should* have a "Connection:
> close" to let the server know that this is the last request in the
> pipeline. If it doesn't, the client may have to wait indefinitely for
> the server to close the connection while the server is stuck in a
> state where it would assume that there might be more requests that may
> come into the connection.
>

Ah, that last bit is a really useful piece of information. If all clients
conform to that, it will make this job much easier.

> Thank you too and I definitely hope to hear from you soon!
>

Thanks. I would like to experiment with adding request-body processing;
even if it doesn't end up in AsyncConnection, there is a lot of work to be
done here. I am not looking forward to chunk processing yet, but I will get
to it this weekend. (My day job and two kids keep me rather occupied during
the week.)
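For reference, the "special sequence" Dean mentions for chunked responses is
spelled out in RFC 2616, section 3.6.1: each chunk is sent as its size in
hex, CRLF, the data, CRLF, and a zero-length chunk ("0" CRLF CRLF,
optionally with trailers) marks the end of the message, which is how a
client knows a keepalive response is complete without the server closing
the connection. A tiny, purely illustrative sketch of that framing
(make_chunk and last_chunk are my own names, not cpp-netlib code):

#include <sstream>
#include <string>

// Frame one chunk of a "Transfer-Encoding: chunked" body:
// "<size in hex>\r\n<data>\r\n".
std::string make_chunk(std::string const & data) {
    std::ostringstream chunk;
    chunk << std::hex << data.size() << "\r\n" << data << "\r\n";
    return chunk.str();
}

// The zero-length chunk (with no trailers) that terminates the message.
std::string last_chunk() {
    return "0\r\n\r\n";
}

// Example body on the wire:
//   make_chunk("Hello, ") + make_chunk("world!") + last_chunk()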