From: Stephen D. <sd...@gm...> - 2006-01-02 08:21:29
On 12/31/05, Vlad Seryakov <vl...@cr...> wrote:
>
> > I think we talked about this before, but I can't find it in the
> > mailing list archive. Anyway, the problem with recording the upload
> > process is all the locking that's required. You could minimize this,
> > e.g. by only recording uploads above a certain size, or to a certain
> > URL.
>
> Yes, that is true, but at the same time, GET requests do not carry a
> body, so no locking will happen. POST uploads are very slow processes,
> in the sense that the user expects to wait until they finish, and
> locking will happen only for uploads. It is possible to minimize it
> further by locking only a few times instead of on every read, but
> avoiding locks entirely is not possible.

POST requests with small amounts of data are really common. Think of a
user logging in via a web form: a couple of bytes. The way you've coded
it at the moment (and I realize it's just a first cut), every request
with more than 0 bytes of body content will cause the upload stats code
to fire.

That's why I suggested we may want to have some lower threshold, or
restrict it to certain URLs via urlspecific data. We don't want to track
every form submission.

> > It reminds me of a similar problem we had. Spooling large uploads to disk:
> >
> > https://sourceforge.net/mailarchive/forum.php?thread_id=7524448&forum_id=43966
> >
> > Vlad implemented the actual spooling, but moving that work into the
> > conn threads, reading lazily, is still to be done.
> >
> > Lazy uploading is exactly the hook you need to track upload progress.
> > The client starts to upload a file. Read-ahead occurs in the driver
> > thread, say 8k. Control is passed to a conn thread, which then calls
> > Ns_ConnContent(). The remaining content is read from the client, in
> > the context of the conn thread and so not blocking the driver thread,
> > and perhaps the content is spooled to disk.
> >
> > To implement upload tracking you would register a proc for /upload
> > which, instead of calling Ns_ConnContent(), calls Ns_ConnRead()
> > multiple times, recording the number of bytes read in the upload
> > tracking cache, and saving the data to disk or wherever.
> >
> > A lot more control of the upload process is needed, whether it be to
> > control size, access, to record stats, or something we haven't thought
> > of yet. If we complete the work to get lazy reading from the client
> > working, an upload tracker will be an easy module to write.
>
> One problem I see with lazy uploads is that if you have multiple clients
> doing large POSTs, spawning multiple conn threads that spend a long time
> reading that content will waste resources. Each conn thread is heavy,
> with a Tcl interp.

Not necessarily. You can create another thread pool and then ensure
that those threads never run Tcl code. Remember, Tcl interps are
allocated lazily on first use. We could also adjust stacksize per
thread pool.

> Using the driver thread to read small chunks from the connections
> and put them into a file will keep everything smooth. But with small
> uploads on a fast network this may not be an issue, so it needs a
> compromise solution here, maybe configurable options. Currently,
> spooling into a file can be disabled/enabled; lazy spooling may be
> implemented in a similar way. Actually, lazy file spooling can be easily
> done, because even Ns_ConnContent calls SockRead, which does the
> spooling. We just need to introduce an option that tells how much we
> should spool in the main thread before continuing in the conn thread.

We already have maxreadahead, which is the amount of data read by the
driver thread into a memory buffer. I think you're either happy letting
the driver thread block writing to disk, or you're not. Why would you
set a threshold on this?
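
To make the threshold and "lock only every few reads" ideas concrete,
here is a rough, self-contained sketch. It is not server code: ReadProc
and FakeRead are hypothetical stand-ins for a per-connection read call
such as Ns_ConnRead(), and TRACK_MIN_BYTES and TRACK_EVERY_READS are
made-up tunables, not existing config options.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tunables; neither is an existing server option. */
#define TRACK_MIN_BYTES   (64 * 1024)  /* ignore small form posts */
#define TRACK_EVERY_READS 8            /* publish progress every N reads */

/* Shared progress record; a status request would read it under the lock. */
typedef struct UploadStat {
    pthread_mutex_t lock;
    size_t          received;  /* bytes published so far */
    int             updates;   /* number of locked updates performed */
} UploadStat;

/* Stand-in for a per-connection read call (assumption, not a real API). */
typedef size_t (*ReadProc)(void *src, char *buf, size_t toread);

/* Publish the running total; this is the only place the lock is taken. */
static void Publish(UploadStat *stat, size_t received)
{
    pthread_mutex_lock(&stat->lock);
    stat->received = received;
    stat->updates++;
    pthread_mutex_unlock(&stat->lock);
}

/*
 * Read a request body of `length` bytes in chunks, as a conn thread
 * would. Bodies below the threshold never touch the shared record.
 * Returns the number of bytes actually read.
 */
size_t ReadTracked(ReadProc readProc, void *src, size_t length,
                   UploadStat *stat)
{
    char   buf[8192];
    size_t total = 0;
    int    reads = 0;
    int    tracked = (length >= TRACK_MIN_BYTES);

    assert(readProc != NULL && stat != NULL);

    while (total < length) {
        size_t want = length - total;
        if (want > sizeof(buf)) {
            want = sizeof(buf);
        }
        size_t n = readProc(src, buf, want);
        if (n == 0) {
            break;              /* client went away */
        }
        total += n;
        /* A real handler would spool buf to disk here. */
        if (tracked && ++reads % TRACK_EVERY_READS == 0) {
            Publish(stat, total);
        }
    }
    if (tracked) {
        Publish(stat, total);   /* make the final count visible */
    }
    return total;
}

/* Demo source: pretends a client still has `remaining` bytes to send. */
typedef struct FakeClient {
    size_t remaining;
} FakeClient;

size_t FakeRead(void *src, char *buf, size_t toread)
{
    FakeClient *c = src;
    size_t n = toread < c->remaining ? toread : c->remaining;

    memset(buf, 0, n);
    c->remaining -= n;
    return n;
}
```

With these numbers a 200-byte login form takes no lock at all, and a
1 MB upload read in 8k chunks takes the lock a handful of times rather
than once per read. In a real module the loop would sit in a proc
registered for /upload and readProc would be replaced by the actual
connection read.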