From: Blaisorblade <bla...@ya...> - 2007-03-29 00:37:24
On Wednesday 28 March 2007, Jeff Dike wrote:
> [ This patch needs to get into 2.6.21, as it fixes a serious bug
> introduced soon after 2.6.20 ]
>
> Commit 62f96cb01e8de7a5daee472e540f726db2801499 introduced per-devices
> queues and locks, which was fine as far as it went, but left in place
> a global which controlled access to submitting requests to the host.
> This should have been made per-device as well, since it causes I/O
> hangs when multiple block devices are in use.
>
> This patch fixes that by replacing the global with an activity flag in
> the device structure in order to tell whether the queue is currently
> being run.

Finally that variable has an understandable name.

However, in a mail from Jens Axboe titled
"Re: [uml-devel] [PATCH 06/11] uml ubd driver: ubd_io_lock usage fixup",
with Date: Mon, 30 Oct 2006 09:26:48 +0100, he suggested removing this flag
altogether, so we may explore this for the future:

> > Add some comments about requirements for ubd_io_lock and expand its use.
> >
> > When an irq signals that the "controller" (i.e. another thread on the
> > host, which does the actual requests and is the only one blocked on I/O
> > on the host) has done some work, we call again the request function
> > ourselves (do_ubd_request).
> >
> > We now do that with ubd_io_lock held - that's useful to protect against
> > concurrent calls to elv_next_request and so on.
>
> Not only useful, required, as I think I complained about a year or more
> ago :-)
>
> > XXX: Maybe we shouldn't call at all the request function. Input needed on
> > this. Are we supposed to plug and unplug the queue? That code
> > "indirectly" does that by setting a flag, called do_ubd, which makes the
> > request function return (it's a residual of 2.4 block layer interface).
>
> Sometimes you need to. I'd probably just remove the do_ubd check and
> always recall the request function when handling completions, it's
> easier and safe.
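
To make the hang Jeff describes concrete, here is a tiny userspace model of
it. This is only a sketch under my own assumptions, not the driver code:
struct model_dev, run_queue(), complete() and stuck_on_b() are hypothetical
names, and the "host" is reduced to one in-flight request per flag. It
assumes that a completion reruns only the queue of the device that
completed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one block device; NOT the real ubd structure. */
struct model_dev {
    bool running;   /* per-device activity flag */
    int pending;    /* requests still waiting in this device's queue */
    int submitted;  /* requests handed to the host so far */
};

/* The old scheme: a single flag shared by every device. */
static bool global_running;

/* Run a device's queue; `flag` is whichever activity flag guards it. */
static void run_queue(struct model_dev *dev, bool *flag)
{
    assert(dev != NULL && flag != NULL);
    if (*flag || dev->pending == 0)
        return;          /* queue already being run, or nothing to do */
    *flag = true;        /* one request is now in flight */
    dev->pending--;
    dev->submitted++;
}

/* Completion: clear the flag and rerun only the completing device's queue. */
static void complete(struct model_dev *dev, bool *flag)
{
    *flag = false;
    run_queue(dev, flag);
}

/* Returns how many requests stay stuck on device b after a completes. */
static int stuck_on_b(bool per_device)
{
    struct model_dev a = { false, 1, 0 };
    struct model_dev b = { false, 1, 0 };
    bool *fa = per_device ? &a.running : &global_running;
    bool *fb = per_device ? &b.running : &global_running;

    global_running = false;
    run_queue(&a, fa);   /* a submits its request */
    run_queue(&b, fb);   /* b tries to submit while a is in flight */
    complete(&a, fa);    /* a's request completes */
    return b.pending;
}
```

With the shared flag, stuck_on_b(false) leaves b's request queued with
nothing left to restart it; with the per-device flag, stuck_on_b(true)
leaves nothing behind. Always recalling the request function on completion,
as Jens suggests, would make the flag unnecessary altogether.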

Anyway, the main speedups to do on the UBD driver are:

* implement write barriers (so much less fsync) - this is performance
  killer n.1
* possibly use the new 2.6 request layout with scatter/gather I/O, and
  vectorized I/O on the host
* while vectorizing I/O, use async I/O
* avoid passing requests on pipes (n.2) - on fast disks, I/O becomes
  CPU-bound.
  To make a different but related example, with a SpeedScale laptop, it's
  interesting to double the CPU frequency and observe tuntap speed double too
  (with 1GHz I get on TCP numbers like 150 Mbit/s - 100 Mbit/s, depending on
  whether UML transmits or receives data; with 2GHz double rates).
  Update: I now get 150 Mbit/s / 200 Mbit/s (UML receives / UML sends) at
  1GHz, and still the double at 2GHz. This is a different UML though.
* use futexes instead of pipes for synchronization (required for the
  previous one).

-- 
Inform me of my mistakes, so I can add them to my list!
Paolo Giarrusso, aka Blaisorblade
http://www.user-mode-linux.org/~blaisorblade