From: Avi K. <av...@qu...> - 2008-04-22 15:06:02

Anthony Liguori wrote:
>> If I submit sequential O_DIRECT reads with aio_read(), will they enter
>> the device read queue in the same order, and reach the disk in that
>> order (allowing for reordering when worthwhile by the elevator)?
>
> There's no guarantee that any sort of order will be preserved by AIO
> requests. The same is true with writes. This is what fdsync is for,
> to guarantee ordering.

I believe he'd like a hint to get good scheduling, not a guarantee.
With a thread pool, if the threads are scheduled out of order, so are
your requests. If the elevator doesn't plug the queue, the first few
requests may not be optimally sorted.

--
error compiling committee.c: too many arguments to function

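For reference, the pattern under discussion looks roughly like the
sketch below: queue several strictly sequential O_DIRECT reads with
POSIX aio_read(), then collect them. The file name, block size, and
queue depth are illustrative only, not taken from QEMU. Compile with
-lrt.

/* Minimal sketch: sequential O_DIRECT reads queued with aio_read().
 * Nothing here guarantees the requests reach the device queue in
 * submission order, which is exactly the point being debated. */
#define _GNU_SOURCE
#include <aio.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK (64 * 1024)   /* O_DIRECT needs aligned size/offset */
#define DEPTH 8

int main(void)
{
    int fd = open("disk.img", O_RDONLY | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    struct aiocb cbs[DEPTH];
    const struct aiocb *list[DEPTH];

    for (int i = 0; i < DEPTH; i++) {
        void *buf;
        if (posix_memalign(&buf, 4096, BLOCK))  /* O_DIRECT alignment */
            return 1;
        memset(&cbs[i], 0, sizeof(cbs[i]));
        cbs[i].aio_fildes = fd;
        cbs[i].aio_buf    = buf;
        cbs[i].aio_nbytes = BLOCK;
        cbs[i].aio_offset = (off_t)i * BLOCK;   /* strictly sequential */
        if (aio_read(&cbs[i]) < 0) { perror("aio_read"); return 1; }
        list[i] = &cbs[i];
    }

    /* Wait for all requests; NULL list entries are ignored. */
    for (int done = 0; done < DEPTH; ) {
        aio_suspend(list, DEPTH, NULL);
        for (int i = 0; i < DEPTH; i++)
            if (list[i] && aio_error(&cbs[i]) != EINPROGRESS) {
                aio_return(&cbs[i]);
                list[i] = NULL;
                done++;
            }
    }
    return 0;
}
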
From: Jamie L. <ja...@sh...> - 2008-04-22 15:23:32

Avi Kivity wrote:
> Anthony Liguori wrote:
>>> If I submit sequential O_DIRECT reads with aio_read(), will they enter
>>> the device read queue in the same order, and reach the disk in that
>>> order (allowing for reordering when worthwhile by the elevator)?
>>
>> There's no guarantee that any sort of order will be preserved by AIO
>> requests. The same is true with writes. This is what fdsync is for,
>> to guarantee ordering.
>
> I believe he'd like a hint to get good scheduling, not a guarantee.
> With a thread pool, if the threads are scheduled out of order, so are
> your requests. If the elevator doesn't plug the queue, the first few
> requests may not be optimally sorted.

That's right. Then they tend to settle into a good order. But any delay
in scheduling one of the threads, or a signal received by one of them,
can make it lose order briefly, making the streaming stutter as the
disk performs a few local seeks until it settles into a good order
again.

You can mitigate the disruption in various ways:

1. If all threads share an "offset" variable, and each reads and
   increments it atomically just prior to calling pread(), that helps,
   especially at the start. (If threaded I/O is used for QEMU disk
   emulation, I would suggest doing that, in the more general form of
   popping a request from QEMU's internal shared queue at the last
   moment.)

2. Using more threads helps keep it sustained, at the cost of more
   wasted I/O when there's a cancellation (changed mind), and more
   memory.

However, AIO could in principle (if not in current implementations...)
keep the suggested I/O order better than threads, without special
tricks.

--
Jamie

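A minimal sketch of the shared-offset trick from point 1 above,
assuming a plain pthread pool issuing pread() against one file
descriptor; the file name, sizes, and thread count are illustrative.
Each worker claims the next block atomically at the last possible
moment, so requests are handed to the kernel in near-sequential order
even if the threads themselves wake out of order. Compile with
-pthread.

/* Workers atomically claim the next offset just before pread(), so a
 * late-waking thread doesn't reorder the stream. Illustrative names. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define BLOCK    (64 * 1024)
#define NBLOCKS  1024
#define NTHREADS 4

static int fd;
static atomic_long next_block;   /* the shared "offset" variable */

static void *worker(void *arg)
{
    (void)arg;
    void *buf;
    if (posix_memalign(&buf, 4096, BLOCK))   /* O_DIRECT alignment */
        return NULL;

    for (;;) {
        /* Claim the next block at the last possible moment. */
        long blk = atomic_fetch_add(&next_block, 1);
        if (blk >= NBLOCKS)
            break;
        if (pread(fd, buf, BLOCK, (off_t)blk * BLOCK) < 0)
            perror("pread");
    }
    free(buf);
    return NULL;
}

int main(void)
{
    fd = open("disk.img", O_RDONLY | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    pthread_t tids[NTHREADS];
    for (int i = 0; i < NTHREADS; i++)
        pthread_create(&tids[i], NULL, worker, NULL);
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(tids[i], NULL);
    return 0;
}
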
From: Marcelo T. <mto...@re...> - 2008-04-18 15:06:53

On Thu, Apr 17, 2008 at 02:26:52PM -0500, Anthony Liguori wrote:
> This patch introduces a Linux-aio backend that is disabled by default.
> To use this backend effectively, the user should disable caching and
> select it with the appropriate -aio option. For instance:
>
> qemu-system-x86_64 -drive foo.img,cache=off -aio linux
>
> There's no universal way to wait asynchronously with linux-aio. At
> some point, signals were added to signal completion. More recently, an
> eventfd interface was added. This patch relies on the latter.
>
> We try hard to detect whether the right support is available in
> configure to avoid compile failures.

> +    do {
> +        err = io_submit(aio_ctxt_id, 1, iocbs);
> +    } while (err == -1 && errno == EINTR);
> +
> +    if (err != 1) {
> +        fprintf(stderr, "failed to submit aio request: %m\n");
> +        exit(1);
> +    }
> +
> +    outstanding_requests++;
> +
> +    return &aiocb->common;
> +}
> +
> +static void la_wait(void)
> +{
> +    main_loop_wait(10);
> +}

Sleeping in the context of vcpus is extremely bad (e.g. virtio-block
blocks in write() throttling, which kills performance). It should wait
on I/O completions instead (qemu-kvm.c creates a pthread "waitqueue" to
resolve that issue).

Other than that it looks fine to me; I will give it a try.

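For illustration, "wait on I/O completions instead" could look roughly
like the sketch below: a dedicated thread blocks in io_getevents() and
wakes any waiters, rather than the vcpu thread sleeping in
main_loop_wait(). This is an assumption-laden sketch, not the actual
qemu-kvm.c waitqueue code; everything except the libaio calls is made
up. Compile with -laio -pthread.

/* NOT the real qemu-kvm.c code: a sketch of blocking on completions
 * instead of sleeping on a timeout. */
#include <libaio.h>
#include <pthread.h>

static io_context_t aio_ctxt_id;   /* set up elsewhere with io_setup() */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  completion = PTHREAD_COND_INITIALIZER;
static int outstanding_requests;

static void *completion_thread(void *arg)
{
    (void)arg;
    struct io_event events[64];

    for (;;) {
        /* Blocks until at least one request completes; no polling. */
        int n = io_getevents(aio_ctxt_id, 1, 64, events, NULL);
        if (n <= 0)
            continue;

        pthread_mutex_lock(&lock);
        for (int i = 0; i < n; i++) {
            /* events[i].obj is the completed iocb; a real backend
             * would run its callback here. */
            outstanding_requests--;
        }
        pthread_cond_broadcast(&completion);   /* wake any waiters */
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

/* What la_wait() could do instead of main_loop_wait(10): */
static void la_wait(void)
{
    pthread_mutex_lock(&lock);
    while (outstanding_requests > 0)
        pthread_cond_wait(&completion, &lock);
    pthread_mutex_unlock(&lock);
}
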
From: Anthony L. <ali...@us...> - 2008-04-18 15:19:51

Marcelo Tosatti wrote:
> On Thu, Apr 17, 2008 at 02:26:52PM -0500, Anthony Liguori wrote:
>> This patch introduces a Linux-aio backend that is disabled by default.
>> To use this backend effectively, the user should disable caching and
>> select it with the appropriate -aio option. For instance:
>>
>> qemu-system-x86_64 -drive foo.img,cache=off -aio linux
>>
>> There's no universal way to wait asynchronously with linux-aio. At
>> some point, signals were added to signal completion. More recently, an
>> eventfd interface was added. This patch relies on the latter.
>>
>> We try hard to detect whether the right support is available in
>> configure to avoid compile failures.
>>
>> +    do {
>> +        err = io_submit(aio_ctxt_id, 1, iocbs);
>> +    } while (err == -1 && errno == EINTR);
>> +
>> +    if (err != 1) {
>> +        fprintf(stderr, "failed to submit aio request: %m\n");
>> +        exit(1);
>> +    }
>> +
>> +    outstanding_requests++;
>> +
>> +    return &aiocb->common;
>> +}
>> +
>> +static void la_wait(void)
>> +{
>> +    main_loop_wait(10);
>> +}
>
> Sleeping in the context of vcpus is extremely bad (e.g. virtio-block
> blocks in write() throttling, which kills performance). It should wait
> on I/O completions instead (qemu-kvm.c creates a pthread "waitqueue"
> to resolve that issue).
>
> Other than that it looks fine to me; I will give it a try.

FWIW, I'm not getting wonderful results in KVM. It's hard to tell,
though, because time seems wildly inaccurate (even with kvm-clock in
the guest). The time issue appears unrelated to this set of patches.

Regards,

Anthony Liguori

From: Marcelo T. <mto...@re...> - 2008-04-18 17:43:05

On Fri, Apr 18, 2008 at 10:18:33AM -0500, Anthony Liguori wrote:
>> Sleeping in the context of vcpus is extremely bad (e.g. virtio-block
>> blocks in write() throttling, which kills performance). It should
>> wait on I/O completions instead (qemu-kvm.c creates a pthread
>> "waitqueue" to resolve that issue).
>>
>> Other than that it looks fine to me; I will give it a try.
>
> FWIW, I'm not getting wonderful results in KVM. It's hard to tell,
> though, because time seems wildly inaccurate (even with kvm-clock in
> the guest). The time issue appears unrelated to this set of patches.

Oh, you won't get completion signals on the aio eventfd. You might want
to try the select-with-timeout() stuff. I will submit that with proper
signalfd emulation shortly.

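A rough sketch of what select-with-timeout() on the aio eventfd could
look like, assuming each iocb was registered with io_set_eventfd() so
that completions tick the eventfd counter. This is illustrative only,
not Marcelo's actual patch; the names and the 10ms timeout are made up.
Compile with -laio.

/* Main-loop sketch: select() on the linux-aio eventfd with a timeout
 * as a fallback, then harvest completions non-blockingly. */
#include <libaio.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/select.h>
#include <unistd.h>

static io_context_t aio_ctxt_id;   /* from io_setup() */
static int aio_efd;                /* from eventfd(0, 0); each iocb
                                      registered via io_set_eventfd() */

static void wait_for_completions(void)
{
    fd_set rfds;
    struct timeval tv = { .tv_sec = 0, .tv_usec = 10000 };   /* 10ms */

    FD_ZERO(&rfds);
    FD_SET(aio_efd, &rfds);

    if (select(aio_efd + 1, &rfds, NULL, NULL, &tv) > 0) {
        uint64_t count;
        struct io_event events[64];

        /* The eventfd counter holds the number of completions. */
        if (read(aio_efd, &count, sizeof(count)) == sizeof(count)) {
            struct timespec ts = { 0, 0 };   /* harvest without blocking */
            int n = io_getevents(aio_ctxt_id, 0, 64, events, &ts);
            for (int i = 0; i < n; i++) {
                /* handle events[i].obj completion here */
            }
        }
    }
    /* On timeout, fall through and let the main loop run anyway. */
}
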