From: Avi K. <av...@qu...> - 2008-04-22 16:27:24
Anthony Liguori wrote:
> I think we need to decide what we want to target in terms of upper
> limits.
>
> With a bridge or two, we can probably easily do 128.
>
> If we really want to push things, I think we should do a PCI based
> virtio controller.  I doubt a large number of PCI devices is ever
> going to perform very well b/c of interrupt sharing and some of the
> assumptions in virtio_pci.
>
> If we implement a controller, we can use a single interrupt, but
> multiplex multiple notifications on that single interrupt.  We can
> also be more aggressive about using shared memory instead of PCI
> config space which would reduce the overall number of exits.
>
> We could easily support a very large number of devices this way.  But
> again, what do we want to target for now?

I think that for networking we should keep things as is. I don't see
anybody using 100 virtual NICs.

For mass storage, we should follow the SCSI model with a single device
serving multiple disks, similar to what you suggest. Not sure if the
device should have a single queue or one queue per disk.

--
error compiling committee.c: too many arguments to function

From: H. P. A. <hp...@zy...> - 2008-04-22 16:23:13
Nguyen Anh Quynh wrote:
> Hi,
>
> I am thinking about comibing this ROM with the extboot. Both two ROM
> are about "booting", so I think that is reasonable. So we will have
> only 1 ROM that supports both external boot and Linux boot.
>
> Is that desirable or not?

Does it make the code simpler and easier to understand?  If not, then I
would say no.

	-hpa

From: Hollis B. <ho...@us...> - 2008-04-22 16:22:57
On Tuesday 22 April 2008 06:22:48 Avi Kivity wrote:
> Rusty Russell wrote:
> > [Christian, Hollis, how much is this ABI breakage going to hurt you?]
> >
> > A recent proposed feature addition to the virtio block driver revealed
> > some flaws in the API, in particular how easy it is to break big
> > endian machines.
> >
> > The virtio config space was originally chosen to be little-endian,
> > because we thought the config might be part of the PCI config space
> > for virtio_pci.  It's actually a separate mmio region, so that
> > argument holds little water; as only x86 is currently using the virtio
> > mechanism, we can change this (but must do so now, before the
> > impending s390 and ppc merges).
>
> This will probably annoy Hollis which has guests that can go both ways.

Rusty and I have discussed it. Ultimately, this just takes us from a
cross-architecture endianness definition to a per-architecture
definition. Anyways, we've already fallen into this situation with the
virtio ring data itself, so we're really saying "same endianness as the
ring".

--
Hollis Blanchard
IBM Linux Technology Center

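To make the two options being discussed concrete -- a config space fixed
as little-endian everywhere versus one in guest-native ("same endianness
as the ring") byte order -- here is a small hypothetical C sketch. It is
not from the thread or from the virtio code; the helper names are
invented, and le32toh() is the glibc conversion (kernel code would use
le32_to_cpu() instead).

    #include <stdint.h>
    #include <endian.h>   /* le32toh(); kernel code would use le32_to_cpu() */

    /* Option A: config fields defined as little-endian on every
     * architecture.  Big-endian guests must byte-swap on access. */
    static inline uint32_t cfg_read32_le(const volatile uint32_t *field)
    {
        return le32toh(*field);
    }

    /* Option B: config fields in guest-native order, matching the ring.
     * No swapping anywhere, but the layout is per-architecture. */
    static inline uint32_t cfg_read32_native(const volatile uint32_t *field)
    {
        return *field;
    }

The trade-off the thread describes falls directly out of the sketch:
option A needs a swap in every big-endian guest driver, option B needs
none but makes the on-"wire" layout depend on the guest architecture.
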
From: Anthony L. <an...@co...> - 2008-04-22 16:18:43
Marcelo Tosatti wrote:
>> Maybe require explicit device/function assignment on the command line?
>> It will be managed anyway.
>
> ACPI does support hotplugging of individual functions inside slots,
> not sure how well does Linux (and other OSes) support that.. should be
> transparent though.

I think we need to decide what we want to target in terms of upper
limits.

With a bridge or two, we can probably easily do 128.

If we really want to push things, I think we should do a PCI based
virtio controller.  I doubt a large number of PCI devices is ever going
to perform very well b/c of interrupt sharing and some of the
assumptions in virtio_pci.

If we implement a controller, we can use a single interrupt, but
multiplex multiple notifications on that single interrupt.  We can also
be more aggressive about using shared memory instead of PCI config
space which would reduce the overall number of exits.

We could easily support a very large number of devices this way.  But
again, what do we want to target for now?

Regards,

Anthony Liguori

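Purely as an illustration of the "single interrupt, multiplexed
notifications" idea: the controller layout, structure, and names below
are invented for this sketch and are not real virtio or QEMU interfaces.
A guest-side handler could drain a shared-memory bitmap of pending
queues instead of taking one interrupt line (or PCI function) per
device.

    #include <stdint.h>

    #define MAX_VDEVS 256

    /* Imagined controller-to-guest shared memory: bit N set means
     * virtual device N has pending work. */
    struct vctrl_shared {
        volatile uint64_t pending[MAX_VDEVS / 64];
    };

    /* One ISR for every device behind the (hypothetical) controller:
     * atomically grab and clear each bitmap word, then service every
     * device whose bit was set. */
    static void vctrl_isr(struct vctrl_shared *sh,
                          void (*service)(unsigned dev))
    {
        for (unsigned w = 0; w < MAX_VDEVS / 64; w++) {
            uint64_t bits = __atomic_exchange_n(&sh->pending[w], 0,
                                                __ATOMIC_ACQ_REL);
            while (bits) {
                unsigned bit = (unsigned)__builtin_ctzll(bits);
                bits &= bits - 1;           /* clear lowest set bit */
                service(w * 64 + bit);      /* run that device's queue */
            }
        }
    }

The design point is that adding a device then costs only a bit of shared
memory rather than another interrupt line or PCI function, which is what
makes "a very large number of devices" plausible.
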
From: Javier G. <ja...@gu...> - 2008-04-22 15:57:54
On Tue, Apr 22, 2008 at 3:10 AM, Avi Kivity <av...@qu...> wrote:
> I'm rooting for btrfs myself.

but could btrfs (when stable) work for migration?

i'm curious about OCFS2 performance on this kind of load... when i
manage to sell the idea of a KVM cluster i'd like to know if i should
try first EVMS-HA (cluster LV's) or OCFS (cluster FS)

--
Javier

From: Jamie L. <ja...@sh...> - 2008-04-22 15:37:21
Avi Kivity wrote:
> > And video streaming on some embedded devices with no MMU!  (Due to the
> > page cache heuristics working poorly with no MMU, sustained reliable
> > streaming is managed with O_DIRECT and the app managing cache itself
> > (like a database), and that needs AIO to keep the request queue busy.
> > At least, that's the theory.)
>
> Could use threads as well, no?

Perhaps.  This raises another point about AIO vs. threads:

If I submit sequential O_DIRECT reads with aio_read(), will they enter
the device read queue in the same order, and reach the disk in that
order (allowing for reordering when worthwhile by the elevator)?

With threads this isn't guaranteed and scheduling makes it quite likely
to issue the parallel synchronous reads out of order, and for them to
reach the disk out of order because the elevator doesn't see them
simultaneously.

With AIO (non-Glibc! (and non-kthreads)) it might be better at keeping
the intended issue order, I'm not sure.

It is highly desirable: O_DIRECT streaming performance depends on
avoiding seeks (no reordering) and on keeping the request queue
non-empty (no gap).

I read a man page for some other unix, describing AIO as better than
threaded parallel reads for reading tape drives because of this (tape
seeks are very expensive).  But the rest of the man page didn't say
anything more.  Unfortunately I don't remember where I read it.  I have
no idea whether AIO submission order is nearly always preserved in
general, or expected to be.

> It's me at fault here.  I just assumed that because it's easy to do aio
> in a thread pool efficiently, that's what glibc does.
>
> Unfortunately the code does some ridiculous things like not service
> multiple requests on a single fd in parallel.  I see absolutely no
> reason for it (the code says "fight for resources").

Ouch.  Perhaps that relates to my thought above, about multiple requests
to the same file causing seek storms when thread scheduling is unlucky?

> So my comments only apply to linux-aio vs a sane thread pool.  Sorry for
> spreading confusion.

Thanks.  I thought you'd measured it :-)

> It could and should.  It probably doesn't.
>
> A simple thread pool implementation could come within 10% of Linux aio
> for most workloads.  It will never be "exactly", but for small numbers
> of disks, close enough.

I would wait for benchmark results for I/O patterns like sequential
reading and writing, because of potential for seeks caused by request
reordering, before being confident of that.

> > Hmm.  Thanks.  I may consider switching to XFS now....
>
> I'm rooting for btrfs myself.

In the unlikely event they backport btrfs to kernel 2.4.26-uc0, I'll be
happy to give it a try!  :-)

-- Jamie

From: Jamie L. <ja...@sh...> - 2008-04-22 15:36:29
Avi Kivity wrote:
> > Perhaps.  This raises another point about AIO vs. threads:
> >
> > If I submit sequential O_DIRECT reads with aio_read(), will they enter
> > the device read queue in the same order, and reach the disk in that
> > order (allowing for reordering when worthwhile by the elevator)?
>
> Yes, unless the implementation in the kernel (or glibc) is threaded.
>
> > With threads this isn't guaranteed and scheduling makes it quite
> > likely to issue the parallel synchronous reads out of order, and for
> > them to reach the disk out of order because the elevator doesn't see
> > them simultaneously.
>
> If the disk is busy, it doesn't matter.  The requests will queue and the
> elevator will sort them out.  So it's just the first few requests that
> may get to disk out of order.

There are two cases where it matters to a read-streaming app:

1. The disk isn't busy with anything else, and maximum streaming
   performance is desired.

2. The disk is busy with unrelated things, but you're using I/O
   priorities to give the streaming app near-absolute priority.  Then
   you need to maintain overlapped streaming requests, otherwise the
   disk is given to a lower-priority I/O.  If that happens often, you
   lose; priority is ineffective.  Because one of the streaming requests
   is usually being serviced, the elevator has similar limitations as
   for a disk which is not busy with anything else.

> I haven't considered tape, but this is a good point indeed.  I expect it
> doesn't make much of a difference for a loaded disk.

Yes, as long as it's loaded with unrelated requests at the same I/O
priority, the elevator has time to sort requests and hide thread
scheduling artifacts.

Btw, regarding QEMU: QEMU gets requests _after_ sorting by the guest's
elevator, then submits them to the host's elevator.  If the guest and
host elevators are both configured 'anticipatory', do the anticipatory
delays add up?

-- Jamie

From: Marcelo T. <mto...@re...> - 2008-04-22 15:32:59
On Tue, Apr 22, 2008 at 05:51:51PM +0300, Avi Kivity wrote:
> Anthony Liguori wrote:
> > Avi Kivity wrote:
> >> Anthony Liguori wrote:
> >>> This patch changes virtio devices to be multi-function devices
> >>> whenever possible.  This increases the number of virtio devices we
> >>> can support now by a factor of 8.
> >>>
> >>> With this patch, I've been able to launch a guest with either 220
> >>> disks or 220 network adapters.
> >>
> >> Does this play well with hotplug?  Perhaps we need to allocate a new
> >> device on hotplug.
> >
> > Probably not.  I imagine you can only hotplug devices, not individual
> > functions?
>
> It sounds reasonable to expect so.  ACPI has objects for devices, not
> functions (IIRC).

So what I dislike about multifunction devices is the fact that a single
slot shares an IRQ, and that special code is required in the QEMU
drivers (virtio guest capability might not always be present).

I don't see any need for using them if we can extend PCI slots...

> Maybe require explicit device/function assignment on the command line?
> It will be managed anyway.

ACPI does support hotplugging of individual functions inside slots,
not sure how well does Linux (and other OSes) support that.. should be
transparent though.

From: Robin H. <ho...@sg...> - 2008-04-22 15:26:03
Andrew,

Could we get direction/guidance from you as regards the
invalidate_page() callout of Andrea's patch set versus the
invalidate_range_start/invalidate_range_end callout pairs of
Christoph's patchset?  This is only in the context of the __xip_unmap,
do_wp_page, page_mkclean_one, and try_to_unmap_one call sites.

On Tue, Apr 22, 2008 at 03:48:47PM +0200, Andrea Arcangeli wrote:
> On Tue, Apr 22, 2008 at 08:36:04AM -0500, Robin Holt wrote:
> > I am a little confused about the value of the seq_lock versus a simple
> > atomic, but I assumed there is a reason and left it at that.
>
> There's no value for anything but get_user_pages (get_user_pages takes
> its own lock internally though).  I preferred to explain it as a
> seqlock because it was simpler for reading, but I totally agree in the
> final implementation it shouldn't be a seqlock.  My code was meant to
> be pseudo-code only.  It doesn't even need to be atomic ;).

Unless there is additional locking in your fault path, I think it does
need to be atomic.

> > I don't know what you mean by "it'd" run slower and what you mean by
> > "armed and disarmed".
>
> 1) when armed the time-window where the kvm-page-fault would be
> blocked would be a bit larger without invalidate_page for no good
> reason

But that is a distinction without a difference.  In the _start/_end
case, kvm's fault handler will not have any _DIRECT_ blocking, but
get_user_pages() had certainly better block waiting for some other lock
to prevent the process's pages being refaulted.  I am no VM expert, but
that seems like it is critical to having a consistent virtual address
space.

Effectively, you have a delay on the kvm fault handler beginning when
either invalidate_page() is entered or invalidate_range_start() is
entered, until when the _CALLER_ of the invalidate* method has
unlocked.  That time will remain essentially identical for either case.
I would argue you would be hard pressed to even measure the difference.

> 2) if you were to remove invalidate_page when disarmed the VM could
> would need two branches instead of one in various places

Those branches are conditional upon there being list entries.  That
check should be extremely cheap.  The vast majority of cases will have
no registered notifiers.  The second check for the _end callout will be
from cpu cache.

> I don't want to waste cycles if not wasting them improves performance
> both when armed and disarmed.

In summary, I think we have narrowed down the case of no registered
notifiers to being infinitesimal, and the case of registered notifiers
to being a distinction without a difference.

> > When I was discussing this difference with Jack, he reminded me that
> > the GRU, due to its hardware, does not have any race issues with the
> > invalidate_page callout simply doing the tlb shootdown and not modifying
> > any of its internal structures.  He then put a caveat on the discussion
> > that _either_ method was acceptable as far as he was concerned.  The real
> > issue is getting a patch in that satisfies all needs and not whether
> > there is a seperate invalidate_page callout.
>
> Sure, we have that patch now, I'll send it out in a minute, I was just
> trying to explain why it makes sense to have an invalidate_page too
> (which remains the only difference by now), removing it would be a
> regression on all sides, even if a minor one.

I think GRU is the only compelling case I have heard for having the
invalidate_page separate.

In the case of the GRU, the hardware enforces a lifetime of the
invalidate which covers all in-progress faults, including ones where
the hardware is informed after the flush of a PTE.  In all cases, once
the GRU invalidate instruction is issued, all active requests are
invalidated.  Future faults will be blocked in get_user_pages().
Without that special feature of the hardware, I don't think any code
simplification exists.  I, of course, reserve the right to be wrong.

I believe the argument against a separate invalidate_page() callout was
Christoph's interpretation of Andrew's comments.  I am not certain
Andrew was aware of these special aspects of the GRU hardware and
whether that had been factored into the discussion at that point in
time.

Thanks,
Robin

From: Avi K. <av...@qu...> - 2008-04-22 15:24:31
Andrea Arcangeli wrote:
> On Tue, Apr 22, 2008 at 04:56:10PM +0200, Eric Dumazet wrote:
>> Andrea Arcangeli wrote:
>>> +
>>> +static int mm_lock_cmp(const void *a, const void *b)
>>> +{
>>> +	cond_resched();
>>> +	if ((unsigned long)*(spinlock_t **)a <
>>> +	    (unsigned long)*(spinlock_t **)b)
>>> +		return -1;
>>> +	else if (a == b)
>>> +		return 0;
>>> +	else
>>> +		return 1;
>>> +}
>>> +
>>
>> This compare function looks unusual...
>> It should work, but sort() could be faster if the
>> if (a == b) test had a chance to be true eventually...
>
> Hmm, are you saying my mm_lock_cmp won't return 0 if a==b?

You need to compare *a to *b (at least, that's what you're doing for
the < case).

--
error compiling committee.c: too many arguments to function

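For reference, a standalone, compilable illustration of the fix being
pointed out: compare the pointed-to addresses (*a and *b), not the array
slots themselves.  This is only a userspace sketch; cond_resched() and
the spinlock type from the actual patch are left out.

    #include <stdio.h>
    #include <stdlib.h>

    /* qsort()-style comparator over an array of pointers: compare the
     * pointed-to addresses (*a vs *b), not the slots (a vs b). */
    static int lock_ptr_cmp(const void *a, const void *b)
    {
        unsigned long la = (unsigned long)*(void * const *)a;
        unsigned long lb = (unsigned long)*(void * const *)b;

        if (la < lb)
            return -1;
        if (la > lb)
            return 1;
        return 0;
    }

    int main(void)
    {
        int x, y;
        void *locks[3] = { &y, &x, &y };

        qsort(locks, 3, sizeof(locks[0]), lock_ptr_cmp);
        /* duplicates now sit next to each other and compare equal */
        printf("%p %p %p\n", locks[0], locks[1], locks[2]);
        return 0;
    }

With the original `a == b` test the duplicate entries would never
compare equal, because the two array slots holding them are distinct
even when they point at the same lock.
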
From: Jamie L. <ja...@sh...> - 2008-04-22 15:23:32
Avi Kivity wrote:
> Anthony Liguori wrote:
> >> If I submit sequential O_DIRECT reads with aio_read(), will they enter
> >> the device read queue in the same order, and reach the disk in that
> >> order (allowing for reordering when worthwhile by the elevator)?
> >
> > There's no guarantee that any sort of order will be preserved by AIO
> > requests.  The same is true with writes.  This is what fdsync is for,
> > to guarantee ordering.
>
> I believe he'd like a hint to get good scheduling, not a guarantee.
> With a thread pool if the threads are scheduled out of order, so are
> your requests.
> If the elevator doesn't plug the queue, the first few requests may
> not be optimally sorted.

That's right.  Then they tend to settle to a good order.  But any delay
in scheduling one of the threads, or a signal received by one of them,
can make it lose order briefly, making the streaming stutter as the
disk performs a few local seeks until it settles to good order again.

You can mitigate the disruption in various ways:

1. If all threads share an "offset" variable, and each reads and
   increments it atomically just prior to calling pread(), that helps,
   especially at the start (a minimal sketch follows this message).
   (If threaded I/O is used for QEMU disk emulation, I would suggest
   doing that, in the more general form of popping a request from
   QEMU's internal shared queue at the last moment.)

2. Using more threads helps keep it sustained, at the cost of more
   wasted I/O when there's a cancellation (changed mind), and more
   memory.

However, AIO, in principle (if not implementations...) could be better
at keeping the suggested I/O order than threads, without special
tricks.

-- Jamie

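A minimal sketch of the "shared offset" trick from point 1 above,
assuming a file descriptor already opened with O_DIRECT and per-thread
buffers aligned elsewhere (the names and structure are mine, not from
the thread).  Each worker claims the next chunk atomically immediately
before calling pread(), so the kernel sees the first requests in offset
order even if the threads start in an arbitrary order.

    #define _XOPEN_SOURCE 700
    #include <pthread.h>
    #include <stdatomic.h>
    #include <unistd.h>

    #define CHUNK (1 << 20)              /* 1 MiB, a multiple of the block size */

    static _Atomic long long next_offset;   /* shared by all reader threads */
    static int fd;                          /* opened elsewhere with O_DIRECT */

    static void *reader(void *aligned_buf)
    {
        for (;;) {
            /* claim the next sequential chunk at the last possible moment */
            long long off = atomic_fetch_add(&next_offset, (long long)CHUNK);
            ssize_t n = pread(fd, aligned_buf, CHUNK, (off_t)off);
            if (n <= 0)
                break;                   /* EOF or error ends this worker */
            /* ... hand the buffer to the consumer here ... */
        }
        return NULL;
    }

This only controls the order in which pread() calls are issued; once two
threads are inside the kernel simultaneously, ordering is back in the
elevator's hands, which is the residual gap AIO is claimed to close.
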
From: Marcelo T. <mto...@re...> - 2008-04-22 15:16:49
On Tue, Apr 22, 2008 at 05:32:45PM +0300, Avi Kivity wrote:
> Anthony Liguori wrote:
> > This patch changes virtio devices to be multi-function devices whenever
> > possible.  This increases the number of virtio devices we can support
> > now by a factor of 8.
> >
> > With this patch, I've been able to launch a guest with either 220 disks
> > or 220 network adapters.
>
> Does this play well with hotplug?  Perhaps we need to allocate a new
> device on hotplug.
>
> (certainly if we have a device with one function, which then gets
> converted to a multifunction device)

Would have to change the hotplug code to handle functions... It sounds
less hacky to just extend the PCI slots instead of (ab)using multiple
functions per-slot.

From: Anthony L. <ali...@us...> - 2008-04-22 15:16:45
Ryan Harper wrote:
> * Anthony Liguori <ali...@us...> [2008-04-22 09:16]:
>> This patch changes virtio devices to be multi-function devices whenever
>> possible.  This increases the number of virtio devices we can support
>> now by a factor of 8.
>>
>> With this patch, I've been able to launch a guest with either 220 disks
>> or 220 network adapters.
>
> Have you confirmed that the network devices show up?  I was playing
> around with some of the limits last night and while it is easy to get
> QEMU to create the adapters, so far I've only had a guest see 29 pci
> nics (e1000).

Yup, I had an eth219

Regards,

Anthony Liguori

From: Andrea A. <an...@qu...> - 2008-04-22 15:15:52
On Tue, Apr 22, 2008 at 04:56:10PM +0200, Eric Dumazet wrote:
> Andrea Arcangeli wrote:
>> +
>> +static int mm_lock_cmp(const void *a, const void *b)
>> +{
>> +	cond_resched();
>> +	if ((unsigned long)*(spinlock_t **)a <
>> +	    (unsigned long)*(spinlock_t **)b)
>> +		return -1;
>> +	else if (a == b)
>> +		return 0;
>> +	else
>> +		return 1;
>> +}
>> +
>
> This compare function looks unusual...
> It should work, but sort() could be faster if the
> if (a == b) test had a chance to be true eventually...

Hmm, are you saying my mm_lock_cmp won't return 0 if a==b?

> static int mm_lock_cmp(const void *a, const void *b)
> {
> 	unsigned long la = (unsigned long)*(spinlock_t **)a;
> 	unsigned long lb = (unsigned long)*(spinlock_t **)b;
>
> 	cond_resched();
> 	if (la < lb)
> 		return -1;
> 	if (la > lb)
> 		return 1;
> 	return 0;
> }

If your intent is to use the assumption that there are going to be few
equal entries, you should have used likely(la > lb) to signal it's
rarely going to return zero, or gcc is likely free to do whatever it
wants with the above.  Overall that function is such a slow path that
this is going to be lost in the noise.  My suggestion would be to defer
micro-optimizations like this until after 1/12 is applied to mainline.

Thanks!

From: Ryan H. <ry...@us...> - 2008-04-22 15:15:35
* Anthony Liguori <ali...@us...> [2008-04-22 09:16]:
> This patch changes virtio devices to be multi-function devices whenever
> possible.  This increases the number of virtio devices we can support
> now by a factor of 8.
>
> With this patch, I've been able to launch a guest with either 220 disks
> or 220 network adapters.

Have you confirmed that the network devices show up?  I was playing
around with some of the limits last night and while it is easy to get
QEMU to create the adapters, so far I've only had a guest see 29 pci
nics (e1000).

--
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253   T/L: 678-9253
ry...@us...

From: Jamie L. <ja...@sh...> - 2008-04-22 15:12:30
Anthony Liguori wrote:
> > Perhaps.  This raises another point about AIO vs. threads:
> >
> > If I submit sequential O_DIRECT reads with aio_read(), will they enter
> > the device read queue in the same order, and reach the disk in that
> > order (allowing for reordering when worthwhile by the elevator)?
>
> There's no guarantee that any sort of order will be preserved by AIO
> requests.  The same is true with writes.  This is what fdsync is for, to
> guarantee ordering.

You misunderstand.  I'm not talking about guarantees, I'm talking about
expectations for the performance effect.

Basically, to do performant streaming reads with O_DIRECT you need two
things:

1. Overlap at least 2 requests, so the device is kept busy.

2. Requests sent to the disk in a good order, which is usually (but not
   always) sequential offset order.

The kernel does this itself with buffered reads, doing readahead.  It
works very well, unless you have other problems caused by readahead.

With O_DIRECT, an application has to do the equivalent of readahead
itself to get performant streaming.  If the app uses two threads calling
pread(), it's hard to ensure the kernel even _sees_ the first two calls
in sequential offset order.  You spawn two threads, and then both
threads call pread() with non-deterministic scheduling.  The problem
starts before even entering the kernel.

Then, depending on I/O scheduling in the kernel, it might send the less
good pread() to the disk immediately, then later a backward head seek
and the other one.  The elevator cannot fix this: it doesn't have enough
information, unless it adds artificial delays.  But artificial delays
may harm too; it's not optimal.

After that, the two threads tend to call pread() in the best order
provided there are no scheduling conflicts, but are easily disrupted by
other tasks, especially on SMP (one reading thread per CPU, so when one
of them is descheduled, the other continues and issues a request in the
'wrong' order.)

With AIO, even though you can't be sure what the kernel does, you can be
sure the kernel receives aio_read() calls in the exact order which is
most likely to perform well.  Application knowledge of its access
pattern is passed along better.

As I've said, I saw a man page which described why this makes AIO
superior to using threads for reading tapes on that OS.  So it's not a
completely spurious point.

This has nothing to do with guarantees.

-- Jamie

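A bare-bones sketch of the kernel-AIO submission path under discussion,
using Linux libaio's io_submit() rather than glibc's thread-based POSIX
AIO.  Error handling is omitted and the queue depth of two is only
illustrative; the point is that both O_DIRECT reads are handed to the
kernel in a single call, already in offset order.

    #define _GNU_SOURCE            /* for O_DIRECT */
    #include <libaio.h>            /* link with -laio */
    #include <fcntl.h>
    #include <stdlib.h>

    #define CHUNK (1 << 20)

    int main(int argc, char **argv)
    {
        int fd = open(argv[1], O_RDONLY | O_DIRECT);
        io_context_t ctx = 0;
        struct iocb cb[2], *cbs[2] = { &cb[0], &cb[1] };
        struct io_event ev[2];
        void *buf[2];

        (void)argc;
        io_setup(8, &ctx);
        for (int i = 0; i < 2; i++) {
            posix_memalign(&buf[i], 4096, CHUNK);      /* O_DIRECT alignment */
            io_prep_pread(&cb[i], fd, buf[i], CHUNK, (long long)i * CHUNK);
        }
        io_submit(ctx, 2, cbs);    /* kernel sees both reads, in offset order */
        io_getevents(ctx, 2, 2, ev, NULL);
        io_destroy(ctx);
        return 0;
    }

A real streaming loop would keep refilling completed slots with the next
sequential offset, which preserves the same property: the submission
order is decided by one thread, not by scheduler luck.
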
From: Luca T. <kro...@gm...> - 2008-04-22 15:12:23
On Tue, Apr 22, 2008 at 4:15 PM, Anthony Liguori <ali...@us...> wrote:
> This patch changes virtio devices to be multi-function devices whenever
> possible.  This increases the number of virtio devices we can support now by
> a factor of 8.
[...]
> diff --git a/qemu/hw/virtio.c b/qemu/hw/virtio.c
> index 9100bb1..9ea14d3 100644
> --- a/qemu/hw/virtio.c
> +++ b/qemu/hw/virtio.c
> @@ -405,9 +405,18 @@ VirtIODevice *virtio_init_pci(PCIBus *bus, const char *name,
>      PCIDevice *pci_dev;
>      uint8_t *config;
>      uint32_t size;
> +    static int devfn = 7;
> +
> +    if ((devfn % 8) == 7)
> +        devfn = -1;
> +    else
> +        devfn++;

This code looks strange... devfn should be passed to virtio_init_pci by
the virtio-{net,blk} init functions, no?

Luca

From: Avi K. <av...@qu...> - 2008-04-22 15:06:02
Anthony Liguori wrote:
>> If I submit sequential O_DIRECT reads with aio_read(), will they enter
>> the device read queue in the same order, and reach the disk in that
>> order (allowing for reordering when worthwhile by the elevator)?
>
> There's no guarantee that any sort of order will be preserved by AIO
> requests.  The same is true with writes.  This is what fdsync is for,
> to guarantee ordering.

I believe he'd like a hint to get good scheduling, not a guarantee.
With a thread pool, if the threads are scheduled out of order, so are
your requests.

If the elevator doesn't plug the queue, the first few requests may not
be optimally sorted.

--
error compiling committee.c: too many arguments to function

From: Anthony L. <ali...@us...> - 2008-04-22 15:05:30
Nguyen Anh Quynh wrote:
> Hi,
>
> This should be submitted to upstream (but not to kvm-devel list), but
> this is only the test code that I want to quickly send out for
> comments.  In case it looks OK, I will send it to upstream later.
>
> Inspired by extboot and conversations with Anthony and HPA, this
> linuxboot option ROM is a simple option ROM that intercepts int19 in
> order to execute linux setup code.  This approach eliminates the need
> to manipulate the boot sector for this purpose.
>
> To test it, just load linux kernel with your KVM/QEMU image using
> -kernel option in normal way.
>
> I succesfully compiled and tested it with kvm-66 on Ubuntu 7.10, guest
> Ubuntu 8.04.

For the next rounds, could you actually rebase against upstream QEMU and
submit to qemu-devel?  One of Paul Brook's objections to extboot had
historically been that it wasn't easily sharable with other
architectures.  With a C version, it seems more reasonable now to do
that.

Make sure you remove all the old linux boot code too within QEMU along
with the -hda checks.

Regards,

Anthony Liguori

> Thanks,
> Quynh
>
> # diffstat linuxboot1.diff
>  Makefile             |   13 ++++-
>  linuxboot/Makefile   |   40 +++++++++++++++
>  linuxboot/boot.S     |   54 +++++++++++++++++++++
>  linuxboot/farvar.h   |  130 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  linuxboot/rom.c      |  104 ++++++++++++++++++++++++++++++++++++++++
>  linuxboot/signrom    |binary
>  linuxboot/signrom.c  |  128 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  linuxboot/util.h     |   69 +++++++++++++++++++++++++++
>  qemu/Makefile        |    3 -
>  qemu/Makefile.target |    2
>  qemu/hw/linuxboot.c  |   39 +++++++++++++++
>  qemu/hw/pc.c         |   22 +++++++-
>  qemu/hw/pc.h         |    5 +
>  13 files changed, 600 insertions(+), 9 deletions(-)

From: Avi K. <av...@qu...> - 2008-04-22 15:03:48
Jamie Lokier wrote:
> Avi Kivity wrote:
>>> And video streaming on some embedded devices with no MMU!  (Due to the
>>> page cache heuristics working poorly with no MMU, sustained reliable
>>> streaming is managed with O_DIRECT and the app managing cache itself
>>> (like a database), and that needs AIO to keep the request queue busy.
>>> At least, that's the theory.)
>>
>> Could use threads as well, no?
>
> Perhaps.  This raises another point about AIO vs. threads:
>
> If I submit sequential O_DIRECT reads with aio_read(), will they enter
> the device read queue in the same order, and reach the disk in that
> order (allowing for reordering when worthwhile by the elevator)?

Yes, unless the implementation in the kernel (or glibc) is threaded.

> With threads this isn't guaranteed and scheduling makes it quite
> likely to issue the parallel synchronous reads out of order, and for
> them to reach the disk out of order because the elevator doesn't see
> them simultaneously.

If the disk is busy, it doesn't matter.  The requests will queue and the
elevator will sort them out.  So it's just the first few requests that
may get to disk out of order.

> With AIO (non-Glibc! (and non-kthreads)) it might be better at
> keeping the intended issue order, I'm not sure.
>
> It is highly desirable: O_DIRECT streaming performance depends on
> avoiding seeks (no reordering) and on keeping the request queue
> non-empty (no gap).
>
> I read a man page for some other unix, describing AIO as better than
> threaded parallel reads for reading tape drives because of this (tape
> seeks are very expensive).  But the rest of the man page didn't say
> anything more.  Unfortunately I don't remember where I read it.  I
> have no idea whether AIO submission order is nearly always preserved
> in general, or expected to be.

I haven't considered tape, but this is a good point indeed.  I expect it
doesn't make much of a difference for a loaded disk.

>> It's me at fault here.  I just assumed that because it's easy to do aio
>> in a thread pool efficiently, that's what glibc does.
>>
>> Unfortunately the code does some ridiculous things like not service
>> multiple requests on a single fd in parallel.  I see absolutely no
>> reason for it (the code says "fight for resources").
>
> Ouch.  Perhaps that relates to my thought above, about multiple
> requests to the same file causing seek storms when thread scheduling
> is unlucky?

My first thought on seeing this is that it relates to a deficiency on
older kernels servicing multiple requests on a single fd (i.e. a
per-file lock).  I don't know if such a deficiency ever existed, though.

>> It could and should.  It probably doesn't.
>>
>> A simple thread pool implementation could come within 10% of Linux aio
>> for most workloads.  It will never be "exactly", but for small numbers
>> of disks, close enough.
>
> I would wait for benchmark results for I/O patterns like sequential
> reading and writing, because of potential for seeks caused by request
> reordering, before being confident of that.

I did have measurements (and a test rig) at a previous job (where I did
a lot of I/O work); IIRC the performance of a tuned thread pool was not
far behind aio, both for seeks and sequential.  It was a while back
though.

--
error compiling committee.c: too many arguments to function

From: Laurent V. <Lau...@bu...> - 2008-04-22 15:02:56
On Tuesday 22 April 2008 at 08:50 -0500, Anthony Liguori wrote:
> Nguyen Anh Quynh wrote:
> > Hi,
> >
> > This should be submitted to upstream (but not to kvm-devel list), but
> > this is only the test code that I want to quickly send out for
> > comments.  In case it looks OK, I will send it to upstream later.
> >
> > Inspired by extboot and conversations with Anthony and HPA, this
> > linuxboot option ROM is a simple option ROM that intercepts int19 in
> > order to execute linux setup code.  This approach eliminates the need
> > to manipulate the boot sector for this purpose.
> >
> > To test it, just load linux kernel with your KVM/QEMU image using
> > -kernel option in normal way.
> >
> > I succesfully compiled and tested it with kvm-66 on Ubuntu 7.10, guest
> > Ubuntu 8.04.
>
> For the next rounds, could you actually rebase against upstream QEMU and
> submit to qemu-devel?  One of Paul Brook's objections to extboot had
> historically been that it wasn't easily sharable with other
> architectures.  With a C version, it seems more reasonable now to do that.

Moreover, add a binary version of the ROM to the pc-bios directory: it
avoids needing a cross-compiler to build the ROM on non-x86
architectures.

Regards,
Laurent

> Make sure you remove all the old linux boot code too within QEMU along
> with the -hda checks.
>
> Regards,
>
> Anthony Liguori
>
> > Thanks,
> > Quynh
> >
> > # diffstat linuxboot1.diff
> >  Makefile             |   13 ++++-
> >  linuxboot/Makefile   |   40 +++++++++++++++
> >  linuxboot/boot.S     |   54 +++++++++++++++++++++
> >  linuxboot/farvar.h   |  130 +++++++++++++++++++++++++++++++++++++++++++++++++++
> >  linuxboot/rom.c      |  104 ++++++++++++++++++++++++++++++++++++++++
> >  linuxboot/signrom    |binary
> >  linuxboot/signrom.c  |  128 ++++++++++++++++++++++++++++++++++++++++++++++++++
> >  linuxboot/util.h     |   69 +++++++++++++++++++++++++++
> >  qemu/Makefile        |    3 -
> >  qemu/Makefile.target |    2
> >  qemu/hw/linuxboot.c  |   39 +++++++++++++++
> >  qemu/hw/pc.c         |   22 +++++++-
> >  qemu/hw/pc.h         |    5 +
> >  13 files changed, 600 insertions(+), 9 deletions(-)

--
------------- Lau...@bu... ---------------
"The best way to predict the future is to invent it." - Alan Kay

From: Anthony L. <an...@co...> - 2008-04-22 14:53:28
Jamie Lokier wrote:
> Avi Kivity wrote:
>>> And video streaming on some embedded devices with no MMU!  (Due to the
>>> page cache heuristics working poorly with no MMU, sustained reliable
>>> streaming is managed with O_DIRECT and the app managing cache itself
>>> (like a database), and that needs AIO to keep the request queue busy.
>>> At least, that's the theory.)
>>
>> Could use threads as well, no?
>
> Perhaps.  This raises another point about AIO vs. threads:
>
> If I submit sequential O_DIRECT reads with aio_read(), will they enter
> the device read queue in the same order, and reach the disk in that
> order (allowing for reordering when worthwhile by the elevator)?

There's no guarantee that any sort of order will be preserved by AIO
requests.  The same is true with writes.  This is what fdsync is for,
to guarantee ordering.

Regards,

Anthony Liguori

From: Avi K. <av...@qu...> - 2008-04-22 14:52:03
Anthony Liguori wrote:
> Avi Kivity wrote:
>> Anthony Liguori wrote:
>>> This patch changes virtio devices to be multi-function devices whenever
>>> possible.  This increases the number of virtio devices we can
>>> support now by a factor of 8.
>>>
>>> With this patch, I've been able to launch a guest with either 220
>>> disks or 220 network adapters.
>>
>> Does this play well with hotplug?  Perhaps we need to allocate a new
>> device on hotplug.
>
> Probably not.  I imagine you can only hotplug devices, not individual
> functions?

It sounds reasonable to expect so.  ACPI has objects for devices, not
functions (IIRC).

Maybe require explicit device/function assignment on the command line?
It will be managed anyway.

--
error compiling committee.c: too many arguments to function

From: Anthony L. <an...@co...> - 2008-04-22 14:46:38
Avi Kivity wrote:
> Anthony Liguori wrote:
>> This patch changes virtio devices to be multi-function devices whenever
>> possible.  This increases the number of virtio devices we can support
>> now by a factor of 8.
>>
>> With this patch, I've been able to launch a guest with either 220 disks
>> or 220 network adapters.
>
> Does this play well with hotplug?  Perhaps we need to allocate a new
> device on hotplug.
>
> (certainly if we have a device with one function, which then gets
> converted to a multifunction device)

Probably not.  I imagine you can only hotplug devices, not individual
functions?

Regards,

Anthony Liguori

From: Avi K. <av...@qu...> - 2008-04-22 14:43:09
Marcelo Tosatti wrote:
> Otherwise multiple guests use the same variable and boom.
>
> Also use kvm_vcpu_kick() to make sure that if a timer triggers on
> a different CPU the event won't be missed.

Applied, thanks.

--
error compiling committee.c: too many arguments to function