From: Jan K. <jan...@si...> - 2008-05-09 08:13:41
Marcelo Tosatti wrote:
> Hi Jan,
>
> On Thu, May 08, 2008 at 10:29:32AM +0200, Jan Kiszka wrote:
>> Resetting guests used to be racy, deadlock-prone, or simply broken (for
>> SMP). This patch fixes the issues - at least for me on x86 (tested on
>> Intel SMP host, UP and SMP guest, in-kernel and user space irqchip,
>> guest- and monitor-issued resets). Note that ia64 and powerpc may need
>> to look into the SMP thing as well (=> kvm_arch_cpu_reset).
>>
>> At this chance, the patch also cleans up some unneeded reset fragments.
>>
>> Signed-off-by: Jan Kiszka <jan...@si...>
>> ---
>> @@ -317,11 +317,18 @@ void qemu_kvm_system_reset_request(void)
>>  {
>>      int i;
>>
>> +    pause_all_threads();
>> +
>> +    qemu_system_reset();
>> +
>> +    for (i = 0; i < smp_cpus; ++i)
>> +        kvm_arch_cpu_reset(vcpu_info[i].env);
>> +
>>      for (i = 0; i < smp_cpus; ++i) {
>> -        vcpu_info[i].reload_regs = 1;
>> +        vcpu_info[i].stop = 0;
>> +        vcpu_info[i].stopped = 0;
>>          pthread_kill(vcpu_info[i].thread, SIG_IPI);
>>      }
>> -    qemu_system_reset();
>
> Why don't you signal the IO thread to pause all vcpus and place their
> registers and "run state" in the proper condition if the reset request
> comes from the guest? It should simplify things a lot (and avoid any
> changes to vl.c).
>
> After signalling, the vcpu should stop instead of returning to guest
> mode.

Hmm, I need to think a bit more about it, as I don't see the benefit yet
(code suggestions are welcome in the meantime :)). The changes to vl.c
are actually cleanups, as pause_all_threads() is now context-agnostic
and we no longer need to go through the qemu way of raising a reset.
This new property of pause_all_threads() is something we should desire
for simplicity and robustness reasons anyway (e.g. to simplify guest
debugging later on). And the way qemu_kvm_system_reset_request() is
implemented would not change; we need this serialization to avoid races
between IO and VCPU threads.

Jan

--
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
From: Avi K. <av...@qu...> - 2008-05-09 07:47:01
Anthony Liguori wrote:
> Avi Kivity wrote:
>> Anthony Liguori wrote:
>>> Aurelien Jarno wrote:
>>>> On Wed, May 07, 2008 at 04:40:58PM -0500, Anthony Liguori wrote:
>>>>> The current logic of the can_receive handler is to allow packets
>>>>> whenever the receiver is disabled or when there are descriptors
>>>>> available in the ring.
>>>>>
>>>>> I think the logic ought to be to allow packets whenever the
>>>>> receiver is enabled and there are descriptors available in the
>>>>> ring.
>>>>
>>>> The current behaviour is actually correct, this is the way QEMU
>>>> works: when the card is stopped, it should always accept packets,
>>>> and then discard them.
>>>
>>> The previous patches in my virtio series change that behavior.
>>> Before delivering a packet to a VLAN, it checks to see if any of
>>> the VLAN clients are able to receive a packet.
>>>
>>> This is very important for achieving good performance with tap.
>>> With virtio-net, we were dropping a ton of packets with tap because
>>> there weren't descriptors available on the RX ring.
>>>
>>> I plan to submit that behavioral change to QEMU upstream along with
>>> the virtio drivers. I'm still optimizing phys_page_find() though.
>>> The performance impact of switching the ring manipulation to using
>>> the stl_phys accessors is unacceptable for KVM.
>>
>> I think this indicates a virtio tuning problem. The rx queue should
>> never be empty in normal operation, ever. Signalling on rx queue
>> empty invites the overrun, since it takes time for the interrupt to
>> be delivered and for the guest to refill the queue.
>
> Well, to start with, the e1000_can_receive handler is just plain
> wrong. The logic is broken. This hasn't caused an issue because the
> code isn't used.
>
> That said, it is possible to tune virtio to get back the performance
> lost to dropping packets. We lose about 20% of iperf from dropped
> packets ATM. If we bump the ring size up to 512, we get it back. If
> we bump it to 1024, we start losing again. It's much less reliable
> than doing flow control, though.

What about setting the interrupt to fire at the ring midpoint?

>> Long term we need to do this dynamically, but we can start with
>> signalling on rx queue half empty (or half full if you're an
>> optimist). We further need to tune the queue size so that this
>> results in an acceptable number of interrupts:
>
> Part of the trouble with the embedded scatter-gather list is that
> it's not at all clear what "half empty" is unless you count all of
> the descriptors. There may be one giant descriptor, or many small
> ones.

For networking, counting whole chains is actually better, since we want
to measure bandwidth, not descriptor count. Each ring entry will
usually be between 1500 and 4096 bytes, so an entry count is an
approximation of the time it will take to send that data. With small
packets this breaks down, but those are not the typical high-bandwidth
workloads. Block is different, since there we want to count seeks, not
data.

>> 1 Gbps = 83K pps = 83 entries per half queue @ 1 KHz interrupt rate
>>
>> So we need 166 entries on the queue to keep a moderate interrupt
>> rate, if we change it to signal at the halfway mark.
>>
>> Note that flow control still makes sense since it allows us to
>> buffer some packets if the guest is scheduled out. But we can't use
>> it as the primary mechanism since it won't exist with multiqueue
>> NICs (where the virtio descriptors are fed to the driver).
>
> Yes, we also need a better mechanism with vringfd(). I've been
> thinking about how to structure this API within QEMU but it's still
> not clear to me. Flow control seems to make sense, though, with the
> given API.

I don't disagree with flow control; I just don't think it's enough.

--
Do not meddle in the internals of kernels, for they are subtle and
quick to panic.
From: Yang, S. <she...@in...> - 2008-05-09 07:43:31
On Sunday 13 April 2008 17:28:22 Avi Kivity wrote:
> Marcelo Tosatti wrote:
>> On Fri, Apr 11, 2008 at 03:12:41PM +0300, Avi Kivity wrote:
>>> This breaks ia64 (and shouldn't s390 use this too?)
>>>
>>>>  * We will block until either an interrupt or a signal wakes us up
>>>>  */
>>>>  while (!kvm_cpu_has_interrupt(vcpu)
>>>> +        && !kvm_cpu_has_pending_timer(vcpu)
>>>
>>> I guess the fix is to stub this out for the other archs.
>>
>> Agreed. How's this.
>
> Better :); applied.

Hi, Marcelo

This patch runs into trouble when the OS doesn't use the PIT/LAPIC
timer and doesn't disable them. The pending counters then keep
increasing, but the HLT emulation can't be executed. This resulted in
a huge number (above 220,000 per second) of halt exits for an idle
Windows XP guest that uses the RTC as its clocksource (and keeps the
PIT enabled after the BIOS did, just masking the pin); the CPU
utilization of the QEmu process was about 100%.

The following patch uses another way to fix the issue, though not very
formal.

From 4d08ef3173084a6f0b7b76a0727e04ff42b614ba Mon Sep 17 00:00:00 2001
From: Sheng Yang <she...@in...>
Date: Fri, 9 May 2008 15:36:27 +0800
Subject: [PATCH] KVM: Fix 100% CPU utilization when emulating HLT in some OSes

Signed-off-by: Sheng Yang <she...@in...>
---
 arch/x86/kvm/i8254.c       |    2 ++
 arch/x86/kvm/lapic.c       |    2 ++
 include/asm-x86/kvm_host.h |    2 ++
 virt/kvm/kvm_main.c        |    2 +-
 4 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
index fba0e4e..b2b9eb7 100644
--- a/arch/x86/kvm/i8254.c
+++ b/arch/x86/kvm/i8254.c
@@ -199,6 +199,7 @@ static int __pit_timer_fn(struct kvm_kpit_state *ps)
 	struct kvm_kpit_timer *pt = &ps->pit_timer;
 
 	atomic_inc(&pt->pending);
+	vcpu0->arch.timer_pending_updated = 1;
 	smp_mb__after_atomic_inc();
 	/* FIXME: handle case where the guest is in guest mode */
 	if (vcpu0 && waitqueue_active(&vcpu0->wq)) {
@@ -577,6 +578,7 @@ void kvm_inject_pit_timer_irqs(struct kvm_vcpu *vcpu)
 	struct kvm *kvm = vcpu->kvm;
 	struct kvm_kpit_state *ps;
 
+	kvm->vcpus[0]->arch.timer_pending_updated = 0;
 	if (vcpu && pit) {
 		ps = &pit->pit_state;
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 7652f88..b919f3f 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -944,6 +944,7 @@ static int __apic_timer_fn(struct kvm_lapic *apic)
 	wait_queue_head_t *q = &apic->vcpu->wq;
 
 	atomic_inc(&apic->timer.pending);
+	apic->vcpu->arch.timer_pending_updated = 1;
 	if (waitqueue_active(q)) {
 		apic->vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
 		wake_up_interruptible(q);
@@ -1067,6 +1068,7 @@ void kvm_inject_apic_timer_irqs(struct kvm_vcpu *vcpu)
 {
 	struct kvm_lapic *apic = vcpu->arch.apic;
 
+	vcpu->arch.timer_pending_updated = 0;
 	if (apic && apic_lvt_enabled(apic, APIC_LVTT) &&
 	    atomic_read(&apic->timer.pending) > 0) {
 		if (__inject_apic_timer_irq(apic))
diff --git a/include/asm-x86/kvm_host.h b/include/asm-x86/kvm_host.h
index 1d8cd01..5eded7b 100644
--- a/include/asm-x86/kvm_host.h
+++ b/include/asm-x86/kvm_host.h
@@ -285,6 +285,8 @@ struct kvm_vcpu_arch {
 	struct kvm_vcpu_time_info hv_clock;
 	unsigned int time_offset;
 	struct page *time_page;
+
+	bool timer_pending_updated;
 };
 
 struct kvm_mem_alias {
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 0846d3d..ff635e3 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -791,7 +791,7 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 	 * We will block until either an interrupt or a signal wakes us up
 	 */
 	while (!kvm_cpu_has_interrupt(vcpu)
-	       && !kvm_cpu_has_pending_timer(vcpu)
+	       && !vcpu->arch.timer_pending_updated
 	       && !signal_pending(current)
 	       && !kvm_arch_vcpu_runnable(vcpu)) {
 		set_current_state(TASK_INTERRUPTIBLE);
--
1.5.5
From: Avi K. <av...@qu...> - 2008-05-09 07:40:47
Marcelo Tosatti wrote:
> There's still a race in kvm_vcpu_block(): a wake_up_interruptible()
> call can happen before the task state is set to TASK_INTERRUPTIBLE:
>
> CPU0                                    CPU1
>
> kvm_vcpu_block
>
> add_wait_queue
>
> kvm_cpu_has_interrupt = 0
>                                         set interrupt
>                                         if (waitqueue_active())
>                                                 wake_up_interruptible()
>
> kvm_cpu_has_pending_timer
> kvm_arch_vcpu_runnable
> signal_pending
>
> set_current_state(TASK_INTERRUPTIBLE)
> schedule()
>
> Can be fixed by using prepare_to_wait(), which sets the task state
> before testing for the wait condition.
>
> Unfortunately it can't use wait_event_interruptible() due to
> vcpu_put/vcpu_load.

schedule() will call vcpu_put()/vcpu_load() for us through preempt
notifiers. I feel a little uneasy about it, but have no concrete reason
not to rely on it.

--
Do not meddle in the internals of kernels, for they are subtle and
quick to panic.
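
For context, a minimal sketch of the preempt-notifier hookup Avi is
referring to. The struct preempt_ops callbacks and
preempt_notifier_register() are the real kernel API of that era (and
kvm_main.c registers notifiers much like this), but the helper names
here are made up for illustration:

    #include <linux/kvm_host.h>
    #include <linux/preempt.h>

    /* Called when the vcpu task is scheduled back in: reload vcpu state. */
    static void vcpu_sched_in(struct preempt_notifier *pn, int cpu)
    {
            struct kvm_vcpu *vcpu = container_of(pn, struct kvm_vcpu,
                                                 preempt_notifier);
            kvm_arch_vcpu_load(vcpu, cpu);
    }

    /* Called when the vcpu task is about to be scheduled out: save state. */
    static void vcpu_sched_out(struct preempt_notifier *pn,
                               struct task_struct *next)
    {
            struct kvm_vcpu *vcpu = container_of(pn, struct kvm_vcpu,
                                                 preempt_notifier);
            kvm_arch_vcpu_put(vcpu);
    }

    static struct preempt_ops vcpu_preempt_ops = {
            .sched_in  = vcpu_sched_in,
            .sched_out = vcpu_sched_out,
    };

    /* Hypothetical setup helper, run once per vcpu thread. */
    static void vcpu_attach_preempt_notifier(struct kvm_vcpu *vcpu)
    {
            preempt_notifier_init(&vcpu->preempt_notifier, &vcpu_preempt_ops);
            preempt_notifier_register(&vcpu->preempt_notifier);
    }

With this in place, any schedule() inside kvm_vcpu_block() implicitly
saves and restores the vcpu state, which is why the explicit
vcpu_put()/vcpu_load() pair is arguably redundant.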
From: Rusty R. <ru...@ru...> - 2008-05-09 01:10:09
On Thursday 08 May 2008 23:20:38 Christian Borntraeger wrote:
> Changing stop_machine to yield the cpu to the hypervisor when yielding
> inside the guest fixed the problem for me. While I am not completely
> happy with this patch, I think it causes no harm and it really
> improves the situation for me.

Yes, this change is harmless. I'm reworking (ie. rewriting) stop_machine
at the moment to simplify it, and as a side effect it won't be yielding.
(The yield is almost useless, since there's nothing at the same priority
as this thread anyway.)

I've included this patch for my next push to Linus.

Thanks,
Rusty.
From: Marcelo T. <mto...@re...> - 2008-05-08 23:32:08
Hi Jan,

On Thu, May 08, 2008 at 10:29:32AM +0200, Jan Kiszka wrote:
> Resetting guests used to be racy, deadlock-prone, or simply broken (for
> SMP). This patch fixes the issues - at least for me on x86 (tested on
> Intel SMP host, UP and SMP guest, in-kernel and user space irqchip,
> guest- and monitor-issued resets). Note that ia64 and powerpc may need
> to look into the SMP thing as well (=> kvm_arch_cpu_reset).
>
> At this chance, the patch also cleans up some unneeded reset fragments.
>
> Signed-off-by: Jan Kiszka <jan...@si...>
> ---
> @@ -317,11 +317,18 @@ void qemu_kvm_system_reset_request(void)
>  {
>      int i;
>
> +    pause_all_threads();
> +
> +    qemu_system_reset();
> +
> +    for (i = 0; i < smp_cpus; ++i)
> +        kvm_arch_cpu_reset(vcpu_info[i].env);
> +
>      for (i = 0; i < smp_cpus; ++i) {
> -        vcpu_info[i].reload_regs = 1;
> +        vcpu_info[i].stop = 0;
> +        vcpu_info[i].stopped = 0;
>          pthread_kill(vcpu_info[i].thread, SIG_IPI);
>      }
> -    qemu_system_reset();

Why don't you signal the IO thread to pause all vcpus and place their
registers and "run state" in the proper condition if the reset request
comes from the guest? It should simplify things a lot (and avoid any
changes to vl.c).

After signalling, the vcpu should stop instead of returning to guest
mode.
From: Marcelo T. <mto...@re...> - 2008-05-08 23:07:30
On Wed, May 07, 2008 at 08:45:12PM +0200, Gerd Hoffmann wrote:
> Ok folks, here is the band-aid fix for testing, from the odd bugs
> department. Goes on top of the four patches of this series. A real,
> clean solution is TBD. Tomorrow I hope (some urgent private problems
> are in the queue too ...).
>
> Problem is the per-cpu area for cpu 0 has two locations in memory, one
> before and one after pda initialization. kvmclock registers the first
> due to being initialized quite early, and the paravirt clock for cpu 0
> stops seeing updates once the pda setup is done. Which makes the TSC
> effectively the base for timekeeping (instead of using the TSC for
> millisecond delta adjustments only). Secondary CPUs work as intended.
>
> This obviously screws up timekeeping on SMP guests, especially on
> hosts with unstable TSC.
>
> happy testing,

Gerd,

SMP guests can boot and seem stable. Thanks!
From: Marcelo T. <mto...@re...> - 2008-05-08 22:44:02
There's still a race in kvm_vcpu_block(): a wake_up_interruptible()
call can happen before the task state is set to TASK_INTERRUPTIBLE:

CPU0                                    CPU1

kvm_vcpu_block

add_wait_queue

kvm_cpu_has_interrupt = 0
                                        set interrupt
                                        if (waitqueue_active())
                                                wake_up_interruptible()

kvm_cpu_has_pending_timer
kvm_arch_vcpu_runnable
signal_pending

set_current_state(TASK_INTERRUPTIBLE)
schedule()

Can be fixed by using prepare_to_wait(), which sets the task state
before testing for the wait condition.

Unfortunately it can't use wait_event_interruptible() due to
vcpu_put/vcpu_load.

Signed-off-by: Marcelo Tosatti <mto...@re...>

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 0846d3d..fcc08c2 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -783,25 +783,26 @@ void mark_page_dirty(struct kvm *kvm, gfn_t gfn)
  */
 void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 {
-	DECLARE_WAITQUEUE(wait, current);
-
-	add_wait_queue(&vcpu->wq, &wait);
-
-	/*
-	 * We will block until either an interrupt or a signal wakes us up
-	 */
-	while (!kvm_cpu_has_interrupt(vcpu)
-	       && !kvm_cpu_has_pending_timer(vcpu)
-	       && !signal_pending(current)
-	       && !kvm_arch_vcpu_runnable(vcpu)) {
-		set_current_state(TASK_INTERRUPTIBLE);
+	DEFINE_WAIT(wait);
+
+	for (;;) {
+		prepare_to_wait(&vcpu->wq, &wait, TASK_INTERRUPTIBLE);
+
+		if (kvm_cpu_has_interrupt(vcpu))
+			break;
+		if (kvm_cpu_has_pending_timer(vcpu))
+			break;
+		if (kvm_arch_vcpu_runnable(vcpu))
+			break;
+		if (signal_pending(current))
+			break;
+
 		vcpu_put(vcpu);
 		schedule();
 		vcpu_load(vcpu);
 	}
-
-	__set_current_state(TASK_RUNNING);
-	remove_wait_queue(&vcpu->wq, &wait);
+
+	finish_wait(&vcpu->wq, &wait);
 }
 
 void kvm_resched(struct kvm_vcpu *vcpu)
From: Anthony L. <an...@co...> - 2008-05-08 22:27:16
Dor Laor wrote:
> On Thu, 2008-05-08 at 12:03 +0200, Nicolas Daneau wrote:
>> Hi,
>>
>> I saw there was an active discussion about pci passthrough support
>> in KVM. I'm not a dev, only a sysadmin who needs this support. As I
>> saw on the KVM home page that this feature is planned for 2H2008 in
>> the roadmap, I wondered if you already have a more precise time
>> frame for this release? Xen already has this feature, but otherwise
>> I prefer KVM. As this is a key point for the final decision (we have
>> a Digium TDM11B FXO/FXS pci card for an Asterisk VM), I'm waiting
>> for an update about this feature.
>
> This feature is indeed work in progress. We have floating patches for
> several solutions:
> 1. Using Intel's VT-d - hw iommu support.
>    Patches were sent, needs some more scrubbing. The good thing is
>    that it can live within the kvm module and does not have to wait
>    for kernel inclusion like the two below.

Nope, it uses the existing Linux VT-d support and requires patches to
the existing code. It can't live entirely within the external module.

Regards,

Anthony Liguori

> 2. PV dma - good only for Linux guests. Some locking issues need to
>    be solved + get into the mainline kernel (>= 2.6.27).
> 3. 1-1 mapping of the guest-host addresses.
>    Will work for any guest, no need for special hw, good only for a
>    single guest and not 100% secure.
>    Needs a little more attention; worked in the past privately.
>
> Bottom line, soon you'll have more than one option for pci
> passthrough using kvm. There is no assurance we'll make it exactly
> in Q2. You can help by applying patches and sending results.
> Cheers,
> Dor
>
>> Thanks for your answer
From: Andrea A. <an...@qu...> - 2008-05-08 22:01:13
On Thu, May 08, 2008 at 09:11:33AM -0700, Linus Torvalds wrote:
> Btw, this is an issue only on 32-bit x86, because on 64-bit one we
> already have the padding due to the alignment of the 64-bit pointers
> in the list_head (so there's already empty space there).
>
> On 32-bit, the alignment of list_head is obviously just 32 bits, so
> right now the structure is "perfectly packed" and doesn't have any
> empty space. But that's just because the spinlock is unnecessarily
> big.
>
> (Of course, if anybody really uses NR_CPUS >= 256 on 32-bit x86, then
> the structure really will grow. That's a very odd configuration,
> though, and not one I feel we really need to care about).

I see two ways to implement it:

1) Use #ifdef and make it zero overhead for 64-bit only, without
   playing any non-obvious trick:

    struct anon_vma {
        spinlock_t lock;
    #ifdef CONFIG_MMU_NOTIFIER
        int global_mm_lock:1;
    #endif

    struct address_space {
        spinlock_t private_lock;
    #ifdef CONFIG_MMU_NOTIFIER
        int global_mm_lock:1;
    #endif

2) Add:

    #define AS_GLOBAL_MM_LOCK (__GFP_BITS_SHIFT + 2) /* global_mm_locked */

   and use address_space->flags with bitops.

And as Andrew pointed out to me by PM, for the anon_vma we can use the
LSB of list.next/prev, because the list can't be browsed while the lock
is taken. So taking the lock, then setting the bit, and clearing the
bit before unlocking is safe. The LSB will always read 0, even if the
list is under list_add modification, when the global spinlock isn't
taken. And after taking the anon_vma lock we can switch the LSB from 0
to 1 without races, and the 1 will be protected by the global spinlock.

The above solution is zero cost for 32-bit too, so I prefer it.

So I now agree with you that this is a great idea on how to remove
sort() and vmalloc, and especially vfree, without increasing the VM
footprint. I'll send an update with this for review very shortly, and
I hope this goes in so KVM will be able to swap and do many other
things very well starting in 2.6.26.

Thanks a lot,
Andrea
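
For context, a toy user-space sketch of the trick Andrea describes:
stashing a flag in the LSB of a list pointer, which is always zero for
aligned allocations. All names here are invented for illustration, and
a pthread mutex stands in for the anon_vma spinlock:

    #include <assert.h>
    #include <pthread.h>
    #include <stdint.h>
    #include <stdio.h>

    struct list_head { struct list_head *next, *prev; };

    struct anon_vma_toy {
            pthread_mutex_t lock;   /* stands in for anon_vma->lock */
            struct list_head head;  /* pointer-aligned, so bit 0 is free */
    };

    /* Set/clear the flag only while the lock is held, as the email requires. */
    static void set_global_lock_bit(struct anon_vma_toy *av)
    {
            av->head.next = (struct list_head *)((uintptr_t)av->head.next | 1);
    }

    static void clear_global_lock_bit(struct anon_vma_toy *av)
    {
            av->head.next = (struct list_head *)
                            ((uintptr_t)av->head.next & ~(uintptr_t)1);
    }

    static int global_lock_bit(const struct anon_vma_toy *av)
    {
            return (uintptr_t)av->head.next & 1;
    }

    int main(void)
    {
            struct anon_vma_toy av = { PTHREAD_MUTEX_INITIALIZER };

            av.head.next = av.head.prev = &av.head;  /* empty list */
            assert(global_lock_bit(&av) == 0);  /* aligned pointer: LSB reads 0 */

            pthread_mutex_lock(&av.lock);
            set_global_lock_bit(&av);
            assert(global_lock_bit(&av) == 1);
            clear_global_lock_bit(&av);
            pthread_mutex_unlock(&av.lock);

            puts("LSB flag demo ok");
            return 0;
    }

The point of the trick is exactly what the email states: readers that
never look at the pointer while the lock is held only ever see the LSB
as 0, so the flag costs no extra storage.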
From: Anthony L. <ali...@us...> - 2008-05-08 20:28:17
This patch implements the core of save/restore support for virtio.
It's modelled after how PCI save/restore works.

N.B. This makes savevm/loadvm work, but not live migration. The issue
with live migration is that we're manipulating guest memory without
updating the dirty bitmap correctly. I will submit a patch in the near
future that addresses that problem.

Since v1, I fixed the Signed-off-by line. Sorry about that.

Signed-off-by: Anthony Liguori <ali...@us...>

diff --git a/qemu/hw/virtio.c b/qemu/hw/virtio.c
index a4c9d10..440cc69 100644
--- a/qemu/hw/virtio.c
+++ b/qemu/hw/virtio.c
@@ -420,7 +420,6 @@ VirtQueue *virtio_add_queue(VirtIODevice *vdev, int queue_size,
 
     vdev->vq[i].vring.num = queue_size;
     vdev->vq[i].handle_output = handle_output;
-    vdev->vq[i].index = i;
 
     return &vdev->vq[i];
 }
@@ -436,6 +435,71 @@ void virtio_notify(VirtIODevice *vdev, VirtQueue *vq)
     virtio_update_irq(vdev);
 }
 
+void virtio_save(VirtIODevice *vdev, QEMUFile *f)
+{
+    int i;
+
+    pci_device_save(&vdev->pci_dev, f);
+
+    qemu_put_be32s(f, &vdev->addr);
+    qemu_put_8s(f, &vdev->status);
+    qemu_put_8s(f, &vdev->isr);
+    qemu_put_be16s(f, &vdev->queue_sel);
+    qemu_put_be32s(f, &vdev->features);
+    qemu_put_be32(f, vdev->config_len);
+    qemu_put_buffer(f, vdev->config, vdev->config_len);
+
+    for (i = 0; i < VIRTIO_PCI_QUEUE_MAX; i++) {
+        if (vdev->vq[i].vring.num == 0)
+            break;
+    }
+
+    qemu_put_be32(f, i);
+
+    for (i = 0; i < VIRTIO_PCI_QUEUE_MAX; i++) {
+        if (vdev->vq[i].vring.num == 0)
+            break;
+
+        qemu_put_be32(f, vdev->vq[i].vring.num);
+        qemu_put_be32s(f, &vdev->vq[i].pfn);
+        qemu_put_be16s(f, &vdev->vq[i].last_avail_idx);
+    }
+}
+
+void virtio_load(VirtIODevice *vdev, QEMUFile *f)
+{
+    int num, i;
+
+    pci_device_load(&vdev->pci_dev, f);
+
+    qemu_get_be32s(f, &vdev->addr);
+    qemu_get_8s(f, &vdev->status);
+    qemu_get_8s(f, &vdev->isr);
+    qemu_get_be16s(f, &vdev->queue_sel);
+    qemu_get_be32s(f, &vdev->features);
+    vdev->config_len = qemu_get_be32(f);
+    qemu_get_buffer(f, vdev->config, vdev->config_len);
+
+    num = qemu_get_be32(f);
+
+    for (i = 0; i < num; i++) {
+        vdev->vq[i].vring.num = qemu_get_be32(f);
+        qemu_get_be32s(f, &vdev->vq[i].pfn);
+        qemu_get_be16s(f, &vdev->vq[i].last_avail_idx);
+
+        if (vdev->vq[i].pfn) {
+            size_t size;
+            target_phys_addr_t pa;
+
+            pa = (ram_addr_t)vdev->vq[i].pfn << TARGET_PAGE_BITS;
+            size = virtqueue_size(vdev->vq[i].vring.num);
+            virtqueue_init(&vdev->vq[i], virtio_map_gpa(pa, size));
+        }
+    }
+
+    virtio_update_irq(vdev);
+}
+
 VirtIODevice *virtio_init_pci(PCIBus *bus, const char *name,
                               uint16_t vendor, uint16_t device,
                               uint16_t subvendor, uint16_t subdevice,
diff --git a/qemu/hw/virtio.h b/qemu/hw/virtio.h
index dee97ba..ed8cfd6 100644
--- a/qemu/hw/virtio.h
+++ b/qemu/hw/virtio.h
@@ -87,7 +87,6 @@ struct VirtQueue
     uint32_t pfn;
     uint16_t last_avail_idx;
     void (*handle_output)(VirtIODevice *vdev, VirtQueue *vq);
-    int index;
 };
 
 #define VIRTQUEUE_MAX_SIZE 1024
@@ -108,8 +107,6 @@ struct VirtIODevice
     PCIDevice pci_dev;
     const char *name;
    uint32_t addr;
-    uint16_t vendor;
-    uint16_t device;
     uint8_t status;
     uint8_t isr;
     uint16_t queue_sel;
@@ -140,4 +137,8 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem);
 
 void virtio_notify(VirtIODevice *vdev, VirtQueue *vq);
 
+void virtio_save(VirtIODevice *vdev, QEMUFile *f);
+
+void virtio_load(VirtIODevice *vdev, QEMUFile *f);
+
 #endif
From: Anthony L. <ali...@us...> - 2008-05-08 19:00:56
No additional state needs to be saved.

Signed-off-by: Anthony Liguori <ali...@us...>

diff --git a/qemu/hw/virtio-blk.c b/qemu/hw/virtio-blk.c
index 048285a..148cb75 100644
--- a/qemu/hw/virtio-blk.c
+++ b/qemu/hw/virtio-blk.c
@@ -162,11 +162,30 @@ static uint32_t virtio_blk_get_features(VirtIODevice *vdev)
     return (1 << VIRTIO_BLK_F_SEG_MAX | 1 << VIRTIO_BLK_F_GEOMETRY);
 }
 
+static void virtio_blk_save(QEMUFile *f, void *opaque)
+{
+    VirtIOBlock *s = opaque;
+
+    virtio_save(&s->vdev, f);
+}
+
+static int virtio_blk_load(QEMUFile *f, void *opaque, int version_id)
+{
+    VirtIOBlock *s = opaque;
+
+    if (version_id != 1)
+        return -EINVAL;
+
+    virtio_load(&s->vdev, f);
+
+    return 0;
+}
+
 void *virtio_blk_init(PCIBus *bus, uint16_t vendor, uint16_t device,
                       BlockDriverState *bs)
 {
     VirtIOBlock *s;
     int cylinders, heads, secs;
+    static int virtio_blk_id;
 
     s = (VirtIOBlock *)virtio_init_pci(bus, "virtio-blk", vendor, device,
                                        0, VIRTIO_ID_BLOCK,
@@ -184,5 +203,8 @@ void *virtio_blk_init(PCIBus *bus, uint16_t vendor, uint16_t device,
 
     virtio_add_queue(&s->vdev, 128, virtio_blk_handle_output);
 
+    register_savevm("virtio-blk", virtio_blk_id++, 1,
+                    virtio_blk_save, virtio_blk_load, s);
+
     return s;
 }
From: Anthony L. <ali...@us...> - 2008-05-08 19:00:48
The only interesting bit here is that we have to ensure that we rearm
the timer if necessary.

Signed-off-by: Anthony Liguori <ali...@us...>

diff --git a/qemu/hw/virtio-net.c b/qemu/hw/virtio-net.c
index d15c2f4..5fe66ac 100644
--- a/qemu/hw/virtio-net.c
+++ b/qemu/hw/virtio-net.c
@@ -207,9 +207,40 @@ static void virtio_net_tx_timer(void *opaque)
     virtio_net_flush_tx(n, n->tx_vq);
 }
 
+static void virtio_net_save(QEMUFile *f, void *opaque)
+{
+    VirtIONet *n = opaque;
+
+    virtio_save(&n->vdev, f);
+
+    qemu_put_buffer(f, n->mac, 6);
+    qemu_put_be32(f, n->tx_timer_active);
+}
+
+static int virtio_net_load(QEMUFile *f, void *opaque, int version_id)
+{
+    VirtIONet *n = opaque;
+
+    if (version_id != 1)
+        return -EINVAL;
+
+    virtio_load(&n->vdev, f);
+
+    qemu_get_buffer(f, n->mac, 6);
+    n->tx_timer_active = qemu_get_be32(f);
+
+    if (n->tx_timer_active) {
+        qemu_mod_timer(n->tx_timer,
+                       qemu_get_clock(vm_clock) + TX_TIMER_INTERVAL);
+    }
+
+    return 0;
+}
+
 PCIDevice *virtio_net_init(PCIBus *bus, NICInfo *nd, int devfn)
 {
     VirtIONet *n;
+    static int virtio_net_id;
 
     n = (VirtIONet *)virtio_init_pci(bus, "virtio-net", 6900, 0x1000,
                                      0, VIRTIO_ID_NET,
@@ -229,5 +260,8 @@ PCIDevice *virtio_net_init(PCIBus *bus, NICInfo *nd, int devfn)
     n->tx_timer = qemu_new_timer(vm_clock, virtio_net_tx_timer, n);
     n->tx_timer_active = 0;
 
+    register_savevm("virtio-net", virtio_net_id++, 1,
+                    virtio_net_save, virtio_net_load, n);
+
     return (PCIDevice *)n;
 }
From: Anthony L. <ali...@us...> - 2008-05-08 18:43:54
Avi Kivity wrote:
> Anthony Liguori wrote:
>> Aurelien Jarno wrote:
>>> On Wed, May 07, 2008 at 04:40:58PM -0500, Anthony Liguori wrote:
>>>> The current logic of the can_receive handler is to allow packets
>>>> whenever the receiver is disabled or when there are descriptors
>>>> available in the ring.
>>>>
>>>> I think the logic ought to be to allow packets whenever the
>>>> receiver is enabled and there are descriptors available in the
>>>> ring.
>>>
>>> The current behaviour is actually correct, this is the way QEMU
>>> works: when the card is stopped, it should always accept packets,
>>> and then discard them.
>>
>> The previous patches in my virtio series change that behavior.
>> Before delivering a packet to a VLAN, it checks to see if any of the
>> VLAN clients are able to receive a packet.
>>
>> This is very important for achieving good performance with tap. With
>> virtio-net, we were dropping a ton of packets with tap because there
>> weren't descriptors available on the RX ring.
>>
>> I plan to submit that behavioral change to QEMU upstream along with
>> the virtio drivers. I'm still optimizing phys_page_find() though.
>> The performance impact of switching the ring manipulation to using
>> the stl_phys accessors is unacceptable for KVM.
>
> I think this indicates a virtio tuning problem. The rx queue should
> never be empty in normal operation, ever. Signalling on rx queue
> empty invites the overrun, since it takes time for the interrupt to
> be delivered and for the guest to refill the queue.

Well, to start with, the e1000_can_receive handler is just plain wrong.
The logic is broken. This hasn't caused an issue because the code isn't
used.

That said, it is possible to tune virtio to get back the performance
lost to dropping packets. We lose about 20% of iperf from dropped
packets ATM. If we bump the ring size up to 512, we get it back. If we
bump it to 1024, we start losing again. It's much less reliable than
doing flow control, though.

> Long term we need to do this dynamically, but we can start with
> signalling on rx queue half empty (or half full if you're an
> optimist). We further need to tune the queue size so that this
> results in an acceptable number of interrupts:

Part of the trouble with the embedded scatter-gather list is that it's
not at all clear what "half empty" is unless you count all of the
descriptors. There may be one giant descriptor, or many small ones.

> 1 Gbps = 83K pps = 83 entries per half queue @ 1 KHz interrupt rate
>
> So we need 166 entries on the queue to keep a moderate interrupt
> rate, if we change it to signal at the halfway mark.
>
> Note that flow control still makes sense since it allows us to buffer
> some packets if the guest is scheduled out. But we can't use it as
> the primary mechanism since it won't exist with multiqueue NICs
> (where the virtio descriptors are fed to the driver).

Yes, we also need a better mechanism with vringfd(). I've been thinking
about how to structure this API within QEMU but it's still not clear to
me. Flow control seems to make sense, though, with the given API.

Regards,

Anthony Liguori

> Similar reasoning probably applies to tx.
From: Avi K. <av...@qu...> - 2008-05-08 17:21:15
Avi Kivity wrote:
> Note that flow control still makes sense since it allows us to buffer
> some packets if the guest is scheduled out. But we can't use it as
> the primary mechanism since it won't exist with multiqueue NICs
> (where the virtio descriptors are fed to driver).

... are fed to the hardware, I meant.

And to clarify further, I do think the patch is correct, but virtio
should be able to work well without it under ordinary circumstances.

--
Do not meddle in the internals of kernels, for they are subtle and
quick to panic.
From: Avi K. <av...@qu...> - 2008-05-08 17:16:50
Anthony Liguori wrote:
> This patch adds compatibility code so that we can make use of
> eventfd() within QEMU. eventfd() is a pretty useful mechanism as it
> allows multiple notifications to be batched in a single system call.
>
> We emulate eventfd() using a standard pipe().

Applied all six patches; thanks.

--
Do not meddle in the internals of kernels, for they are subtle and
quick to panic.
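
For context, a minimal user-space sketch of the fallback Anthony
describes: emulating eventfd() with a pipe so that many notifications
coalesce into one readable event. The function names are invented for
illustration and are not QEMU's actual compat layer:

    #include <errno.h>
    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Create a pipe whose read end behaves like a crude eventfd. */
    static int fake_eventfd(int fds[2])
    {
            if (pipe(fds) == -1)
                    return -1;
            /* Non-blocking ends: a full pipe just means "already signalled",
             * and draining an empty pipe must not block. */
            fcntl(fds[0], F_SETFL, O_NONBLOCK);
            fcntl(fds[1], F_SETFL, O_NONBLOCK);
            return 0;
    }

    /* Signal: write one byte; repeated signals batch up in the pipe buffer. */
    static void fake_eventfd_signal(int wfd)
    {
            uint8_t b = 0;
            while (write(wfd, &b, 1) == -1 && errno == EINTR)
                    ;
    }

    /* Consume: drain everything that accumulated since the last read. */
    static void fake_eventfd_drain(int rfd)
    {
            uint8_t buf[512];
            while (read(rfd, buf, sizeof(buf)) > 0)
                    ;
    }

    int main(void)
    {
            int fds[2];

            if (fake_eventfd(fds) == -1)
                    return 1;

            fake_eventfd_signal(fds[1]);
            fake_eventfd_signal(fds[1]);  /* batched with the first */
            fake_eventfd_drain(fds[0]);   /* one drain consumes both */

            puts("pipe-based eventfd emulation ok");
            return 0;
    }

The read end can be handed to the same select() loop that would watch
a real eventfd, which is what makes the emulation a drop-in substitute.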
From: Avi K. <av...@qu...> - 2008-05-08 17:03:03
Anthony Liguori wrote:
> Aurelien Jarno wrote:
>> On Wed, May 07, 2008 at 04:40:58PM -0500, Anthony Liguori wrote:
>>> The current logic of the can_receive handler is to allow packets
>>> whenever the receiver is disabled or when there are descriptors
>>> available in the ring.
>>>
>>> I think the logic ought to be to allow packets whenever the
>>> receiver is enabled and there are descriptors available in the
>>> ring.
>>
>> The current behaviour is actually correct, this is the way QEMU
>> works: when the card is stopped, it should always accept packets,
>> and then discard them.
>
> The previous patches in my virtio series change that behavior. Before
> delivering a packet to a VLAN, it checks to see if any of the VLAN
> clients are able to receive a packet.
>
> This is very important for achieving good performance with tap. With
> virtio-net, we were dropping a ton of packets with tap because there
> weren't descriptors available on the RX ring.
>
> I plan to submit that behavioral change to QEMU upstream along with
> the virtio drivers. I'm still optimizing phys_page_find() though.
> The performance impact of switching the ring manipulation to using
> the stl_phys accessors is unacceptable for KVM.

I think this indicates a virtio tuning problem. The rx queue should
never be empty in normal operation, ever. Signalling on rx queue empty
invites the overrun, since it takes time for the interrupt to be
delivered and for the guest to refill the queue.

Long term we need to do this dynamically, but we can start with
signalling on rx queue half empty (or half full if you're an optimist).
We further need to tune the queue size so that this results in an
acceptable number of interrupts:

1 Gbps = 83K pps = 83 entries per half queue @ 1 KHz interrupt rate

So we need 166 entries on the queue to keep a moderate interrupt rate,
if we change it to signal at the halfway mark.

Note that flow control still makes sense since it allows us to buffer
some packets if the guest is scheduled out. But we can't use it as the
primary mechanism since it won't exist with multiqueue NICs (where the
virtio descriptors are fed to the driver).

Similar reasoning probably applies to tx.

--
Do not meddle in the internals of kernels, for they are subtle and
quick to panic.
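
A back-of-the-envelope check of the numbers above; the 1500-byte packet
size is my assumption (the email doesn't state it), and the program is
just a sketch of the arithmetic:

    #include <stdio.h>

    int main(void)
    {
            const double link_bps = 1e9;         /* 1 Gbps */
            const double pkt_bits = 1500.0 * 8;  /* assumed full-size packets */
            const double irq_hz   = 1000.0;      /* target interrupt rate */

            double pps           = link_bps / pkt_bits;  /* ~83K packets/sec */
            double pkts_per_irq  = pps / irq_hz;         /* ~83 per interval */
            double queue_entries = 2 * pkts_per_irq;     /* signal at halfway */

            printf("pps ~ %.0f, per-interrupt ~ %.0f, queue ~ %.0f entries\n",
                   pps, pkts_per_irq, queue_entries);
            return 0;
    }

This prints roughly 83333 pps, 83 packets per interrupt interval, and a
queue of ~167 entries, matching the "166 entries" in the email.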
From: Christian B. <bor...@de...> - 2008-05-08 16:26:20
On Thursday, 8 May 2008, Jeremy Fitzhardinge wrote:
>> Sorry, forgot to mention. It's kvm.git from 2 days ago on s390.
>
> And on s390 cpu_relax yields the vcpu? That's not common behaviour
> across architectures.

Yes, cpu_relax on s390 calls diagnose 44. Diagnose 44 translates to
yield on z/VM and LPAR. Guessing from the number of the diagnose, I
think it was used on z/VM for timeslice yielding long before Linux came
to s390.
From: Damjan <gd...@ma...> - 2008-05-08 16:13:49
Strange situation: I have an Ubuntu JeOS image that crashes with this
error when started by the kvm-068 user-space. Below is the trace from
the kernel...

The same image works with:
- kvm-066 user space, kvm-068 kernel module (on 2.6.24 and 2.6.25)
- kvm-066 user space, vanilla kernel module (from 2.6.24 and 2.6.25)
(on my x60s laptop, 1GB ram, L2400 CoreDuo cpu, ArchLinux, gcc-4.3.0
and gcc-3.4.6, glibc 2.7, 32bit)

More strangely, the same image works fine with both kvm-068 user space
and kvm-068 kernel module on my Slackware-12.0 desktop (Q6600 cpu,
gcc-3.4.6, glibc-2.5, kernel-2.6.25, 32bit).

I was using "qemu-kvm jeos.img -nographic" to gather this:

[    6.117980] Compat vDSO mapped to ffffe000.
[    6.121733] Checking 'hlt' instruction... OK.
[    6.143312] SMP alternatives: switching to UP code
[    6.159486] Freeing SMP alternatives: 11k freed
[    6.161761] invalid opcode: 0000 [#1] SMP
[    6.164241] Modules linked in:
[    6.166390]
[    6.167671] Pid: 0, comm: swapper Not tainted (2.6.24-16-virtual #1)
[    6.170286] EIP: 0060:[<c01a1499>] EFLAGS: 00010282 CPU: 0
[    6.172744] EIP is at new_inode+0x9/0x90
[    6.174739] EAX: c03da9a4 EBX: c7000150 ECX: 00000000 EDX: 000041ed
[    6.177339] ESI: c7407200 EDI: 00000000 EBP: c7403800 ESP: c03ffe88
[    6.179940] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[    6.182334] Process swapper (pid: 0, ti=c03fe000 task=c03ce3a0 task.ti=c03fe000)
[    6.185246] Stack: c7000150 000041ed c01d51fb c7000150 c70012a8 000041ed c7403800 c01d538a
[    6.191415]        c7000150 c70012a8 c7000150 c01d5415 00000000 c0318de0 c0195679 000001ed
[    6.197349]        00000001 000001ed c7000150 c741e000 c70012a8 c03ffee8 c01977ff 000041ed
[    6.203202] Call Trace:
[    6.208235]  [<c01d51fb>] ramfs_get_inode+0x1b/0x120
[    6.213546]  [<c01d538a>] ramfs_mknod+0x1a/0x80
[    6.228274]  [<c01d5415>] ramfs_mkdir+0x15/0x30
[    6.233179]  [<c0195679>] vfs_mkdir+0xc9/0x150
[    6.237319]  [<c01977ff>] sys_mkdirat+0x9f/0xe0
[    6.241279]  [<c019785f>] sys_mkdir+0x1f/0x30
[    6.246523]  [<c0408fda>] do_name+0x1da/0x1e0
[    6.250381]  [<c0406c3c>] write_buffer+0x1c/0x30
[    6.254444]  [<c0406cde>] flush_window+0x7e/0xd0
[    6.258508]  [<c0408b64>] unpack_to_rootfs+0x714/0x950
[    6.262862]  [<c0408dba>] early_populate_rootfs+0x1a/0x60
[    6.267606]  [<c0403a3f>] start_kernel+0x2ff/0x3a0
[    6.271736]  [<c0403130>] unknown_bootoption+0x0/0x1e0
[    6.276209]  =======================
[    6.279345] Code: a9 3d c0 8d 44 24 08 e8 06 fc ff ff b8 bc a9 3d c0 e8 3c a7 16 00 8b 04 24 83 c4 10 5b 5e 5f 5d c3 90 56 89 c6 53 b8 a4 a9 3d c0 <0f> 0d 08 90 89 f0 e8 5c f3 ff ff 85 c0 89 c3 74 66 b8 a4 a9 3d
[    6.308342] EIP: [<c01a1499>] new_inode+0x9/0x90 SS:ESP 0068:c03ffe88
[    6.314456] ---[ end trace ca143223eefdc828 ]---
[    6.318332] Kernel panic - not syncing: Attempted to kill the idle task!

--
damjan | дамјан
This is my jabber ID --> da...@ba... <--
not my mail address, it's a Jabber ID :)
From: Linus T. <tor...@li...> - 2008-05-08 16:12:06
On Thu, 8 May 2008, Linus Torvalds wrote:
> Also, we'd need to make it
>
>	unsigned short flag:1;
>
> _and_ change spinlock_types.h to make the spinlock size actually
> match the required size (right now we make it an "unsigned int slock"
> even when we actually only use 16 bits).

Btw, this is an issue only on 32-bit x86, because on 64-bit one we
already have the padding due to the alignment of the 64-bit pointers in
the list_head (so there's already empty space there).

On 32-bit, the alignment of list_head is obviously just 32 bits, so
right now the structure is "perfectly packed" and doesn't have any
empty space. But that's just because the spinlock is unnecessarily big.

(Of course, if anybody really uses NR_CPUS >= 256 on 32-bit x86, then
the structure really will grow. That's a very odd configuration,
though, and not one I feel we really need to care about.)

		Linus
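
For context, a small user-space illustration of the padding Linus is
pointing at: with a pair of 64-bit pointers following a small lock
word, the compiler pads the struct, so a one-bit flag rides along for
free. The 16-bit "lock" here is a stand-in for the kernel's spinlock_t,
and the struct names are invented:

    #include <stdint.h>
    #include <stdio.h>

    struct list_head { struct list_head *next, *prev; };

    /* 16-bit lock word, no flag: on LP64, 6 bytes of padding follow it. */
    struct anon_vma_plain {
            uint16_t lock;
            struct list_head head;
    };

    /* Same layout plus a flag bit: it lands in the existing padding. */
    struct anon_vma_flagged {
            uint16_t lock;
            uint16_t global_mm_lock:1;
            struct list_head head;
    };

    int main(void)
    {
            /* On LP64 both print 24: the flag costs nothing. */
            printf("plain:   %zu bytes\n", sizeof(struct anon_vma_plain));
            printf("flagged: %zu bytes\n", sizeof(struct anon_vma_flagged));
            return 0;
    }

On a 32-bit build with a 32-bit lock word, the flag would instead grow
the struct, which is exactly why shrinking the spinlock to its used 16
bits matters there.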
From: Jan K. <jan...@si...> - 2008-05-08 15:57:10
Jan Kiszka wrote:
> Yang, Sheng wrote:
>> Hi
>>
>> This patchset enables NMI support for KVM.
>>
>> The first three patches enable NMI for the in-kernel irqchip and NMI
>> support on VMX. The last patch enables the NMI watchdog in Linux and
>> can be used to test NMI injection.
>>
>> In fact, this series should also have included Jan Kiszka's patch to
>> enable NMI for the userspace irqchip, but for now I have a little
>> trouble getting the merged version to work on my machine... We will
>> post it as soon as we have solved that.
>>
>> Another thing is that vmx_intr_assist() and do_interrupt_requests()
>> have some duplication. And the logic for normal interrupts and NMIs
>> is similar, but vmx_intr_assist() seems a little implicit...
>>
>> Any comments welcome!
>
> To make rebasing my work easier (hope I find some time later today):
> your patches show up line-wrapped here. Could you check and repost?

I missed the attachments on first glance; they are fine.

Jan

--
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
From: Linus T. <tor...@li...> - 2008-05-08 15:04:15
On Thu, 8 May 2008, Andrea Arcangeli wrote:
> Actually I looked both at the struct and at the slab alignment just
> in case it was changed recently. Now after reading your mail I also
> compiled it just in case.

Put the flag after the spinlock, not after the "list_head".

Also, we'd need to make it

	unsigned short flag:1;

_and_ change spinlock_types.h to make the spinlock size actually match
the required size (right now we make it an "unsigned int slock" even
when we actually only use 16 bits).

See the #if (NR_CPUS < 256) code in <asm-x86/spinlock.h>.

		Linus
From: Jeremy F. <je...@go...> - 2008-05-08 14:59:11
Christian Borntraeger wrote:
> I really like 64 guest cpus as a good testcase for all kinds of
> things.

Sure, I do the same kind of thing.

>> I think x86 (at least) is now using ticket locks, which is fair.
>> Which kernel are you seeing this problem on?
>
> Sorry, forgot to mention. It's kvm.git from 2 days ago on s390.

And on s390 cpu_relax yields the vcpu? That's not common behaviour
across architectures.

J