From: Anthony L. <an...@co...> - 2008-05-12 21:44:40
Ryan Harper wrote:
> * Anthony Liguori <an...@co...> [2008-05-12 15:05]:
>> Ryan Harper wrote:
>>> I've been digging into some of the instability we see when running
>>> larger numbers of guests at the same time. The test I'm currently using
>>> involves launching 64 1vcpu guests on an 8-way AMD box.
>>
>> Note this is a Barcelona system and therefore should have a
>> fixed-frequency TSC.
>>
>>> With the latest
>>> kvm-userspace git and kvm.git + Gerd's kvmclock fixes, I can launch all
>>> 64 of these 1 second apart,
>>
>> BTW, what if you don't pace-out the startups? Do we still have issues
>> with that?
>
> Do you mean without the 1 second delay or with a longer delay? My
> experience is that the delay helps (fewer hangs), but doesn't solve
> things completely.

So you see problems when using numactl to pin and using a 0-second
delay? The short delay may help reduce the number of CPU migrations,
which would explain your observation. If there are problems when doing
a 0-second delay and numactl, then perhaps it's not just a
cpu-migration issue.

> In svm.c, I do think we account for most of that time since the delta
> calculation will shift the guest time forward to the tsc value read in
> svm_vcpu_load(). We'll still miss the time between fixing the offset
> and when the guest can actually read its tsc.

Yes, which is the duration that the guest isn't scheduled on any
processor and the next time it runs happens to be on a different
processor.

>> A possible way to fix this (that's only valid on a processor with a
>> fixed-frequency TSC) is to take a high-res timestamp on vcpu_put,
>> then on vcpu_load take the delta timestamp since the old TSC was
>> saved, and use the TSC frequency on the new pcpu to calculate the
>> number of elapsed cycles.
>>
>> Assuming a fixed-frequency TSC, and a calibrated TSC across all
>> processors, you could get the same effects by using the VT tsc delta
>> logic. Basically, it always uses the new CPU's TSC unless that would
>> cause the guest to move backwards in time. As long as you have a
>> stable, calibrated TSC, this would work out.
>>
>> Can you try your old patch that did this and see if it fixes the problem?
>
> Yeah, I'll give it a spin.

Thanks,

Anthony Liguori
From: Ryan H. <ry...@us...> - 2008-05-12 21:24:17
* Anthony Liguori <an...@co...> [2008-05-12 15:05]:
> Ryan Harper wrote:
>> I've been digging into some of the instability we see when running
>> larger numbers of guests at the same time. The test I'm currently using
>> involves launching 64 1vcpu guests on an 8-way AMD box.
>
> Note this is a Barcelona system and therefore should have a
> fixed-frequency TSC.
>
>> With the latest
>> kvm-userspace git and kvm.git + Gerd's kvmclock fixes, I can launch all
>> 64 of these 1 second apart,
>
> BTW, what if you don't pace-out the startups? Do we still have issues
> with that?

Do you mean without the 1 second delay or with a longer delay? My
experience is that the delay helps (fewer hangs), but doesn't solve
things completely.

>> and only a handful (1 to 3) end up not
>> making it up. In dmesg on the host, I get a couple messages:
>>
>> [321365.362534] vcpu not ready for apic_round_robin
>>
>> and
>>
>> [321503.023788] Unsupported delivery mode 7
>>
>> Now, the interesting bit for me was when I used numactl to pin the guest
>> to a processor, all of the guests come up with no issues at all. As I
>> looked into it, it means that we're not running any of the vcpu
>> migration code, which on svm is comprised of tsc_offset recalibration
>> and apic migration, and on vmx, a little more per-vcpu work.
>
> Another data point is that -no-kvm-irqchip doesn't make the situation
> better.

Right. Let me clarify: I still see hung guests, but I need to validate
whether I see the apic-related messages in the host or not; I don't
recall for certain.

>> I've convinced myself that svm.c's tsc offset calculation works and
>> handles the migration from cpu to cpu quite well. I added the following
>> snippet to trigger if we ever encountered the case where we migrated to
>> a tsc that was behind:
>>
>>     rdtscll(tsc_this);
>>     delta = vcpu->arch.host_tsc - tsc_this;
>>     old_time = vcpu->arch.host_tsc + svm->vmcb->control.tsc_offset;
>>     new_time = tsc_this + svm->vmcb->control.tsc_offset + delta;
>>     if (new_time < old_time) {
>>         printk(KERN_ERR "ACK! (CPU%d->CPU%d) time goes back %llu\n",
>>                vcpu->cpu, cpu, old_time - new_time);
>>     }
>>     svm->vmcb->control.tsc_offset += delta;
>
> Time will never go backwards, but what can happen is that the TSC
> frequency will slow down. This is because upon VCPU migration, we don't
> account for the time between vcpu_put on the old processor and vcpu_load
> on the new processor. This time then disappears.

In svm.c, I do think we account for most of that time since the delta
calculation will shift the guest time forward to the tsc value read in
svm_vcpu_load(). We'll still miss the time between fixing the offset
and when the guest can actually read its tsc.

> A possible way to fix this (that's only valid on a processor with a
> fixed-frequency TSC) is to take a high-res timestamp on vcpu_put,
> then on vcpu_load take the delta timestamp since the old TSC was saved,
> and use the TSC frequency on the new pcpu to calculate the number of
> elapsed cycles.
>
> Assuming a fixed-frequency TSC, and a calibrated TSC across all
> processors, you could get the same effects by using the VT tsc delta
> logic. Basically, it always uses the new CPU's TSC unless that would
> cause the guest to move backwards in time. As long as you have a
> stable, calibrated TSC, this would work out.
>
> Can you try your old patch that did this and see if it fixes the problem?

Yeah, I'll give it a spin.

-- 
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253   T/L: 678-9253
ry...@us...
From: Jack S. <st...@sg...> - 2008-05-12 20:01:23
On Fri, May 09, 2008 at 09:32:30PM +0200, Andrea Arcangeli wrote:
> From: Andrea Arcangeli <an...@qu...>
>
> With KVM/GRU/XPMEM there isn't just the primary CPU MMU pointing to
> pages. There are secondary MMUs (with secondary sptes and secondary
> tlbs) too. sptes in the kvm case are shadow pagetables, but when I say
> spte in mmu-notifier context, I mean "secondary pte". In the GRU case
> there's no actual secondary pte and there's only a secondary tlb,
> because the GRU secondary MMU has no knowledge about sptes and every
> secondary tlb miss event in the MMU always generates a page fault that
> has to be resolved by the CPU (this is not the case of KVM, where a
> secondary tlb miss will walk sptes in hardware and it will refill the
> ...

FYI, I applied the patch to a tree that has the GRU driver. All
regression tests passed.

--- jack
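For context, the notifier callbacks being reviewed in this thread roughly take the following shape (names follow the mmu-notifier API as later merged; the kvm-side helpers in this sketch are made up for illustration):

    #include <linux/mmu_notifier.h>

    /* A secondary MMU (KVM, GRU, XPMEM) registers callbacks so the primary
     * MMU can tell it to drop secondary ptes / tlb entries when host ptes
     * are torn down. mn_to_kvm() and kvm_unmap_spte() are hypothetical. */
    static void sketch_invalidate_page(struct mmu_notifier *mn,
                                       struct mm_struct *mm,
                                       unsigned long address)
    {
        /* Zap the spte mapping this host page and flush the secondary
         * tlb before the primary MMU frees the page. */
        kvm_unmap_spte(mn_to_kvm(mn), address);
    }

    static const struct mmu_notifier_ops sketch_ops = {
        .invalidate_page = sketch_invalidate_page,
    };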
From: Anthony L. <an...@co...> - 2008-05-12 19:49:08
Ryan Harper wrote:
> I've been digging into some of the instability we see when running
> larger numbers of guests at the same time. The test I'm currently using
> involves launching 64 1vcpu guests on an 8-way AMD box.

Note this is a Barcelona system and therefore should have a
fixed-frequency TSC.

> With the latest
> kvm-userspace git and kvm.git + Gerd's kvmclock fixes, I can launch all
> 64 of these 1 second apart,

BTW, what if you don't pace-out the startups? Do we still have issues
with that?

> and only a handful (1 to 3) end up not
> making it up. In dmesg on the host, I get a couple messages:
>
> [321365.362534] vcpu not ready for apic_round_robin
>
> and
>
> [321503.023788] Unsupported delivery mode 7
>
> Now, the interesting bit for me was when I used numactl to pin the guest
> to a processor, all of the guests come up with no issues at all. As I
> looked into it, it means that we're not running any of the vcpu
> migration code, which on svm is comprised of tsc_offset recalibration
> and apic migration, and on vmx, a little more per-vcpu work.

Another data point is that -no-kvm-irqchip doesn't make the situation
better.

> I've convinced myself that svm.c's tsc offset calculation works and
> handles the migration from cpu to cpu quite well. I added the following
> snippet to trigger if we ever encountered the case where we migrated to
> a tsc that was behind:
>
>     rdtscll(tsc_this);
>     delta = vcpu->arch.host_tsc - tsc_this;
>     old_time = vcpu->arch.host_tsc + svm->vmcb->control.tsc_offset;
>     new_time = tsc_this + svm->vmcb->control.tsc_offset + delta;
>     if (new_time < old_time) {
>         printk(KERN_ERR "ACK! (CPU%d->CPU%d) time goes back %llu\n",
>                vcpu->cpu, cpu, old_time - new_time);
>     }
>     svm->vmcb->control.tsc_offset += delta;

Time will never go backwards, but what can happen is that the TSC
frequency will slow down. This is because upon VCPU migration, we don't
account for the time between vcpu_put on the old processor and vcpu_load
on the new processor. This time then disappears.

A possible way to fix this (that's only valid on a processor with a
fixed-frequency TSC) is to take a high-res timestamp on vcpu_put, then
on vcpu_load take the delta timestamp since the old TSC was saved, and
use the TSC frequency on the new pcpu to calculate the number of elapsed
cycles.

Assuming a fixed-frequency TSC, and a calibrated TSC across all
processors, you could get the same effects by using the VT tsc delta
logic. Basically, it always uses the new CPU's TSC unless that would
cause the guest to move backwards in time. As long as you have a stable,
calibrated TSC, this would work out.

Can you try your old patch that did this and see if it fixes the problem?

> Noting that vcpu->arch.host_tsc is the tsc of the previous cpu the vcpu
> was running on (see svm_vcpu_put()). This allows me to check if we are
> in fact increasing the guest's view of the tsc. I've not been able to
> trigger this at all when the vcpus are migrating.
>
> As for the apic, the migrate code seems to be rather simple, but I've
> not yet dived in to see if we've got anything racy in there:
>
> lapic.c:
> void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu)
> {
>     struct kvm_lapic *apic = vcpu->arch.apic;
>     struct hrtimer *timer;
>
>     if (!apic)
>         return;
>
>     timer = &apic->timer.dev;
>     if (hrtimer_cancel(timer))
>         hrtimer_start(timer, timer->expires, HRTIMER_MODE_ABS);
> }

There's a big FIXME in __apic_timer_fn() to make sure the timer runs on
the current "pCPU". As written, it's possible for the timer to fire on a
different pcpu than the current vcpu's, but it wasn't obvious to me that
it would cause problems. Eddie, et al: care to elaborate on what the
TODO was trying to address?

Regards,

Anthony Liguori
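The compensation Anthony describes could look roughly like the following (a sketch only, assuming a fixed-frequency, cross-CPU-calibrated TSC; the last_put_ns field and the helper names are hypothetical, not the actual kvm code):

    /* Sketch: stamp wall-clock time when the vcpu is descheduled. */
    static void sketch_vcpu_put(struct kvm_vcpu *vcpu)
    {
        rdtscll(vcpu->arch.host_tsc);           /* TSC on the old pcpu */
        vcpu->arch.last_put_ns = ktime_to_ns(ktime_get()); /* hypothetical */
    }

    static void sketch_vcpu_load(struct kvm_vcpu *vcpu, struct vmcb *vmcb)
    {
        u64 tsc_this, elapsed_ns, elapsed_cycles, delta;

        rdtscll(tsc_this);                      /* TSC on the new pcpu */
        elapsed_ns = ktime_to_ns(ktime_get()) - vcpu->arch.last_put_ns;
        /* Convert off-cpu wall-clock time into cycles at the new pcpu's
         * TSC rate: cycles = ns * kHz / 10^6. */
        elapsed_cycles = elapsed_ns * tsc_khz;
        do_div(elapsed_cycles, 1000000);
        /* Advance the guest TSC past the old value by the off-cpu time,
         * instead of letting that time silently disappear. */
        delta = vcpu->arch.host_tsc + elapsed_cycles - tsc_this;
        vmcb->control.tsc_offset += delta;
    }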
From: Ryan H. <ry...@us...> - 2008-05-12 19:20:11
I've been digging into some of the instability we see when running
larger numbers of guests at the same time. The test I'm currently using
involves launching 64 1vcpu guests on an 8-way AMD box. With the latest
kvm-userspace git and kvm.git + Gerd's kvmclock fixes, I can launch all
64 of these 1 second apart, and only a handful (1 to 3) end up not
making it up. In dmesg on the host, I get a couple messages:

[321365.362534] vcpu not ready for apic_round_robin

and

[321503.023788] Unsupported delivery mode 7

Now, the interesting bit for me was when I used numactl to pin the guest
to a processor, all of the guests come up with no issues at all. As I
looked into it, it means that we're not running any of the vcpu
migration code, which on svm is comprised of tsc_offset recalibration
and apic migration, and on vmx, a little more per-vcpu work.

I've convinced myself that svm.c's tsc offset calculation works and
handles the migration from cpu to cpu quite well. I added the following
snippet to trigger if we ever encountered the case where we migrated to
a tsc that was behind:

    rdtscll(tsc_this);
    delta = vcpu->arch.host_tsc - tsc_this;
    old_time = vcpu->arch.host_tsc + svm->vmcb->control.tsc_offset;
    new_time = tsc_this + svm->vmcb->control.tsc_offset + delta;
    if (new_time < old_time) {
        printk(KERN_ERR "ACK! (CPU%d->CPU%d) time goes back %llu\n",
               vcpu->cpu, cpu, old_time - new_time);
    }
    svm->vmcb->control.tsc_offset += delta;

Noting that vcpu->arch.host_tsc is the tsc of the previous cpu the vcpu
was running on (see svm_vcpu_put()). This allows me to check if we are
in fact increasing the guest's view of the tsc. I've not been able to
trigger this at all when the vcpus are migrating.

As for the apic, the migrate code seems to be rather simple, but I've
not yet dived in to see if we've got anything racy in there:

lapic.c:
void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu)
{
    struct kvm_lapic *apic = vcpu->arch.apic;
    struct hrtimer *timer;

    if (!apic)
        return;

    timer = &apic->timer.dev;
    if (hrtimer_cancel(timer))
        hrtimer_start(timer, timer->expires, HRTIMER_MODE_ABS);
}

-- 
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253   T/L: 678-9253
ry...@us...
From: SourceForge.net <no...@so...> - 2008-05-12 17:58:27
Bugs item #1962539, was opened at 2008-05-12 19:58
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1962539&group_id=180599

Please note that this message will contain a full copy of the comment
thread, including the initial issue submission, for this request, not
just the latest update.

Category: qemu
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Pascal Terjan (pterjan)
Assigned to: Nobody/Anonymous (nobody)
Summary: Remaining svn conflict marker in qemu/Makefile.target

Initial Comment:
[pterjan@celeste kvm-68]$ grep -r "<<<<<<<" qemu
qemu/Makefile.target:<<<<<<< HEAD:qemu/Makefile.target

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1962539&group_id=180599
From: Marcelo T. <mto...@re...> - 2008-05-12 16:36:53
On Sat, May 10, 2008 at 10:12:02AM +0800, Yang, Sheng wrote:
>> Did you have kvm.git commit 8ae6dc90ac84d9734e343210c8ec709f50cd9d89
>> when testing this?
>>
>> I believe it should fix that issue, because "ps->inject_pending" won't
>> be set by kvm_pit_timer_intr_post() if the IRQ is masked. Please correct
>> me if I'm wrong.
>
> Oh, sorry, I missed that commit. But... it just solved half of the
> problem. LAPIC suffered from it as well, and the current HLT emulation
> still didn't work... And I can't find something like inject_pending in
> the LAPIC timer.
>
> I have to say, I think my method is more precise, direct and
> efficient... It also can be extended easily if we get more clock
> sources (though I don't think this will happen in the near future...).
> In fact, I think taking care of pending counts is some kind of *wrong
> concept*... We should take care of the window, or of when the increment
> of the pending counter happened, CMIIW. And it has nothing to do with
> the current counter number (yeah, I realized it after seeing the hlt
> behaviour in XP, not before ;) ).

Sheng,

The problem is that you don't want to emulate hlt if you have a pending
timer _and_ the guest is accepting events.

So for example if there are two apic timers pending, you inject one of
them, the guest executes hlt, and we end up in vcpu_block().

Does this work for you?

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 7652f88..d41e34c 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -961,7 +961,7 @@ int apic_has_pending_timer(struct kvm_vcpu *vcpu)
 {
     struct kvm_lapic *lapic = vcpu->arch.apic;
 
-    if (lapic)
+    if (lapic && apic_enabled(lapic) && apic_lvt_enabled(lapic, APIC_LVTT))
         return atomic_read(&lapic->timer.pending);
 
     return 0;
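In other words, a guest hlt should only put the vcpu to sleep when no timer interrupt is already pending while the guest can accept it; otherwise the vcpu must keep running. A rough sketch of that decision (kvm_should_block() and guest_accepting_events() are hypothetical names for illustration; the real logic is spread across vcpu_block() and the apic code):

    /* Sketch: should a guest hlt actually block this vcpu? */
    static int kvm_should_block(struct kvm_vcpu *vcpu)
    {
        /* A pending timer tick that the guest would accept immediately
         * means hlt emulation should not enter the blocked state. */
        if (apic_has_pending_timer(vcpu) && guest_accepting_events(vcpu))
            return 0;
        return 1;
    }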
From: Gerry R. <gr...@ve...> - 2008-05-12 16:07:57
I would like to expose some FXO PCI cards to my Asterisk guest under
KVM. How can I do this? I see where maybe you can hotplug some nic or
storage devices, but what about other PCI devices like my FXO cards? Is
there some kernel command line option that could do this?

Regards,
Gerry
From: Avi K. <av...@qu...> - 2008-05-12 15:44:56
Anthony Liguori wrote:
>> Yes. The loop was a (perhaps premature) optimization that is now
>> totally unnecessary, unless I'm missing something quite large.
>
> It used to be that kvm_eat_signal() selected after consuming as many
> signals as possible while only sleeping once. That's why there's a
> combination of sleeping and polling.

Yes.

> Now the VCPU threads never select, so the whole loop can be simplified
> to a single sigtimedwait() that always blocks.
>
> In reality, I don't think sigtimedwait() is really needed/useful for
> VCPUs anymore. We only use it to catch SIG_IPI, and we only use
> SIG_IPI to break out of sleeping. I don't see any reason why we
> couldn't switch over to using a file descriptor for notification (or a
> pthread condition).

How would you stop a vcpu running in guest mode?

> At the very least, we could just select() on nothing and allow SIG_IPI
> to break us out of the select.

sigtimedwait() (or just sigwait, now) doesn't require the signal to be
delivered, so it's faster. If there's nothing to select, why call
select()?

-- 
error compiling committee.c: too many arguments to function
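The point about sigwait() not requiring delivery can be seen in a minimal standalone example: the thread blocks the signal, so it is consumed synchronously by sigwait() and no handler frame is ever set up (SIGUSR1 stands in for qemu-kvm's SIG_IPI here; compile with -pthread):

    #include <pthread.h>
    #include <signal.h>
    #include <stdio.h>

    #define SIG_IPI SIGUSR1   /* stand-in; qemu-kvm defines its own SIG_IPI */

    int main(void)
    {
        sigset_t set;
        int sig;

        /* Block SIG_IPI so it stays pending instead of being delivered. */
        sigemptyset(&set);
        sigaddset(&set, SIG_IPI);
        pthread_sigmask(SIG_BLOCK, &set, NULL);

        raise(SIG_IPI);       /* simulates another thread's pthread_kill() */

        /* Consume it synchronously; no signal handler ever runs, which
         * is why this is cheaper than real signal delivery. */
        sigwait(&set, &sig);
        printf("got signal %d\n", sig);
        return 0;
    }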
From: Anthony L. <an...@co...> - 2008-05-12 15:11:58
Avi Kivity wrote:
>> Now the VCPU threads never select, so the whole loop can be simplified
>> to a single sigtimedwait() that always blocks.
>>
>> In reality, I don't think sigtimedwait() is really needed/useful for
>> VCPUs anymore. We only use it to catch SIG_IPI, and we only use
>> SIG_IPI to break out of sleeping. I don't see any reason why we
>> couldn't switch over to using a file descriptor for notification (or
>> a pthread condition).
>
> How would you stop a vcpu running in guest mode?

Yeah, I forgot about that.

>> At the very least, we could just select() on nothing and allow
>> SIG_IPI to break us out of the select.
>
> sigtimedwait() (or just sigwait, now) doesn't require the signal to be
> delivered, so it's faster.

Yeah, sigtimedwait() is probably the right thing to do since we have to
use a signal for the IPI.

Regards,

Anthony Liguori

> If there's nothing to select, why call select()?
From: Anthony L. <an...@co...> - 2008-05-12 14:31:47
Avi Kivity wrote:
> Jan Kiszka wrote:
>>> Given that with the iothread we spend very little time processing
>>> signals in vcpu threads, maybe it's better to drop the loop completely.
>>> The common case is zero or one pending signals. The uncommon case of
>>> two or more pending signals will be handled by the KVM_RUN ioctl
>>> returning immediately with -EINTR (i.e. in the outer loop).
>>
>> You mean
>>
>> static void kvm_main_loop_wait(CPUState *env, int timeout)
>> {
>>     pthread_mutex_unlock(&qemu_mutex);
>>     kvm_eat_signal(env, timeout);
>>     pthread_mutex_lock(&qemu_mutex);
>>     cpu_single_env = env;
>>
>>     vcpu_info[env->cpu_index].signalled = 0;
>> }
>>
>> ?
>
> Yes. The loop was a (perhaps premature) optimization that is now
> totally unnecessary, unless I'm missing something quite large.

It used to be that kvm_eat_signal() selected after consuming as many
signals as possible while only sleeping once. That's why there's a
combination of sleeping and polling.

Now the VCPU threads never select, so the whole loop can be simplified
to a single sigtimedwait() that always blocks.

In reality, I don't think sigtimedwait() is really needed/useful for
VCPUs anymore. We only use it to catch SIG_IPI, and we only use SIG_IPI
to break out of sleeping. I don't see any reason why we couldn't switch
over to using a file descriptor for notification (or a pthread
condition). At the very least, we could just select() on nothing and
allow SIG_IPI to break us out of the select.

Regards,

Anthony Liguori

> Oh. There used to be a bug where we didn't check for a pending signal
> before the first guest entry, so this would add a lot of latency
> (effectively making the bug window much larger). That was only closed
> in 2.6.24 (by 7e66f350).
From: Avi K. <av...@qu...> - 2008-05-12 11:56:41
Jan Kiszka wrote:
>> Given that with the iothread we spend very little time processing
>> signals in vcpu threads, maybe it's better to drop the loop completely.
>> The common case is zero or one pending signals. The uncommon case of
>> two or more pending signals will be handled by the KVM_RUN ioctl
>> returning immediately with -EINTR (i.e. in the outer loop).
>
> You mean
>
> static void kvm_main_loop_wait(CPUState *env, int timeout)
> {
>     pthread_mutex_unlock(&qemu_mutex);
>     kvm_eat_signal(env, timeout);
>     pthread_mutex_lock(&qemu_mutex);
>     cpu_single_env = env;
>
>     vcpu_info[env->cpu_index].signalled = 0;
> }
>
> ?

Yes. The loop was a (perhaps premature) optimization that is now
totally unnecessary, unless I'm missing something quite large.

Oh. There used to be a bug where we didn't check for a pending signal
before the first guest entry, so this would add a lot of latency
(effectively making the bug window much larger). That was only closed
in 2.6.24 (by 7e66f350).

-- 
error compiling committee.c: too many arguments to function
From: Avi K. <av...@qu...> - 2008-05-12 11:44:34
Jan Kiszka wrote:
> Avi Kivity wrote:
>> Jan Kiszka wrote:
>>> Resetting guests used to be racy, deadlock-prone, or simply broken
>>> (for SMP). This patch fixes the issues, following Marcelo's suggestion
>>> to consolidate the reset activity in the I/O thread. All vcpus are
>>> cleanly stopped before the emulated hardware is reset, and
>>> kvm_arch_cpu_reset is introduced and invoked to ensure that non-boot
>>> cpus are put into the right state on x86. Note that other arches may
>>> need to look into this service as well to get SMP reset right.
>>
>> Hmm. This means that reset is executed asynchronously of the calling
>> cpu.
>
> Nope, the calling cpu is put in the stopped state after informing the
> I/O thread about the reset request. We just cannot control when the
> request is handled with respect to other cpus, but that's not different
> from real life, I guess.

That's true for triple faults, but not for ordinary keyboard controller
resets. All you do there is call main_loop_break(). But I think that's
fine; on a real machine the keyboard controller reset is processed by a
separate processor (the keyboard controller), which has nonzero latency.

-- 
error compiling committee.c: too many arguments to function
From: Jan K. <jan...@we...> - 2008-05-12 11:42:10
Avi Kivity wrote:
> Jan Kiszka wrote:
>> Having both kvm_eat_signal and kvm_eat_signals makes the code harder to
>> read. Moreover, given the single caller of kvm_eat_signals, there is no
>> real reason to keep it in a separate function.
>
> Given that with the iothread we spend very little time processing
> signals in vcpu threads, maybe it's better to drop the loop completely.
> The common case is zero or one pending signals. The uncommon case of
> two or more pending signals will be handled by the KVM_RUN ioctl
> returning immediately with -EINTR (i.e. in the outer loop).

You mean

static void kvm_main_loop_wait(CPUState *env, int timeout)
{
    pthread_mutex_unlock(&qemu_mutex);
    kvm_eat_signal(env, timeout);
    pthread_mutex_lock(&qemu_mutex);
    cpu_single_env = env;

    vcpu_info[env->cpu_index].signalled = 0;
}

?

Jan
From: Jan K. <jan...@we...> - 2008-05-12 11:39:00
Avi Kivity wrote:
> Jan Kiszka wrote:
>> Resetting guests used to be racy, deadlock-prone, or simply broken
>> (for SMP). This patch fixes the issues, following Marcelo's suggestion
>> to consolidate the reset activity in the I/O thread. All vcpus are
>> cleanly stopped before the emulated hardware is reset, and
>> kvm_arch_cpu_reset is introduced and invoked to ensure that non-boot
>> cpus are put into the right state on x86. Note that other arches may
>> need to look into this service as well to get SMP reset right.
>
> Hmm. This means that reset is executed asynchronously of the calling
> cpu.

Nope, the calling cpu is put in the stopped state after informing the
I/O thread about the reset request. We just cannot control when the
request is handled with respect to other cpus, but that's not different
from real life, I guess.

Jan
From: Jan K. <jan...@we...> - 2008-05-12 11:35:04
Hi,

before going wild with my idea, I would like to collect some comments on
this approach:

While doing first kernel debugging with my debug register patches for
kvm, I quickly ran into the 4-breakpoints-only limitation that comes
from the fact that we blindly map software to hardware breakpoints.
Unhandy, simply suboptimal. Also, having 4 breakpoint slots hard-coded
in the generic interface is not fair to arches that may support more.
Moreover, we do not support watchpoints although this would easily be
feasible. But if we supported watchpoints (via debug registers on x86),
we would need to break out of the 4-slot limitation even earlier.

In short, I came to the conclusion that a rewrite of the KVM_DEBUG_GUEST
interface is required.

Why do we set breakpoints in the kernel? Why not simply catch all debug
traps, insert software breakpoint ops into the guest code, and handle
all this stuff as normal debuggers do? And the hardware breakpoints
should just be pushed through the kernel interface like ptrace does.

The new KVM_DEBUG_GUEST interface I currently have in mind would look
like this:

#define KVM_DBGGUEST_ENABLE     0x01
#define KVM_DBGGUEST_SINGLESTEP 0x02

struct kvm_debug_guest {
    __u32 control;
    struct kvm_debug_guest_arch arch;
};

Setting KVM_DBGGUEST_ENABLE would forward all debug-related traps to
userspace first, which can then decide to handle or re-inject them.
KVM_DBGGUEST_SINGLESTEP would work as before.

And the extension for x86 would look like this:

struct kvm_debug_guest_arch {
    __u32 use_hw_breakpoints;
    __u64 debugreg[8];
};

If use_hw_breakpoints is non-zero, KVM would completely overwrite the
guest's debug registers with the content of debugreg, giving full
control of this feature to the host-side debugger (faking the content
of debug registers, effectively disabling them for the guest - as we
now do all the time).

Questions:
- Does anyone see traps and pitfalls in this approach?
- May I replace the existing interface with this one, or am I
  overlooking some use case that already worked with the current code
  so that ABI compatibility is required (most debug stuff should have
  been simply broken so far, also due to bugs in userland)?

Jan
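To make the proposal concrete, a host-side debugger might arm a hardware breakpoint through the suggested interface roughly like this (the struct layout is the proposal above; the DR7 encoding and the ioctl invocation are illustrative assumptions, not an existing API):

    #include <sys/ioctl.h>
    /* kvm_debug_guest / KVM_DBGGUEST_* as proposed above */

    int set_hw_breakpoint(int vcpu_fd, unsigned long long addr)
    {
        struct kvm_debug_guest dbg = { .control = KVM_DBGGUEST_ENABLE };

        dbg.arch.use_hw_breakpoints = 1;
        dbg.arch.debugreg[0] = addr;  /* DR0: linear address to trap on */
        dbg.arch.debugreg[7] = 0x1;   /* DR7: L0 enable; len/type 0 = execute */

        /* KVM would overwrite the guest's debug registers with these
         * values and forward the resulting #DB traps to userspace. */
        return ioctl(vcpu_fd, KVM_DEBUG_GUEST, &dbg);
    }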
From: Avi K. <av...@qu...> - 2008-05-12 11:33:43
Jan Kiszka wrote:
> Having both kvm_eat_signal and kvm_eat_signals makes the code harder to
> read. Moreover, given the single caller of kvm_eat_signals, there is no
> real reason to keep it in a separate function.

Given that with the iothread we spend very little time processing
signals in vcpu threads, maybe it's better to drop the loop completely.
The common case is zero or one pending signals. The uncommon case of
two or more pending signals will be handled by the KVM_RUN ioctl
returning immediately with -EINTR (i.e. in the outer loop).

-- 
error compiling committee.c: too many arguments to function
From: Avi K. <av...@qu...> - 2008-05-12 11:30:53
Jan Kiszka wrote:
> As suggested by Anthony, this patch encapsulates the sequence "save
> cpu_single_env, temporarily drop qemu_mutex, restore cpu_single_env"
> for condition variables in a helper function. It also adds a safety
> check to the open-coded kvm_mutex_lock that the caller is not a vcpu
> thread (as kvm_mutex_unlock clears cpu_single_env).

Applied, thanks.

-- 
error compiling committee.c: too many arguments to function
From: Avi K. <av...@qu...> - 2008-05-12 11:29:23
Jan Kiszka wrote:
> Resetting guests used to be racy, deadlock-prone, or simply broken (for
> SMP). This patch fixes the issues, following Marcelo's suggestion to
> consolidate the reset activity in the I/O thread. All vcpus are cleanly
> stopped before the emulated hardware is reset, and kvm_arch_cpu_reset
> is introduced and invoked to ensure that non-boot cpus are put into the
> right state on x86. Note that other arches may need to look into this
> service as well to get SMP reset right.

Hmm. This means that reset is executed asynchronously of the calling
cpu. Well, in a real machine a keyboard controller reset is asynchronous
as well, so I guess that's not an issue.

-- 
error compiling committee.c: too many arguments to function
From: Jan K. <jan...@we...> - 2008-05-12 10:51:04
Refreshed version of the earlier posted patch, now also adding a safety
check about the correct vm state to kvm_load_registers.

Some monitor commands as well as the vm_stop() issued by the gdbstub on
external interruption so far deadlock on vcpu locks in the kernel. The
patch below resolves the issue by temporarily or permanently stopping
all vcpu threads before issuing the related KVM IOCTLs. At this chance,
the patch also removes redundant checks for kvm_enabled. Among other
things, the patch now allows to break into the BIOS or other guest code
spinning in the vcpu and to use things like "info cpus" in the monitor.

Signed-off-by: Jan Kiszka <jan...@si...>
---
 qemu/gdbstub.c  |   12 ++++--------
 qemu/monitor.c  |    3 +--
 qemu/qemu-kvm.c |   41 +++++++++++++++++++++++++++--------------
 qemu/vl.c       |    2 +-
 4 files changed, 33 insertions(+), 25 deletions(-)

Index: b/qemu/monitor.c
===================================================================
--- a/qemu/monitor.c
+++ b/qemu/monitor.c
@@ -286,8 +286,7 @@ static CPUState *mon_get_cpu(void)
         mon_set_cpu(0);
     }
 
-    if (kvm_enabled())
-        kvm_save_registers(mon_cpu);
+    kvm_save_registers(mon_cpu);
 
     return mon_cpu;
 }
Index: b/qemu/qemu-kvm.c
===================================================================
--- a/qemu/qemu-kvm.c
+++ b/qemu/qemu-kvm.c
@@ -137,18 +137,6 @@ static int pre_kvm_run(void *opaque, int
     return 0;
 }
 
-void kvm_load_registers(CPUState *env)
-{
-    if (kvm_enabled())
-        kvm_arch_load_regs(env);
-}
-
-void kvm_save_registers(CPUState *env)
-{
-    if (kvm_enabled())
-        kvm_arch_save_regs(env);
-}
-
 int kvm_cpu_exec(CPUState *env)
 {
     int r;
@@ -276,6 +264,26 @@ static void kvm_vm_state_change_handler(
         pause_all_threads();
 }
 
+void kvm_load_registers(CPUState *env)
+{
+    assert(!vm_running);
+
+    if (kvm_enabled())
+        kvm_arch_load_regs(env);
+}
+
+void kvm_save_registers(CPUState *env)
+{
+    if (!kvm_enabled())
+        return;
+
+    if (vm_running)
+        pause_all_threads();
+    kvm_arch_save_regs(env);
+    if (vm_running)
+        resume_all_threads();
+}
+
 static void update_regs_for_sipi(CPUState *env)
 {
     kvm_arch_update_regs_for_sipi(env);
@@ -764,7 +772,7 @@ int kvm_qemu_init_env(CPUState *cenv)
 int kvm_update_debugger(CPUState *env)
 {
     struct kvm_debug_guest dbg;
-    int i;
+    int i, r;
 
     dbg.enabled = 0;
     if (env->nb_breakpoints || env->singlestep_enabled) {
@@ -775,7 +783,12 @@ int kvm_update_debugger(CPUState *env)
         }
         dbg.singlestep = env->singlestep_enabled;
     }
-    return kvm_guest_debug(kvm_context, env->cpu_index, &dbg);
+    if (vm_running)
+        pause_all_threads();
+    r = kvm_guest_debug(kvm_context, env->cpu_index, &dbg);
+    if (vm_running)
+        resume_all_threads();
+    return r;
 }
Index: b/qemu/vl.c
===================================================================
--- a/qemu/vl.c
+++ b/qemu/vl.c
@@ -7225,12 +7225,12 @@ void vm_stop(int reason)
 {
     if (vm_running) {
         cpu_disable_ticks();
-        vm_running = 0;
         if (reason != 0) {
             if (vm_stop_cb) {
                 vm_stop_cb(vm_stop_opaque, reason);
             }
         }
+        vm_running = 0;
         vm_state_notify(0);
     }
 }
Index: b/qemu/gdbstub.c
===================================================================
--- a/qemu/gdbstub.c
+++ b/qemu/gdbstub.c
@@ -904,8 +904,7 @@ static int gdb_handle_packet(GDBState *s
         addr = strtoull(p, (char **)&p, 16);
 #if defined(TARGET_I386)
         env->eip = addr;
-        if (kvm_enabled())
-            kvm_load_registers(env);
+        kvm_load_registers(env);
 #elif defined (TARGET_PPC)
         env->nip = addr;
 #elif defined (TARGET_SPARC)
@@ -928,8 +927,7 @@ static int gdb_handle_packet(GDBState *s
         addr = strtoull(p, (char **)&p, 16);
 #if defined(TARGET_I386)
         env->eip = addr;
-        if (kvm_enabled())
-            kvm_load_registers(env);
+        kvm_load_registers(env);
 #elif defined (TARGET_PPC)
         env->nip = addr;
 #elif defined (TARGET_SPARC)
@@ -973,8 +971,7 @@ static int gdb_handle_packet(GDBState *s
         }
         break;
     case 'g':
-        if (kvm_enabled())
-            kvm_save_registers(env);
+        kvm_save_registers(env);
         reg_size = cpu_gdb_read_registers(env, mem_buf);
         memtohex(buf, mem_buf, reg_size);
         put_packet(s, buf);
@@ -984,8 +981,7 @@ static int gdb_handle_packet(GDBState *s
         len = strlen(p) / 2;
         hextomem((uint8_t *)registers, p, len);
         cpu_gdb_write_registers(env, mem_buf, len);
-        if (kvm_enabled())
-            kvm_load_registers(env);
+        kvm_load_registers(env);
         put_packet(s, "OK");
         break;
     case 'm':
From: Jan K. <jan...@we...> - 2008-05-12 10:50:06
Having both kvm_eat_signal and kvm_eat_signals makes the code harder to
read. Moreover, given the single caller of kvm_eat_signals, there is no
real reason to keep it in a separate function.

Signed-off-by: Jan Kiszka <jan...@we...>
---
 qemu/qemu-kvm.c |   11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

Index: b/qemu/qemu-kvm.c
===================================================================
--- a/qemu/qemu-kvm.c
+++ b/qemu/qemu-kvm.c
@@ -210,11 +210,12 @@ static int kvm_eat_signal(CPUState *env,
     return ret;
 }
 
-
-static void kvm_eat_signals(CPUState *env, int timeout)
+static void kvm_main_loop_wait(CPUState *env, int timeout)
 {
     int r = 0;
 
+    pthread_mutex_unlock(&qemu_mutex);
+
     while (kvm_eat_signal(env, 0))
         r = 1;
     if (!r && timeout) {
@@ -223,14 +224,10 @@ static void kvm_eat_signals(CPUState *en
         while (kvm_eat_signal(env, 0))
             ;
     }
-}
 
-static void kvm_main_loop_wait(CPUState *env, int timeout)
-{
-    pthread_mutex_unlock(&qemu_mutex);
-    kvm_eat_signals(env, timeout);
     pthread_mutex_lock(&qemu_mutex);
     cpu_single_env = env;
+
     vcpu_info[env->cpu_index].signalled = 0;
 }
From: Jan K. <jan...@we...> - 2008-05-12 10:49:39
Resetting guests used to be racy, deadlock-prone, or simply broken (for
SMP). This patch fixes the issues, following Marcelo's suggestion to
consolidate the reset activity in the I/O thread. All vcpus are cleanly
stopped before the emulated hardware is reset, and kvm_arch_cpu_reset is
introduced and invoked to ensure that non-boot cpus are put into the
right state on x86. Note that other arches may need to look into this
service as well to get SMP reset right.

Moreover, a safety check is added to pause/resume_all_threads to ensure
that they are only invoked in I/O thread context.

Signed-off-by: Jan Kiszka <jan...@si...>
---
 qemu/qemu-kvm-ia64.c    |    4 ++++
 qemu/qemu-kvm-powerpc.c |    4 ++++
 qemu/qemu-kvm-x86.c     |   16 ++++++++++++++++
 qemu/qemu-kvm.c         |   34 +++++++++++++++++-----------------
 qemu/qemu-kvm.h         |    1 +
 qemu/vl.c               |    6 +-----
 6 files changed, 43 insertions(+), 22 deletions(-)

Index: b/qemu/qemu-kvm-ia64.c
===================================================================
--- a/qemu/qemu-kvm-ia64.c
+++ b/qemu/qemu-kvm-ia64.c
@@ -61,3 +61,7 @@ int kvm_arch_try_push_interrupts(void *o
 void kvm_arch_update_regs_for_sipi(CPUState *env)
 {
 }
+
+void kvm_arch_cpu_reset(CPUState *env)
+{
+}
Index: b/qemu/qemu-kvm-powerpc.c
===================================================================
--- a/qemu/qemu-kvm-powerpc.c
+++ b/qemu/qemu-kvm-powerpc.c
@@ -213,3 +213,7 @@ int handle_powerpc_dcr_write(int vcpu, u
 
     return 0; /* XXX ignore failed DCR ops */
 }
+
+void kvm_arch_cpu_reset(CPUState *env)
+{
+}
Index: b/qemu/qemu-kvm-x86.c
===================================================================
--- a/qemu/qemu-kvm-x86.c
+++ b/qemu/qemu-kvm-x86.c
@@ -671,3 +671,19 @@ int handle_tpr_access(void *opaque, int
     kvm_tpr_access_report(cpu_single_env, rip, is_write);
     return 0;
 }
+
+void kvm_arch_cpu_reset(CPUState *env)
+{
+    struct kvm_mp_state mp_state = { .mp_state = KVM_MP_STATE_UNINITIALIZED };
+
+    kvm_arch_load_regs(env);
+    if (env->cpu_index != 0) {
+        if (kvm_irqchip_in_kernel(kvm_context))
+            kvm_set_mpstate(kvm_context, env->cpu_index, &mp_state);
+        else {
+            env->interrupt_request &= ~CPU_INTERRUPT_HARD;
+            env->hflags |= HF_HALTED_MASK;
+            env->exception_index = EXCP_HLT;
+        }
+    }
+}
Index: b/qemu/qemu-kvm.c
===================================================================
--- a/qemu/qemu-kvm.c
+++ b/qemu/qemu-kvm.c
@@ -32,8 +32,6 @@ kvm_context_t kvm_context;
 
 extern int smp_cpus;
 
-static int qemu_kvm_reset_requested;
-
 pthread_mutex_t qemu_mutex = PTHREAD_MUTEX_INITIALIZER;
 pthread_cond_t qemu_aio_cond = PTHREAD_COND_INITIALIZER;
 pthread_cond_t qemu_vcpu_cond = PTHREAD_COND_INITIALIZER;
@@ -53,7 +51,6 @@ struct vcpu_info {
     int signalled;
     int stop;
    int stopped;
-    int reload_regs;
     int created;
 } vcpu_info[256];
 
@@ -251,6 +248,8 @@ static void pause_all_threads(void)
 {
     int i;
 
+    assert(!cpu_single_env);
+
     for (i = 0; i < smp_cpus; ++i) {
         vcpu_info[i].stop = 1;
         pthread_kill(vcpu_info[i].thread, SIG_IPI);
     }
@@ -263,6 +262,8 @@ static void resume_all_threads(void)
 {
     int i;
 
+    assert(!cpu_single_env);
+
     for (i = 0; i < smp_cpus; ++i) {
         vcpu_info[i].stop = 0;
         vcpu_info[i].stopped = 0;
@@ -307,15 +308,18 @@ static void setup_kernel_sigmask(CPUStat
     kvm_set_signal_mask(kvm_context, env->cpu_index, &set);
 }
 
-void qemu_kvm_system_reset_request(void)
+void qemu_kvm_system_reset(void)
 {
     int i;
 
-    for (i = 0; i < smp_cpus; ++i) {
-        vcpu_info[i].reload_regs = 1;
-        pthread_kill(vcpu_info[i].thread, SIG_IPI);
-    }
+    pause_all_threads();
+
     qemu_system_reset();
+
+    for (i = 0; i < smp_cpus; ++i)
+        kvm_arch_cpu_reset(vcpu_info[i].env);
+
+    resume_all_threads();
 }
 
 static int kvm_main_loop_cpu(CPUState *env)
@@ -348,11 +352,6 @@ static int kvm_main_loop_cpu(CPUState *e
         kvm_cpu_exec(env);
         env->interrupt_request &= ~CPU_INTERRUPT_EXIT;
         kvm_main_loop_wait(env, 0);
-        if (info->reload_regs) {
-            info->reload_regs = 0;
-            if (env->cpu_index == 0) /* ap needs to be placed in INIT */
-                kvm_arch_load_regs(env);
-        }
     }
     pthread_mutex_unlock(&qemu_mutex);
     return 0;
@@ -535,10 +534,8 @@ int kvm_main_loop(void)
             break;
         else if (qemu_powerdown_requested())
             qemu_system_powerdown();
-        else if (qemu_reset_requested()) {
-            pthread_kill(vcpu_info[0].thread, SIG_IPI);
-            qemu_kvm_reset_requested = 1;
-        }
+        else if (qemu_reset_requested())
+            qemu_kvm_system_reset();
     }
 
     pause_all_threads();
@@ -647,6 +644,9 @@ static int kvm_halt(void *opaque, int vc
 
 static int kvm_shutdown(void *opaque, int vcpu)
 {
+    /* stop the current vcpu from going back to guest mode */
+    vcpu_info[cpu_single_env->cpu_index].stopped = 1;
+
     qemu_system_reset_request();
     return 1;
 }
Index: b/qemu/qemu-kvm.h
===================================================================
--- a/qemu/qemu-kvm.h
+++ b/qemu/qemu-kvm.h
@@ -59,6 +59,7 @@ void kvm_arch_post_kvm_run(void *opaque,
 int kvm_arch_has_work(CPUState *env);
 int kvm_arch_try_push_interrupts(void *opaque);
 void kvm_arch_update_regs_for_sipi(CPUState *env);
+void kvm_arch_cpu_reset(CPUState *env);
 
 CPUState *qemu_kvm_cpu_env(int index);
 
Index: b/qemu/vl.c
===================================================================
--- a/qemu/vl.c
+++ b/qemu/vl.c
@@ -7302,11 +7302,7 @@ void qemu_system_reset_request(void)
     }
     if (cpu_single_env)
         cpu_interrupt(cpu_single_env, CPU_INTERRUPT_EXIT);
-#ifdef USE_KVM
-    if (kvm_allowed)
-        if (!no_reboot)
-            qemu_kvm_system_reset_request();
-#endif
+    main_loop_break();
 }
 
 void qemu_system_shutdown_request(void)
From: Jan K. <jan...@we...> - 2008-05-12 10:49:32
As suggested by Anthony, this patch encapsulates the sequence "save
cpu_single_env, temporarily drop qemu_mutex, restore cpu_single_env" for
condition variables in a helper function. It also adds a safety check to
the open-coded kvm_mutex_lock that the caller is not a vcpu thread (as
kvm_mutex_unlock clears cpu_single_env).

Signed-off-by: Jan Kiszka <jan...@we...>
---
 qemu/qemu-kvm.c |   29 ++++++++++++++++-------------
 1 file changed, 16 insertions(+), 13 deletions(-)

Index: b/qemu/qemu-kvm.c
===================================================================
--- a/qemu/qemu-kvm.c
+++ b/qemu/qemu-kvm.c
@@ -12,6 +12,7 @@ int kvm_allowed = 1;
 int kvm_irqchip = 1;
 int kvm_pit = 1;
 
+#include <assert.h>
 #include <string.h>
 #include "hw/hw.h"
 #include "sysemu.h"
@@ -65,6 +66,14 @@ static inline unsigned long kvm_get_thre
     return syscall(SYS_gettid);
 }
 
+static void qemu_cond_wait(pthread_cond_t *cond)
+{
+    CPUState *env = cpu_single_env;
+
+    pthread_cond_wait(cond, &qemu_mutex);
+    cpu_single_env = env;
+}
+
 CPUState *qemu_kvm_cpu_env(int index)
 {
     return vcpu_info[index].env;
@@ -246,11 +255,8 @@ static void pause_all_threads(void)
         vcpu_info[i].stop = 1;
         pthread_kill(vcpu_info[i].thread, SIG_IPI);
     }
-    while (!all_threads_paused()) {
-        CPUState *env = cpu_single_env;
-        pthread_cond_wait(&qemu_pause_cond, &qemu_mutex);
-        cpu_single_env = env;
-    }
+    while (!all_threads_paused())
+        qemu_cond_wait(&qemu_pause_cond);
 }
 
 static void resume_all_threads(void)
@@ -372,7 +378,7 @@ static void *ap_main_loop(void *_env)
 
     /* and wait for machine initialization */
     while (!qemu_system_ready)
-        pthread_cond_wait(&qemu_system_cond, &qemu_mutex);
+        qemu_cond_wait(&qemu_system_cond);
     pthread_mutex_unlock(&qemu_mutex);
 
     kvm_main_loop_cpu(env);
@@ -384,7 +390,7 @@ void kvm_init_new_ap(int cpu, CPUState *
 
     pthread_create(&vcpu_info[cpu].thread, NULL, ap_main_loop, env);
     while (vcpu_info[cpu].created == 0)
-        pthread_cond_wait(&qemu_vcpu_cond, &qemu_mutex);
+        qemu_cond_wait(&qemu_vcpu_cond);
 }
 
 int kvm_init_ap(void)
@@ -892,8 +898,6 @@ void qemu_kvm_aio_wait_start(void)
 
 void qemu_kvm_aio_wait(void)
 {
-    CPUState *cpu_single = cpu_single_env;
-
     if (!cpu_single_env) {
         if (io_thread_sigfd != -1) {
             fd_set rfds;
@@ -910,10 +914,8 @@ void qemu_kvm_aio_wait(void)
                 sigfd_handler((void *)(unsigned long)io_thread_sigfd);
         }
         qemu_aio_poll();
-    } else {
-        pthread_cond_wait(&qemu_aio_cond, &qemu_mutex);
-        cpu_single_env = cpu_single;
-    }
+    } else
+        qemu_cond_wait(&qemu_aio_cond);
 }
 
 void qemu_kvm_aio_wait_end(void)
@@ -939,6 +941,7 @@ void kvm_cpu_destroy_phys_mem(target_phy
 
 void kvm_mutex_unlock(void)
 {
+    assert(!cpu_single_env);
     pthread_mutex_unlock(&qemu_mutex);
 }
From: Avi K. <av...@qu...> - 2008-05-12 08:10:43
Zhang, Xiantao wrote:
> Could you help to try the attached one?

Sure, applied, thanks.

-- 
error compiling committee.c: too many arguments to function
From: Avi K. <av...@qu...> - 2008-05-12 07:54:00
Yang, Sheng wrote:
> On Friday 09 May 2008 23:49:13 Avi Kivity wrote:
>> Yang, Sheng wrote:
>>> From 4942a5c35c97e5edb6fe1303e04fb86f25cac345 Mon Sep 17 00:00:00 2001
>>> From: Sheng Yang <she...@in...>
>>> Date: Thu, 8 May 2008 16:00:57 +0800
>>> Subject: [PATCH 3/4] KVM: VMX: Enable NMI with in-kernel irqchip
>>>
>>> static void kvm_do_inject_irq(struct kvm_vcpu *vcpu)
>>> {
>>>     int word_index = __ffs(vcpu->arch.irq_summary);
>>> @@ -2146,9 +2159,11 @@ static void do_interrupt_requests(struct kvm_vcpu *vcpu,
>>>         /*
>>>          * Interrupts blocked. Wait for unblock.
>>>          */
>>> -        cpu_based_vm_exec_control |= CPU_BASED_VIRTUAL_INTR_PENDING;
>>> +        cpu_based_vm_exec_control |=
>>> +            CPU_BASED_VIRTUAL_INTR_PENDING;
>>>     else
>>> -        cpu_based_vm_exec_control &= ~CPU_BASED_VIRTUAL_INTR_PENDING;
>>> +        cpu_based_vm_exec_control &=
>>> +            ~CPU_BASED_VIRTUAL_INTR_PENDING;
>>
>> This seems spurious.
>
> Sorry, seems I was too anxious to keep it in hand... I will check it
> more carefully in the future.
>
>>> /* We need to handle NMIs before interrupts are enabled */
>>> -    if ((intr_info & INTR_INFO_INTR_TYPE_MASK) == 0x200) { /* nmi */
>>> +    if ((intr_info & INTR_INFO_INTR_TYPE_MASK) == 0x200) {
>>>         KVMTRACE_0D(NMI, vcpu, handler);
>>> -        asm("int $2");
>>> +        if (!cpu_has_virtual_nmis())
>>> +            asm("int $2");
>>>     }
>>> }
>>
>> That's a host nmi. So does PIN_BASED_VIRTUAL_NMI mean NMIs are
>> handled like unacked host interrupts?
>
> Not exactly. No host NMI here if Virtual NMI is set. Copied from SDM 3B,
> table 20-5:
>
> "If this control (Virtual NMIs) is 1, NMIs are never blocked and the
> “blocking by NMI” bit (bit 3) in the interruptibility-state field
> indicates “virtual-NMI blocking” (see Table 20-3). This control also
> interacts with the “NMI-window exiting” VM-execution control (see
> Section 20.6.2)."

I still don't understand. What does "NMIs are never blocked" mean? What
happens if an NMI occurs while in guest mode? Obviously we don't want it
to be delivered to the guest.

-- 
error compiling committee.c: too many arguments to function