From: David M. <da...@da...> - 2008-04-30 03:14:47
From: Avi Kivity <av...@qu...>
Date: Wed, 30 Apr 2008 00:46:13 +0300

> I sent an email a couple of days ago to pos...@vg...:
>
> > Hi, please create the following lists for kvm:
> >
> > kvm (x86 and general discussion)
> > kvm-ppc (powerpc, managed by Hollis Blanchard)
> > kvm-ia64 (ia64)
> > kvm-commits (read-only, tracks commits to kvm git HEAD)
> >
> > Thanks.

I've created (and tested) all of these lists.
From: Glauber C. <gc...@re...> - 2008-04-30 03:05:48
Anthony Liguori wrote:
> Glauber Costa wrote:
>> Avi Kivity wrote:
>>
>>> Glauber Costa wrote:
>>>
>>>> Hi. This is a proposal for reducing the impact of kvm functions in
>>>> core qemu code. This is by all means not ready, but I felt like
>>>> posting it, so a discussion on it could follow.
>>>>
>>>> The idea in this patch is to replace the specific kvm details from
>>>> core qemu files like vl.c with driver_yyy() functions. When kvm is
>>>> not running, those functions would just return (most of the time),
>>>> greatly reducing the impact of kvm code.
>>>>
>>>> As I wanted to test it, in this patch I changed the kvm functions to
>>>> be called driver_yyy(), but that's not my final goal. I intend to use
>>>> a function pointer scheme, similar to what the linux kernel already
>>>> does for a lot of its subsystems, to isolate the changes.
>>>>
>>>> Comments deeply welcome.
>>>>
>>> While I would be very annoyed if someone referred to kvm as a qemu
>>> accelerator, I think accelerator_yyy() is more descriptive than
>>> driver_yyy().
>>>
>> How about booster? ;-)
>>
> I don't think the concern from a QEMU perspective is that QEMU is too
> intimately tied to KVM. The concern is that over time, it will be very
> difficult to make changes to QEMU without breaking KVM support because
> of the sheer number of hooks we require. Fabrice had actually suggested
> merging libkvm into QEMU. We just need to reduce the overall number of
> if (kvm_enabled()) blocks.

They are not mutually exclusive. Even if we do merge libkvm into qemu,
having it more modular and separated will be way better than having kvm
hooks all over.
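
As an illustration of the function-pointer scheme being discussed, a minimal
sketch might look like the following. All names here are invented for the
example, not taken from the actual patch; CPUState is QEMU's existing vcpu
context type.

/* Core QEMU calls through an ops table; with no accelerator the no-op
 * table is installed, so vl.c needs no if (kvm_enabled()) blocks. */
struct accel_ops {
    int  (*init)(void);
    void (*cpu_synchronize_state)(CPUState *env);
    void (*vm_state_change)(int running);
};

/* Default table: every hook simply returns. */
static int  noop_init(void) { return 0; }
static void noop_sync(CPUState *env) { (void)env; }
static void noop_state_change(int running) { (void)running; }

static const struct accel_ops noop_ops = {
    .init                  = noop_init,
    .cpu_synchronize_state = noop_sync,
    .vm_state_change       = noop_state_change,
};

static const struct accel_ops *accel = &noop_ops;

/* kvm (or kqemu) would install its own table at startup, e.g.
 *     accel = &kvm_ops;
 * and core code simply calls accel->cpu_synchronize_state(env); */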
From: Marcelo T. <mto...@re...> - 2008-04-30 02:13:35
On Tue, Apr 29, 2008 at 07:47:54PM -0500, Anthony Liguori wrote:
> Marcelo Tosatti wrote:
> >>> Moving the signal handling + pipe write to a separate thread should get
> >>> rid of it.
> >>>
> >> Yeah, but then you just introduce buffering problems since if you're
> >> getting that many signals, the pipe will get full.
> >>
> >
> > It is OK to lose signals if you have at least one queued in the pipe.
> >
> If you're getting so many signals that you can't make forward progress
> on any system call, your application is not going to function
> anymore. A use of signals in this manner is broken by design.
>
> >> No point in designing for something that isn't likely to happen in
> >> practice.
> >>
> >
> > You should not design something making the assumption that this scenario
> > won't happen.
> >
> > For example this could happen in high throughput guests using POSIX AIO,
> > actually pretty likely to happen if data is cached in the host's pagecache.
> >
> We really just need to move away from signals as best as we can. I've
> got a patch started that implements a thread-pool based AIO mechanism
> for QEMU. Notifications are done over a pipe so we don't have to deal
> with the unreliability of signals.
>
> I can't imagine a guest trying to do so much IO though that this would
> really ever happen. POSIX AIO can only have one outstanding request
> per-fd. To complete the IO request, you would have to eventually go
> back to the guest and during that time, the IO thread is going to be
> able to make forward progress.

No. POSIX AIO can have one _in flight_ AIO per-fd, but many pending AIO
requests per-fd. You don't have to go back to guest mode to get more AIO
completions. And then you have drivers arming timers. I just don't like
the idea.

> You won't get a signal again until a new IO request is submitted.
>
> Regards,
>
> Anthony Liguori
>
> > It's somewhat similar to what happens with NAPI and interrupt mitigation.
> >
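
For reference, the point about pending requests can be illustrated with a
small stand-alone sketch: several aio_read() calls can be queued against the
same fd without returning to the submitter. The file name, request count and
sizes below are arbitrary; link with -lrt.

#include <aio.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

#define NREQ  4
#define BUFSZ 4096

int main(void)
{
    struct aiocb cb[NREQ];
    char buf[NREQ][BUFSZ];
    int fd = open("/tmp/testfile", O_RDONLY);   /* arbitrary test file */
    if (fd < 0) { perror("open"); return 1; }

    for (int i = 0; i < NREQ; i++) {
        memset(&cb[i], 0, sizeof(cb[i]));
        cb[i].aio_fildes = fd;
        cb[i].aio_buf    = buf[i];
        cb[i].aio_nbytes = BUFSZ;
        cb[i].aio_offset = (off_t)i * BUFSZ;
        if (aio_read(&cb[i]) < 0)      /* request is queued, not completed */
            perror("aio_read");
    }

    /* All NREQ requests are now pending on the same fd; completions can be
     * collected without submitting anything further. */
    for (int i = 0; i < NREQ; i++) {
        const struct aiocb *const list[1] = { &cb[i] };
        aio_suspend(list, 1, NULL);            /* wait for this request */
        printf("req %d returned %zd bytes\n", i, aio_return(&cb[i]));
    }
    return 0;
}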
From: SourceForge.net <no...@so...> - 2008-04-30 01:34:09
Bugs item #1953353, was opened at 2008-04-28 07:50
Message generated for change (Comment added) made by mtosatti
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1953353&group_id=180599

Please note that this message will contain a full copy of the comment
thread, including the initial issue submission, for this request, not
just the latest update.

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Rafal Wijata (ravpl)
Assigned to: Nobody/Anonymous (nobody)
Summary: could not load PC BIOS '/path/to/bios.bin' on "-m 4096"

Initial Comment:
The maximum amount of memory I can give to kvm is ~3560M.
I run custom compiled kvm-66 on an F8 box with
Linux mailhub 2.6.24.4-64.fc8 #1 SMP Sat Mar 29 09:15:49 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
The modules are loaded from the F8 kernel, rather than those shipped with kvm-66.

----------------------------------------------------------------------

Comment By: Marcelo Tosatti (mtosatti)
Date: 2008-04-29 21:34

Message:
Logged In: YES
user_id=2022487
Originator: NO

Can you reproduce the problem with the modules shipped with kvm-66?

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1953353&group_id=180599
From: Anthony L. <ali...@us...> - 2008-04-30 00:49:14
Marcelo Tosatti wrote:
>>> Moving the signal handling + pipe write to a separate thread should get
>>> rid of it.
>>>
>> Yeah, but then you just introduce buffering problems since if you're
>> getting that many signals, the pipe will get full.
>>
>
> It is OK to lose signals if you have at least one queued in the pipe.
>

If you're getting so many signals that you can't make forward progress
on any system call, your application is not going to function anymore.
A use of signals in this manner is broken by design.

>> No point in designing for something that isn't likely to happen in practice.
>>
>
> You should not design something making the assumption that this scenario
> won't happen.
>
> For example this could happen in high throughput guests using POSIX AIO,
> actually pretty likely to happen if data is cached in the host's pagecache.
>

We really just need to move away from signals as best as we can. I've
got a patch started that implements a thread-pool based AIO mechanism
for QEMU. Notifications are done over a pipe so we don't have to deal
with the unreliability of signals.

I can't imagine a guest trying to do so much IO though that this would
really ever happen. POSIX AIO can only have one outstanding request
per-fd. To complete the IO request, you would have to eventually go
back to the guest and during that time, the IO thread is going to be
able to make forward progress. You won't get a signal again until a new
IO request is submitted.

Regards,

Anthony Liguori

> It's somewhat similar to what happens with NAPI and interrupt mitigation.
>
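
The thread-pool idea mentioned above can be sketched roughly as follows.
This is only an illustration, not the patch Anthony refers to; the structure
and function names are invented. A worker thread performs the blocking
pread() and then writes one byte to a pipe that the main loop select()s on,
so completion delivery never depends on a signal.

#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>

struct aio_req {
    int fd;
    void *buf;
    size_t len;
    off_t offset;
    ssize_t ret;               /* filled in by the worker */
};

static int notify_fd[2];       /* [0] read end watched by the main loop */

static void *aio_worker(void *opaque)
{
    struct aio_req *req = opaque;
    char byte = 0;

    req->ret = pread(req->fd, req->buf, req->len, req->offset);
    /* Completion notification: a pipe write is async-signal-safe and,
     * unlike a signal, is never lost by select(). */
    write(notify_fd[1], &byte, 1);
    return NULL;
}

static void aio_submit(struct aio_req *req)
{
    pthread_t tid;   /* a real implementation would reuse pooled threads */
    pthread_create(&tid, NULL, aio_worker, req);
    pthread_detach(tid);
}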
From: Marcelo T. <mto...@re...> - 2008-04-30 00:34:49
On Tue, Apr 29, 2008 at 07:22:53PM -0500, Anthony Liguori wrote:
> Marcelo Tosatti wrote:
> > True.
> >
> > Either way, this starvation due to signals can't happen with the current
> > scheme because signals are blocked. It seems a significant drawback.
> >
> Practically speaking, I don't see signal starving being a problem. Part
> of the benefit of this approach is that we're reducing the overall
> number of signals received. With the in-kernel PIT, the number of
> userspace signals is even further reduced.
>
> > Moving the signal handling + pipe write to a separate thread should get
> > rid of it.
> >
> Yeah, but then you just introduce buffering problems since if you're
> getting that many signals, the pipe will get full.

It is OK to lose signals if you have at least one queued in the pipe.

> No point in designing for something that isn't likely to happen in practice.

You should not design something making the assumption that this scenario
won't happen.

For example this could happen in high throughput guests using POSIX AIO,
actually pretty likely to happen if data is cached in the host's pagecache.

It's somewhat similar to what happens with NAPI and interrupt mitigation.
From: Anthony L. <an...@co...> - 2008-04-30 00:27:50
Glauber Costa wrote:
> Avi Kivity wrote:
>
>> Glauber Costa wrote:
>>
>>> Hi. This is a proposal for reducing the impact of kvm functions in
>>> core qemu code. This is by all means not ready, but I felt like
>>> posting it, so a discussion on it could follow.
>>>
>>> The idea in this patch is to replace the specific kvm details from
>>> core qemu files like vl.c with driver_yyy() functions. When kvm is
>>> not running, those functions would just return (most of the time),
>>> greatly reducing the impact of kvm code.
>>>
>>> As I wanted to test it, in this patch I changed the kvm functions to
>>> be called driver_yyy(), but that's not my final goal. I intend to use
>>> a function pointer scheme, similar to what the linux kernel already
>>> does for a lot of its subsystems, to isolate the changes.
>>>
>>> Comments deeply welcome.
>>>
>> While I would be very annoyed if someone referred to kvm as a qemu
>> accelerator, I think accelerator_yyy() is more descriptive than
>> driver_yyy().
>>
> How about booster? ;-)
>

I don't think the concern from a QEMU perspective is that QEMU is too
intimately tied to KVM. The concern is that over time, it will be very
difficult to make changes to QEMU without breaking KVM support because
of the sheer number of hooks we require. Fabrice had actually suggested
merging libkvm into QEMU. We just need to reduce the overall number of
if (kvm_enabled()) blocks.

You have to understand a lot about KVM to know why it's necessary to do
a register reload call-out in the vmport hw for instance. Instead of
introducing generic hooks in vmport, a more appropriate solution may be
to add wrappers to read individual register values within the QEMU
device code. We can then add our hooks to that. I think a good argument
can be made that devices should never access env-> directly.

Regards,

Anthony Liguori
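
The accessor idea could look something like the sketch below. The names
kvm_sync_registers() and cpu_get_reg() are hypothetical, used only to show
the shape of the wrapper; they are not existing QEMU functions.

/* Devices call this instead of touching env->regs[] directly, so any
 * accelerator-specific synchronization hides behind one hook. */
uint32_t cpu_get_reg(CPUState *env, int reg)
{
    if (kvm_enabled())
        kvm_sync_registers(env);   /* hypothetical: refresh env from the kernel */
    return env->regs[reg];
}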
From: Anthony L. <ali...@us...> - 2008-04-30 00:23:22
Marcelo Tosatti wrote:
> True.
>
> Either way, this starvation due to signals can't happen with the current
> scheme because signals are blocked. It seems a significant drawback.
>

Practically speaking, I don't see signal starving being a problem. Part
of the benefit of this approach is that we're reducing the overall
number of signals received. With the in-kernel PIT, the number of
userspace signals is even further reduced.

> Moving the signal handling + pipe write to a separate thread should get
> rid of it.
>

Yeah, but then you just introduce buffering problems since if you're
getting that many signals, the pipe will get full. No point in designing
for something that isn't likely to happen in practice.

Regards,

Anthony Liguori

>> No, they're independent of the patch. The symptom is that the guest
>> tends to hang during boot for prolonged periods of time. It tends to be
>> random whether it will hang at all, how long it will hang, and at what
>> point it will hang. It tends to hang more often in particular places.
>> In my ubuntu server guest, for instance, it tends to hang right after
>> partition probing, right after "Loading hardware drivers", and right
>> before spawning a getty on /dev/tty0.
>>
>> Normally, it will only hang for a few seconds. However, with -smp N
>> where N > number of cpus in the host, it tends to happen more
>> frequently. Using clock=jiffies in the guest makes the problem go away.
>>
>
> I'll try to reproduce that, thanks.
>
>> -no-kvm-pit doesn't make a difference (the guest is normally using the
>> pm-timer).
>>
>> Regards,
>>
>> Anthony Liguori
>>
From: Marcelo T. <mto...@re...> - 2008-04-30 00:05:18
On Tue, Apr 29, 2008 at 06:44:29PM -0500, Anthony Liguori wrote:
> Marcelo Tosatti wrote:
> > Why? If you leave data in the pipe the next select() won't block.
> >
> > Isn't there the possibility that this loop can be stuck for significant
> > amounts of time? If you're receiving lots of notifications through
> > signals.
> >
> If you're getting that many signals, then select() is never going to
> run anyway.

True.

Either way, this starvation due to signals can't happen with the current
scheme because signals are blocked. It seems a significant drawback.

Moving the signal handling + pipe write to a separate thread should get
rid of it.

> No, they're independent of the patch. The symptom is that the guest
> tends to hang during boot for prolonged periods of time. It tends to be
> random whether it will hang at all, how long it will hang, and at what
> point it will hang. It tends to hang more often in particular places.
> In my ubuntu server guest, for instance, it tends to hang right after
> partition probing, right after "Loading hardware drivers", and right
> before spawning a getty on /dev/tty0.
>
> Normally, it will only hang for a few seconds. However, with -smp N
> where N > number of cpus in the host, it tends to happen more
> frequently. Using clock=jiffies in the guest makes the problem go away.

I'll try to reproduce that, thanks.

> -no-kvm-pit doesn't make a difference (the guest is normally using the
> pm-timer).
>
> Regards,
>
> Anthony Liguori
From: Anthony L. <ali...@us...> - 2008-04-29 23:44:56
Marcelo Tosatti wrote:
> Why? If you leave data in the pipe the next select() won't block.
>
> Isn't there the possibility that this loop can be stuck for significant
> amounts of time? If you're receiving lots of notifications through
> signals.
>

If you're getting that many signals, then select() is never going to run
anyway.

>>>>>> -    kvm_eat_signal(&io_signal_table, NULL, 1000);
>>>>>>      pthread_mutex_lock(&qemu_mutex);
>>>>>> -    cpu_single_env = NULL;
>>>>>> -    main_loop_wait(0);
>>>>>> +    main_loop_wait(10);
>>>>>>
>>>>>>
>>>>> Increase that 1000 or something. Will make it easier to spot bugs.
>>>>>
>>>>>
>>>> I have actually and it does introduce some bugs. I'm not entirely clear
>>>> what is causing them though.
>>>>
>>>
>>> Should indicate that some event previously delivered through signals and
>>> received by sigtimedwait is not waking up the IO thread.
>>>
>>
>> I'll take a look and see. I'm having time keeping issues in the guest
>> so it's hard to tell what problems are caused by the IO thread versus time.
>>
>
> Time issues only with the patch? If not, please share details.
>

No, they're independent of the patch. The symptom is that the guest
tends to hang during boot for prolonged periods of time. It tends to be
random whether it will hang at all, how long it will hang, and at what
point it will hang. It tends to hang more often in particular places.
In my ubuntu server guest, for instance, it tends to hang right after
partition probing, right after "Loading hardware drivers", and right
before spawning a getty on /dev/tty0.

Normally, it will only hang for a few seconds. However, with -smp N
where N > number of cpus in the host, it tends to happen more
frequently. Using clock=jiffies in the guest makes the problem go away.

-no-kvm-pit doesn't make a difference (the guest is normally using the
pm-timer).

Regards,

Anthony Liguori
From: Glauber C. <gc...@re...> - 2008-04-29 23:41:59
Avi Kivity wrote:
> Glauber Costa wrote:
>> Hi. This is a proposal for reducing the impact of kvm functions in
>> core qemu code. This is by all means not ready, but I felt like
>> posting it, so a discussion on it could follow.
>>
>> The idea in this patch is to replace the specific kvm details from
>> core qemu files like vl.c with driver_yyy() functions. When kvm is
>> not running, those functions would just return (most of the time),
>> greatly reducing the impact of kvm code.
>>
>> As I wanted to test it, in this patch I changed the kvm functions to
>> be called driver_yyy(), but that's not my final goal. I intend to use
>> a function pointer scheme, similar to what the linux kernel already
>> does for a lot of its subsystems, to isolate the changes.
>>
>> Comments deeply welcome.
>>
>
> While I would be very annoyed if someone referred to kvm as a qemu
> accelerator, I think accelerator_yyy() is more descriptive than
> driver_yyy().

How about booster? ;-)

> I did not see any references to kqemu, but I imagine you mean this to
> abstract kqemu support as well.

Yeah, even the kvm part is not complete. As I said, just wanted to get
it going.

>
> Other than that, looks really good.
>

thanks
From: Marcelo T. <mto...@re...> - 2008-04-29 23:34:12
On Tue, Apr 29, 2008 at 06:15:58PM -0500, Anthony Liguori wrote:
> Marcelo Tosatti wrote:
> > Problem is if the IO thread _receives_ SIGIPI instead of some vcpu
> > thread.
> >
> > So there is potential harm in not blocking it.
> >
> Hrm, aren't SIG_IPIs delivered to a particular thread-id though? When
> would the IO thread receive a SIG_IPI?

Right, they are only delivered to specific threads. Still not harmful to
make it explicit.

> >>> What is the reason for this loop instead of a straight read?
> >>>
> >>> It's alright to be interrupted by a signal.
> >>>
> >> Just general habit with QEMU.
> >>
> >
> > Please don't :-)
> >
> I don't see the harm. In fact, I think it's more correct. Otherwise,
> we have to wait for another invocation of the fd callback.

Why? If you leave data in the pipe the next select() won't block.

Isn't there the possibility that this loop can be stuck for significant
amounts of time? If you're receiving lots of notifications through
signals.

> >>>> -    kvm_eat_signal(&io_signal_table, NULL, 1000);
> >>>>      pthread_mutex_lock(&qemu_mutex);
> >>>> -    cpu_single_env = NULL;
> >>>> -    main_loop_wait(0);
> >>>> +    main_loop_wait(10);
> >>>>
> >>>>
> >>> Increase that 1000 or something. Will make it easier to spot bugs.
> >>>
> >>>
> >> I have actually and it does introduce some bugs. I'm not entirely clear
> >> what is causing them though.
> >>
> >
> > Should indicate that some event previously delivered through signals and
> > received by sigtimedwait is not waking up the IO thread.
> >
> I'll take a look and see. I'm having time keeping issues in the guest
> so it's hard to tell what problems are caused by the IO thread versus time.

Time issues only with the patch? If not, please share details.
From: Avi K. <av...@qu...> - 2008-04-29 23:31:25
Anthony Liguori wrote:
> We hold qemu_mutex while machine->init() executes, which issues a VCPU create.
> We need to make sure to not return from the VCPU creation until the VCPU
> file descriptor is valid to ensure that APIC creation succeeds.
>
> However, we also need to make sure that the VCPU thread doesn't start running
> until the machine->init() is complete. This is addressed today because the
> VCPU thread tries to grab the qemu_mutex before doing anything interesting.
> If we release qemu_mutex to wait for VCPU creation, then we open a window for
> a race to occur.
>
> This patch introduces two wait conditions. The first lets the VCPU create
> code that runs in the IO thread wait for a VCPU to initialize. The second
> condition lets the VCPU thread wait for the machine to fully initialize
> before running.
>

Applied, thanks.

> An added benefit of this patch is it makes the dependencies now explicit.
>

Indeed.

--
Any sufficiently difficult bug is indistinguishable from a feature.
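
The two-condition handshake described in the changelog follows a standard
pthread condition-variable pattern; a stripped-down sketch is below. The
variable and function names are illustrative, not the actual kvm-userspace
code.

#include <pthread.h>

static pthread_mutex_t qemu_mutex    = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  vcpu_created  = PTHREAD_COND_INITIALIZER;
static pthread_cond_t  machine_ready = PTHREAD_COND_INITIALIZER;
static int vcpu_fd_valid;          /* set by the VCPU thread */
static int machine_initialized;    /* set by the IO thread after init() */

/* IO thread: called from machine->init() with qemu_mutex held; the wait
 * drops the mutex while sleeping, so the VCPU thread can make progress. */
static void wait_for_vcpu_create(void)
{
    while (!vcpu_fd_valid)
        pthread_cond_wait(&vcpu_created, &qemu_mutex);
}

/* VCPU thread: announce that the VCPU fd exists, then block until machine
 * initialization is complete before running any guest code. */
static void vcpu_thread_startup(void)
{
    pthread_mutex_lock(&qemu_mutex);
    vcpu_fd_valid = 1;
    pthread_cond_signal(&vcpu_created);
    while (!machine_initialized)
        pthread_cond_wait(&machine_ready, &qemu_mutex);
    pthread_mutex_unlock(&qemu_mutex);
}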
From: Avi K. <av...@qu...> - 2008-04-29 23:26:01
Fabian Deutsch wrote:
> Hey.
>
> I've been trying Microsoft Windows 2003 a couple of times. The wiki
> tells me that "everything" should work okay. It does, when using -smp 1,
> but gets ugly when using -smp 2 or so.
>
> So might it be useful to add the column "smp" to the "Guest Support
> Status" page in the wiki?
>

SMP Windows works best if you have FlexPriority on your hardware. What
host cpu are you using?

--
Any sufficiently difficult bug is indistinguishable from a feature.
From: Avi K. <av...@qu...> - 2008-04-29 23:22:24
Laurent Vivier wrote:
>> Why dst.val is not 0x53e10?
>>
>
> I can answer myself to this one:
>
> emulate_2op_SrcB("sal", c->src, c->dst, ctxt->eflags);
>
> does nothing if dst.bytes == 0
>
> So the next question is the good question...
>
>
>> Why dst.bytes is 0?
>>

Because dst.bytes is only set if dst.type == OP_MEM, or ad hoc in the
instruction itself. Better to set it unconditionally (and adjust in the
instruction if necessary).

--
Any sufficiently difficult bug is indistinguishable from a feature.
From: Anthony L. <ali...@us...> - 2008-04-29 23:16:10
Marcelo Tosatti wrote:
> Problem is if the IO thread _receives_ SIGIPI instead of some vcpu
> thread.
>
> So there is potential harm in not blocking it.
>

Hrm, aren't SIG_IPIs delivered to a particular thread-id though? When
would the IO thread receive a SIG_IPI?

>>> What is the reason for this loop instead of a straight read?
>>>
>>> It's alright to be interrupted by a signal.
>>>
>> Just general habit with QEMU.
>>
>
> Please don't :-)
>

I don't see the harm. In fact, I think it's more correct. Otherwise,
we have to wait for another invocation of the fd callback.

>>>> -    kvm_eat_signal(&io_signal_table, NULL, 1000);
>>>>      pthread_mutex_lock(&qemu_mutex);
>>>> -    cpu_single_env = NULL;
>>>> -    main_loop_wait(0);
>>>> +    main_loop_wait(10);
>>>>
>>>>
>>> Increase that 1000 or something. Will make it easier to spot bugs.
>>>
>>>
>> I have actually and it does introduce some bugs. I'm not entirely clear
>> what is causing them though.
>>
>
> Should indicate that some event previously delivered through signals and
> received by sigtimedwait is not waking up the IO thread.
>

I'll take a look and see. I'm having time keeping issues in the guest
so it's hard to tell what problems are caused by the IO thread versus time.

Regards,

Anthony Liguori
From: Anthony L. <ali...@us...> - 2008-04-29 23:12:53
Hollis Blanchard wrote:
> On Tuesday 29 April 2008 17:17:49 Avi Kivity wrote:
>
>> I like this very much, as it only affects accessors and not the mmu core
>> itself.
>>
>> Hollis/Xiantao/Carsten, can you confirm that this approach works for
>> you? Carsten, I believe you don't have mmio, but at least this
>> shouldn't interfere.
>>
>
> OK, so the idea is to mmap /sys/bus/pci/.../region within the guest RAM
> area, and just include it within the normal guest RAM memslot?
>

In the case of x86, since the PCI IO region is pretty far away from
normal RAM, we'll probably use a different memory slot, but yes, that's
the general idea.

IIUC PPC correctly, all IO pages have corresponding struct pages. This
means that get_user_pages() would succeed and you can reference count
them? In this case, we would never take the VM_PFNMAP path. Is that
correct?

> How will the IOMMU be programmed? Wouldn't you still need to register a
> special type of memslot for that?
>

That's independent of this patchset. For non-aware guests, we'll have to
pin all of physical memory up front and then create an IOMMU table from
the pinned physical memory. For aware guests with a PV DMA window API,
we'll be able to build that mapping on the fly (enforcing mlock
allocation limits).

Regards,

Anthony Liguori
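
A rough userspace sketch of what "mmap the BAR and expose it as its own
memory slot" could look like is below. The sysfs path, slot number and
addresses are illustrative; KVM_SET_USER_MEMORY_REGION is the existing ioctl
for registering guest memory backed by a userspace mapping.

#include <stdint.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

int map_bar_into_guest(int vm_fd, const char *res_path,
                       uint64_t guest_phys, uint64_t size)
{
    /* e.g. res_path = "/sys/bus/pci/devices/0000:00:19.0/resource0" */
    int fd = open(res_path, O_RDWR);
    if (fd < 0)
        return -1;

    void *va = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (va == MAP_FAILED)
        return -1;

    struct kvm_userspace_memory_region region = {
        .slot            = 4,                 /* a slot not used for normal RAM */
        .guest_phys_addr = guest_phys,
        .memory_size     = size,
        .userspace_addr  = (uint64_t)(unsigned long)va,
    };
    /* On x86 the mmap'd BAR typically has no struct page backing, which is
     * exactly the VM_PFNMAP case the patch series handles. */
    return ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region);
}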
From: Avi K. <av...@qu...> - 2008-04-29 23:11:55
Jan Kiszka wrote:
> This still leaves me with the question how to handle the case when the
> host sets and arms some debug registers to debug the guest and the
> latter does the same to debug itself. Guest access will be trapped, OK,
> but KVM will then have to decide which value should actually be
> transferred into the registers. Hmm, does SVM virtualize all debug
> registers, leaving the real ones to the host?
>

There's no way this can work. There are still only four debug registers,
and the guest and host together can ask for eight different addresses.

It is theoretically doable by hiding all mappings to pages that are
debug targets, but it would probably double the kvm code size.

A good short-term compromise is to abort if the guest enables a debug
address register.

A better solution might be to place host debug addresses into unused
guest debug registers, so that as long as nr_guest_debug + nr_host_debug
<= 4, we can still proceed.

--
Any sufficiently difficult bug is indistinguishable from a feature.
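
The "place host debug addresses into unused guest debug registers" idea
amounts to a tiny slot allocator over the four DRs. The sketch below is only
an illustration of that bookkeeping, not kvm code; all names are invented.

#define NR_DEBUG_REGS 4

struct dr_alloc {
    unsigned long addr[NR_DEBUG_REGS];
    int used[NR_DEBUG_REGS];       /* 0 = free, 1 = guest, 2 = host */
};

/* Returns the DR slot used for the host breakpoint, or -1 when guest and
 * host together want more than four addresses and we have to bail out. */
static int place_host_breakpoint(struct dr_alloc *dr, unsigned long addr)
{
    int i;

    for (i = 0; i < NR_DEBUG_REGS; i++) {
        if (!dr->used[i]) {
            dr->used[i] = 2;
            dr->addr[i] = addr;
            return i;
        }
    }
    return -1;   /* no free debug register left */
}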
From: Marcelo T. <mto...@re...> - 2008-04-29 23:10:01
On Tue, Apr 29, 2008 at 05:42:51PM -0500, Anthony Liguori wrote:
> Marcelo Tosatti wrote:
> > Hi Anthony,
> >
> > How is -no-kvm-irqchip working with the patch?
> >
> Seems to work fine. What is your expectation?

Just wondering if vcpu's are being properly woken up.

> > Make sure the IO thread has SIG_IPI blocked (those are for APIC vcpu
> > initialization only).
> >
> Just so I'm clear, there's really no harm in not blocking SIG_IPI
> because it would just be ignored by the IO thread (since the SIG_IPI
> handler is a nop). But yeah, we should explicitly block it.

Problem is if the IO thread _receives_ SIGIPI instead of some vcpu
thread.

So there is potential harm in not blocking it.

> > What is the reason for this loop instead of a straight read?
> >
> > It's alright to be interrupted by a signal.
> >
> Just general habit with QEMU.

Please don't :-)

> >> -    kvm_eat_signal(&io_signal_table, NULL, 1000);
> >>      pthread_mutex_lock(&qemu_mutex);
> >> -    cpu_single_env = NULL;
> >> -    main_loop_wait(0);
> >> +    main_loop_wait(10);
> >>
> >
> > Increase that 1000 or something. Will make it easier to spot bugs.
> >
> I have actually and it does introduce some bugs. I'm not entirely clear
> what is causing them though.

Should indicate that some event previously delivered through signals and
received by sigtimedwait is not waking up the IO thread.
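
Blocking SIG_IPI in the IO thread is a one-call affair with a per-thread
signal mask. In the sketch below, SIGUSR1 is only a stand-in; the real
signal number is whatever qemu-kvm defines SIG_IPI to be.

#include <pthread.h>
#include <signal.h>

#define SIG_IPI SIGUSR1            /* placeholder, see note above */

static void io_thread_block_sig_ipi(void)
{
    sigset_t set;

    sigemptyset(&set);
    sigaddset(&set, SIG_IPI);
    /* Per-thread mask: the IO thread never receives SIG_IPI, so a signal
     * meant to kick a vcpu thread cannot be swallowed here. */
    pthread_sigmask(SIG_BLOCK, &set, NULL);
}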
From: Hollis B. <ho...@us...> - 2008-04-29 22:57:27
On Tuesday 29 April 2008 17:17:49 Avi Kivity wrote:
> Anthony Liguori wrote:
> > This patch allows VMAs that contain no backing page to be used for guest
> > memory. This is a drop-in replacement for Ben-Ami's first patch in his
> > direct mmio series. Here, we continue to allow mmio pages to be
> > represented in the rmap.
> >
>
> I like this very much, as it only affects accessors and not the mmu core
> itself.
>
> Hollis/Xiantao/Carsten, can you confirm that this approach works for
> you? Carsten, I believe you don't have mmio, but at least this
> shouldn't interfere.

OK, so the idea is to mmap /sys/bus/pci/.../region within the guest RAM
area, and just include it within the normal guest RAM memslot?

How will the IOMMU be programmed? Wouldn't you still need to register a
special type of memslot for that?

--
Hollis Blanchard
IBM Linux Technology Center
From: Avi K. <av...@qu...> - 2008-04-29 22:54:00
Anthony Liguori wrote:
> Avi Kivity wrote:
>> It depends on what's going on? Does a page table point to mmio? Or
>> the glommerclock?
>>
>> Not sure there is a single answer.
>>
>>> Perhaps we should be replacing consumers of gfn_to_page() with
>>> copy_to_user() instead?
>>
>> Indeed we should. The problem is access in atomic contexts. It's
>> easy to detect failure, but not always easy to handle it.
>
> So I think we should replace it with a rate limited printk and
> returning bad_page. That way the guest can't exploit it and we'll
> still hopefully get printk()s to track down instances of things going
> bad.
>

Agreed. Add a stacktrace so we can see what causes the badness.

--
Any sufficiently difficult bug is indistinguishable from a feature.
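
Putting the two suggestions together, the error path being discussed might
look roughly like this. It is a sketch of the proposal, not the committed
fix; bad_page refers to kvm's existing preallocated error page.

struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn)
{
	pfn_t pfn = gfn_to_pfn(kvm, gfn);

	if (!pfn_valid(pfn)) {
		/* Rate-limited complaint plus a stack trace, so the caller
		 * that still needs converting to copy_to_user() shows up in
		 * the log without letting the guest flood the host. */
		if (printk_ratelimit()) {
			printk(KERN_WARNING "gfn_to_page: no struct page for gfn %llx\n",
			       (unsigned long long)gfn);
			dump_stack();
		}
		return bad_page;
	}

	return pfn_to_page(pfn);
}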
From: Anthony L. <ali...@us...> - 2008-04-29 22:51:48
Avi Kivity wrote:
> It depends on what's going on? Does a page table point to mmio? Or
> the glommerclock?
>
> Not sure there is a single answer.
>
>> Perhaps we should be replacing consumers of gfn_to_page() with
>> copy_to_user() instead?
>
> Indeed we should. The problem is access in atomic contexts. It's
> easy to detect failure, but not always easy to handle it.

So I think we should replace it with a rate limited printk and returning
bad_page. That way the guest can't exploit it and we'll still hopefully
get printk()s to track down instances of things going bad.

Regards,

Anthony Liguori
From: Avi K. <av...@qu...> - 2008-04-29 22:50:09
Amit Shah wrote:
>
>>> +	if (is_error_page(host_page)) {
>>> +		printk(KERN_INFO "%s: gfn %p not valid\n",
>>> +			__func__, (void *)page_gfn);
>>> +		r = -1;
>>>
>> r = -1 is not really informative. Better use some meaningful error.
>>
>
> The error's going to the guest. The guest, as we know, has already done a
> successful DMA allocation. Something went wrong in the hypercall, and we
> don't know why (bad page). Any kind of error here isn't going to be
> intelligible to the guest anyway. It's mostly a host thing if we ever hit
> this.
>

If the guest is not able to handle it, why bother returning an error?
Better to kill it. But in any case, -1 is not a good error number.

>>> +	if (find_pci_pt_dev(&vcpu->kvm->arch.pci_pt_dev_head,
>>> +			&pci_pt_info, 0, KVM_PT_SOURCE_ASSIGN))
>>> +		r++; /* We have assigned the device */
>>> +
>>> +	kunmap(host_page);
>>>
>> better use atomic mappings here.
>>
>
> We can't use atomic mappings for guest pages. They can be swapped out.
>

kmap()ed pages can't be swapped out either. The atomic in kmap_atomic()
only refers to the context in which the pages are used. Atomic kmaps are
greatly preferable to the nonatomic ones.

--
Any sufficiently difficult bug is indistinguishable from a feature.
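
For illustration, the atomic-mapping pattern being asked for looks like the
fragment below (kernel code; the KM_USER0 slot argument matches the
kmap_atomic() API of that era). Nothing between kmap_atomic() and
kunmap_atomic() may sleep; the helper name is invented.

static void copy_from_guest_page(struct page *host_page,
				 void *dst, unsigned int len)
{
	void *vaddr = kmap_atomic(host_page, KM_USER0);

	memcpy(dst, vaddr, len);
	kunmap_atomic(vaddr, KM_USER0);
}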
From: Anthony L. <ali...@us...> - 2008-04-29 22:44:28
Marcelo Tosatti wrote:
> Hi Anthony,
>
> How is -no-kvm-irqchip working with the patch?
>

Seems to work fine. What is your expectation?

> On Tue, Apr 29, 2008 at 09:28:14AM -0500, Anthony Liguori wrote:
>
>> This patch eliminates the use of sigtimedwait() in the IO thread. To avoid the
>> signal/select race condition, we use a pipe that we write to in the signal
>> handlers. This was suggested by Rusty and seems to work well.
>>
>> +static int kvm_eat_signal(CPUState *env, int timeout)
>>  {
>>      struct timespec ts;
>>      int r, e, ret = 0;
>>      siginfo_t siginfo;
>> +    sigset_t waitset;
>>
>> +    sigemptyset(&waitset);
>> +    sigaddset(&waitset, SIG_IPI);
>>      ts.tv_sec = timeout / 1000;
>>      ts.tv_nsec = (timeout % 1000) * 1000000;
>> -    r = sigtimedwait(&waitset->sigset, &siginfo, &ts);
>> +    qemu_kvm_unlock();
>> +    r = sigtimedwait(&waitset, &siginfo, &ts);
>> +    qemu_kvm_lock(env);
>> +    cpu_single_env = env;
>>
>
> This assignment seems redundant now.
>

Yeah, I have a bigger patch which eliminates all of the explicit
assignments to cpu_single_env.

>> @@ -263,12 +238,8 @@ static void pause_all_threads(void)
>>      vcpu_info[i].stop = 1;
>>      pthread_kill(vcpu_info[i].thread, SIG_IPI);
>>
>
> Make sure the IO thread has SIG_IPI blocked (those are for APIC vcpu
> initialization only).
>

Just so I'm clear, there's really no harm in not blocking SIG_IPI
because it would just be ignored by the IO thread (since the SIG_IPI
handler is a nop). But yeah, we should explicitly block it.

>> +static void sig_aio_fd_read(void *opaque)
>> +{
>> +    int signum;
>> +    ssize_t len;
>> +
>> +    do {
>> +        len = read(kvm_sigfd[0], &signum, sizeof(signum));
>> +    } while (len == -1 && errno == EINTR);
>>
>
> What is the reason for this loop instead of a straight read?
>
> It's alright to be interrupted by a signal.
>

Just general habit with QEMU.

>> +    signal(SIGUSR1, sig_aio_handler);
>> +    signal(SIGUSR2, sig_aio_handler);
>> +    signal(SIGALRM, sig_aio_handler);
>> +    signal(SIGIO, sig_aio_handler);
>> +
>> +    if (pipe(kvm_sigfd) == -1)
>> +        abort();
>>
>
> perror() would be nice.
>

Yeah, everything needs proper error handling.

>> -    kvm_eat_signal(&io_signal_table, NULL, 1000);
>>      pthread_mutex_lock(&qemu_mutex);
>> -    cpu_single_env = NULL;
>> -    main_loop_wait(0);
>> +    main_loop_wait(10);
>>
>
> Increase that 1000 or something. Will make it easier to spot bugs.
>

I have actually and it does introduce some bugs. I'm not entirely clear
what is causing them though.

Regards,

Anthony Liguori

> Similarly in qemu_kvm_aio_wait().
>
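
The technique the patch relies on is the classic self-pipe trick: the signal
handler only writes a byte to a pipe, and the main loop watches the pipe's
read end with select(), closing the signal/select race. A minimal
stand-alone illustration (not the QEMU code itself):

#include <signal.h>
#include <unistd.h>
#include <sys/select.h>
#include <errno.h>

static int sigfd[2];                       /* [1] written from handlers */

static void sig_handler(int signum)
{
    write(sigfd[1], &signum, sizeof(signum));   /* async-signal-safe */
}

static void main_loop(void)
{
    for (;;) {
        fd_set rfds;
        FD_ZERO(&rfds);
        FD_SET(sigfd[0], &rfds);

        if (select(sigfd[0] + 1, &rfds, NULL, NULL, NULL) < 0 &&
            errno == EINTR)
            continue;

        if (FD_ISSET(sigfd[0], &rfds)) {
            int signum;
            ssize_t len;
            /* Drain one notification; even if later writes are dropped
             * because the pipe is full, one queued byte is enough to wake
             * the loop and have pending work processed. */
            do {
                len = read(sigfd[0], &signum, sizeof(signum));
            } while (len < 0 && errno == EINTR);
            /* ... dispatch work for signum ... */
        }
    }
}

int main(void)
{
    pipe(sigfd);
    signal(SIGIO, sig_handler);
    signal(SIGALRM, sig_handler);
    main_loop();
    return 0;
}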
From: Avi K. <av...@qu...> - 2008-04-29 22:43:51
Anthony Liguori wrote:
>>
>>>  struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn)
>>>  {
>>> -	return pfn_to_page(gfn_to_pfn(kvm, gfn));
>>> +	pfn_t pfn;
>>> +
>>> +	pfn = gfn_to_pfn(kvm, gfn);
>>> +	if (pfn_valid(pfn))
>>> +		return pfn_to_page(pfn);
>>> +
>>> +	return NULL;
>>>  }
>>>
>>
>> You're returning NULL here, not bad_page.
>>
>
> My thinking was that bad_page indicates that the gfn is invalid. This
> is a different type of error though. The problem is that we are trying
> to kmap() a page that has no struct page associated with it. I'm not
> sure what the right thing to do here is.
>

It depends on what's going on? Does a page table point to mmio? Or the
glommerclock?

Not sure there is a single answer.

> Perhaps we should be replacing consumers of gfn_to_page() with
> copy_to_user() instead?

Indeed we should. The problem is access in atomic contexts. It's easy to
detect failure, but not always easy to handle it.

--
Any sufficiently difficult bug is indistinguishable from a feature.