From: Laurent V. <Lau...@bu...> - 2008-04-23 14:59:23
On Wednesday 23 April 2008 at 17:31 +0300, Avi Kivity wrote:
> Laurent Vivier wrote:
> > This patch is the kernel part of the "batch writes to MMIO" patch.
> >
> > When the kernel has to send MMIO writes to userspace, it stores them
> > in memory until it has to hand control back to userspace for another
> > reason. This avoids too many context switches for operations that
> > can wait.
> >
> > WARNING: this breaks compatibility with the old userspace part.
> >
> > Signed-off-by: Laurent Vivier <Lau...@bu...>
> > ---
> >  arch/x86/kvm/x86.c         |   21 +++++++++++++++++++++
> >  include/asm-x86/kvm_host.h |    2 ++
> >  include/linux/kvm.h        |   10 +++++++++-
> >  virt/kvm/kvm_main.c        |    3 +++
> >  4 files changed, 35 insertions(+), 1 deletions(-)
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 0ce5563..3881056 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -2942,8 +2942,21 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
> >  		kvm_x86_ops->decache_regs(vcpu);
> >  	}
> >
> > +batch:
> >  	r = __vcpu_run(vcpu, kvm_run);
> >
> > +	if (!r && vcpu->mmio_is_write &&
> > +	    kvm_run->exit_reason == KVM_EXIT_MMIO &&
> > +	    kvm_run->batch_count < KVM_MAX_BATCH) {
> > +		struct kvm_batch *batch = vcpu->arch.batch_data;
> > +		int i = kvm_run->batch_count++;
> > +
> > +		batch[i].phys_addr = vcpu->mmio_phys_addr;
> > +		batch[i].len = vcpu->mmio_size;
> > +		memcpy(batch[i].data, vcpu->mmio_data, batch[i].len);
>
> This breaks ordering on SMP guests. batch_data needs to be a kvm thing,
> not a vcpu thing, and locked, of course.

- Is ordering between vcpus important when we already delay the operations?
- Keeping it per-vcpu avoids the lock.
- Why is PIO (pio_data) a vcpu thing and not a kvm thing, then?

> Also, you don't want to queue writes which trigger I/O since that will
> affect latency. Userspace should tell the kernel which mmio addresses
> are queueable.

I agree (but in my first patch it was easier to ignore this).

> > +
> > +		goto batch;
>
> If you move this to within __vcpu_run, you won't need to loop. Maybe
> the best place is where we actually decide it's an mmio write.

I agree.

> You also need to flush the queue each time you have an in-kernel mmio
> write. For example:
>
>    vcpu0                   vcpu1
>
>    mmio write (queued)
>                            apic write -> IPI
>                            doesn't see effects of write

I agree (a rough sketch of this follows further down).

> > +	}
> >  out:
> >  	if (vcpu->sigset_active)
> >  		sigprocmask(SIG_SETMASK, &sigsaved, NULL);
> > @@ -3830,6 +3843,13 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
> >  	}
> >  	vcpu->arch.pio_data = page_address(page);
> >
> > +	page = alloc_page(GFP_KERNEL | __GFP_ZERO);
> > +	if (!page) {
> > +		r = -ENOMEM;
> > +		goto fail;
> > +	}
> > +	vcpu->arch.batch_data = page_address(page);
> > +
> >  	r = kvm_mmu_create(vcpu);
> >  	if (r < 0)
> >  		goto fail_free_pio_data;
> > @@ -3857,6 +3877,7 @@ void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu)
> >  	kvm_mmu_destroy(vcpu);
> >  	up_read(&vcpu->kvm->slots_lock);
> >  	free_page((unsigned long)vcpu->arch.pio_data);
> > +	free_page((unsigned long)vcpu->arch.batch_data);
> >  }
> >
> >  #define CR3_PAE_RESERVED_BITS ((X86_CR3_PWT | X86_CR3_PCD) - 1)
> >  #define CR3_NONPAE_RESERVED_BITS ((PAGE_SIZE-1) & ~(X86_CR3_PWT | X86_CR3_PCD))
> > @@ -255,6 +256,7 @@ struct kvm_vcpu_arch {
> >  	gva_t mmio_fault_cr2;
> >  	struct kvm_pio_request pio;
> >  	void *pio_data;
> > +	void *batch_data;
>
> It's an array of structs, no? So it shouldn't be a void *.

Yes (I love cut&paste...).
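To make the queueing/flushing rule concrete, here is a small standalone
userspace model of it. This is only a sketch, not KVM code: every name in
it (mmio_batch, batch_flush, mmio_write, MAX_BATCH) is invented for
illustration, the "device model" is a printf, and it only assumes what was
said above: one queue shared by all vcpus, protected by a lock, and drained
before any write that has to take effect immediately.

/*
 * Hedged sketch, NOT KVM code: a simplified model of the batching rule
 * discussed above.  Queueable MMIO writes go into a single queue shared by
 * all vcpus and protected by a lock; any write handled immediately is
 * ordered after everything already queued by flushing first.
 */
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_BATCH 128

struct batch_entry {
	uint64_t phys_addr;
	uint32_t len;
	uint8_t  data[8];
};

struct mmio_batch {
	pthread_mutex_t lock;		/* shared by all vcpus */
	unsigned int count;
	struct batch_entry e[MAX_BATCH];
};

/* Stand-in for "deliver the queued writes to the device model". */
static void batch_flush(struct mmio_batch *b)
{
	for (unsigned int i = 0; i < b->count; i++)
		printf("flush: %u bytes @ 0x%llx\n", b->e[i].len,
		       (unsigned long long)b->e[i].phys_addr);
	b->count = 0;
}

static void mmio_write(struct mmio_batch *b, uint64_t addr,
		       const void *data, uint32_t len, int queueable)
{
	pthread_mutex_lock(&b->lock);
	if (queueable && len <= 8) {
		if (b->count == MAX_BATCH)	/* queue full: drain it first */
			batch_flush(b);
		struct batch_entry *e = &b->e[b->count++];
		e->phys_addr = addr;
		e->len = len;
		memcpy(e->data, data, len);
	} else {
		/* A write handled right away must not overtake queued ones. */
		batch_flush(b);
		printf("direct: %u bytes @ 0x%llx\n", len,
		       (unsigned long long)addr);
	}
	pthread_mutex_unlock(&b->lock);
}

int main(void)
{
	struct mmio_batch b = { .lock = PTHREAD_MUTEX_INITIALIZER };
	uint32_t val = 0x12345678;

	mmio_write(&b, 0xf0000000ULL, &val, sizeof(val), 1); /* queued      */
	mmio_write(&b, 0xfee00300ULL, &val, sizeof(val), 0); /* flush first */
	batch_flush(&b);	/* final flush when returning to userspace */
	return 0;
}

The unconditional batch_flush() in the non-queueable path is exactly the
IPI example quoted above: a write that takes effect immediately must not be
observed before writes that were queued earlier.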
> > +#define KVM_MAX_BATCH (PAGE_SIZE / sizeof(struct kvm_batch))
> > +struct kvm_batch {
> > +	__u64 phys_addr;
> > +	__u32 len;
> > +	__u8 data[8];
> > +};
>
> Size is 24 on 64-bit and 20 on 32-bit. Please pad (after len, so data
> is nicely aligned).

OK (a quick layout check is appended at the end of this mail).

Thank you,
Laurent
-- 
------------- Lau...@bu... ---------------
"The best way to predict the future is to invent it."
- Alan Kay
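Appended layout check for the sizing comment above. This is not part of the
patch: the kernel __u64/__u32/__u8 types are replaced by their <stdint.h>
equivalents, and the "pad" field name is only a suggestion. It shows why an
explicit pad after len gives the same 24-byte layout on 32-bit x86 and on
64-bit, with data 8-byte aligned.

#include <stdio.h>
#include <stddef.h>
#include <stdint.h>

/* As posted: 24 bytes on 64-bit, 20 on 32-bit x86 (where u64 is only
 * 4-byte aligned), with data at offset 12. */
struct kvm_batch_unpadded {
	uint64_t phys_addr;
	uint32_t len;
	uint8_t  data[8];
};

/* With an explicit pad after len: 24 bytes everywhere, data at offset 16. */
struct kvm_batch_padded {
	uint64_t phys_addr;
	uint32_t len;
	uint32_t pad;
	uint8_t  data[8];
};

int main(void)
{
	printf("unpadded: size %zu, data at offset %zu\n",
	       sizeof(struct kvm_batch_unpadded),
	       offsetof(struct kvm_batch_unpadded, data));
	printf("padded:   size %zu, data at offset %zu\n",
	       sizeof(struct kvm_batch_padded),
	       offsetof(struct kvm_batch_padded, data));
	return 0;
}

Since struct kvm_batch lives in include/linux/kvm.h and is shared with
userspace, having an identical layout for 32-bit and 64-bit userspace is
what matters here.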