From: Hollis B. <ho...@us...> - 2008-01-09 23:07:49
|
Add an "is_bigendian" flag to the kvm_run.mmio structure. This is needed for architectures that can make both little- and big-endian memory accesses. Signed-off-by: Hollis Blanchard <ho...@us...> --- PowerPC has different instructions for native and byte-reversed memory accesses, and some implementations can also can map individual pages as byte-reversed. Right now in the PowerPC KVM implementation the kernel detects byte-reversed MMIO from the guest and converts the data as appropriate so that userland only ever deals with big-endian data. That's fine and all, but I started thinking about supporting MMIO passthrough, in which userland wouldn't emulate an MMIO at all, but rather execute it on the real hardware (via mmap /dev/mem, for example). In that case, it's actually very important that the endianness of the access be preserved, since we need that information to access the real hardware. I don't think this patch has any serious x86 ABI implications, since current x86 code just ignores the flag. I guess x86 could continue to ignore it in the future, or it could explicitly zero the new flag. Comments? diff --git a/include/linux/kvm.h b/include/linux/kvm.h --- a/include/linux/kvm.h +++ b/include/linux/kvm.h @@ -123,6 +123,7 @@ struct kvm_run { __u8 data[8]; __u32 len; __u8 is_write; + __u8 is_bigendian; } mmio; /* KVM_EXIT_HYPERCALL */ struct { -- Hollis Blanchard IBM Linux Technology Center |
From: Avi K. <av...@qu...> - 2008-01-10 06:56:21
|
Hollis Blanchard wrote: > Add an "is_bigendian" flag to the kvm_run.mmio structure. > > This is needed for architectures that can make both little- and > big-endian memory accesses. > > Signed-off-by: Hollis Blanchard <ho...@us...> > --- > > PowerPC has different instructions for native and byte-reversed memory > accesses, and some implementations can also can map individual pages as > byte-reversed. Right now in the PowerPC KVM implementation the kernel > detects byte-reversed MMIO from the guest and converts the data as > appropriate so that userland only ever deals with big-endian data. > > That's fine and all, but I started thinking about supporting MMIO > passthrough, in which userland wouldn't emulate an MMIO at all, but > rather execute it on the real hardware (via mmap /dev/mem, for example). > > In that case, it's actually very important that the endianness of the > access be preserved, since we need that information to access the real > hardware. > > I don't think this patch has any serious x86 ABI implications, since > current x86 code just ignores the flag. I guess x86 could continue to > ignore it in the future, or it could explicitly zero the new flag. > Ignoring the field is better since older kernels won't zero it. IIRC endianness is a per-page attribute on ppc, no? Otherwise you'd have a global attribute instead of per-access. -- error compiling committee.c: too many arguments to function |
From: Hollis B. <ho...@us...> - 2008-01-10 15:25:26
|
On Thu, 2008-01-10 at 08:56 +0200, Avi Kivity wrote: > > IIRC endianness is a per-page attribute on ppc, no? Otherwise you'd > have a global attribute instead of per-access. The MMU in some PowerPC can have per-page endianness, but not all. On a processor that supports this attribute, I expect that when an MMIO trap occurs we'll need to inspect the guest MMU state in order to set the is_bigendian flag correctly. The real issue I'm looking at right now is byte-reversed loads and stores. For example, "lwzx" (Load Word and Zero Indexed) does a big-endian 4-byte load, while "lwbrx" (Load Word Byte-Reverse Indexed) does a little-endian 4-byte load. These instructions exist on all PowerPC, and they can be issued at any time and do not depend on MMU mappings. -- Hollis Blanchard IBM Linux Technology Center |
From: Jimi X. <ji...@po...> - 2008-01-10 15:37:20
|
On Jan 10, 2008, at 10:23 AM, Hollis Blanchard wrote: > On Thu, 2008-01-10 at 08:56 +0200, Avi Kivity wrote: >> >> IIRC endianness is a per-page attribute on ppc, no? Otherwise you'd >> have a global attribute instead of per-access. > > The MMU in some PowerPC can have per-page endianness, but not all. > On a > processor that supports this attribute, I expect that when an MMIO > trap > occurs we'll need to inspect the guest MMU state in order to set the > is_bigendian flag correctly. Hey.. don't forget MSR[LE]. -JX > > The real issue I'm looking at right now is byte-reversed loads and > stores. For example, "lwzx" (Load Word and Zero Indexed) does a > big-endian 4-byte load, while "lwbrx" (Load Word Byte-Reverse Indexed) > does a little-endian 4-byte load. These instructions exist on all > PowerPC, and they can be issued at any time and do not depend on MMU > mappings. > > -- > Hollis Blanchard > IBM Linux Technology Center > > > ---------------------------------------------------------------------- > --- > Check out the new SourceForge.net Marketplace. > It's the best place to buy or sell services for > just about anything Open Source. > http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/ > marketplace > _______________________________________________ > kvm-ppc-devel mailing list > kvm...@li... > https://lists.sourceforge.net/lists/listinfo/kvm-ppc-devel |
From: Hollis B. <ho...@us...> - 2008-01-10 22:58:26
|
On Thu, 2008-01-10 at 17:28 +0200, Avi Kivity wrote: > I'll apply that patch (with a #ifdef CONFIG_PPC so other archs don't > use it by mistake). I don't think that's the right ifdef. For example, I believe IA64 can run in BE mode and so will have the same issue, and there are certainly other architectures (less relevant to the current code) that definitely are in the same situation. We need to plumb this through to the libkvm users anyways. Take a look at the patch below and tell me if you think it's not the right approach. x86 simply won't consider 'is_bigendian'. I spent a lot of time on this, and it's by far the cleanest solution I could come up with. diff --git a/libkvm/config-i386.mak b/libkvm/config-i386.mak --- a/libkvm/config-i386.mak +++ b/libkvm/config-i386.mak @@ -2,5 +2,6 @@ LIBDIR := /lib LIBDIR := /lib CFLAGS += -m32 CFLAGS += -D__i386__ +CFLAGS += -DARCH_MMIO_ENDIAN_LITTLE libkvm-$(ARCH)-objs := libkvm-x86.o diff --git a/libkvm/config-ia64.mak b/libkvm/config-ia64.mak --- a/libkvm/config-ia64.mak +++ b/libkvm/config-ia64.mak @@ -1,5 +1,6 @@ LIBDIR := /lib CFLAGS += -D__ia64__ +CFLAGS += -DARCH_MMIO_ENDIAN_LITTLE -DARCH_MMIO_ENDIAN_BIG libkvm-$(ARCH)-objs := libkvm-ia64.o diff --git a/libkvm/config-powerpc.mak b/libkvm/config-powerpc.mak --- a/libkvm/config-powerpc.mak +++ b/libkvm/config-powerpc.mak @@ -2,5 +2,6 @@ LIBDIR := /lib LIBDIR := /lib CFLAGS += -m32 CFLAGS += -D__powerpc__ +CFLAGS += -DARCH_MMIO_ENDIAN_BIG -DARCH_MMIO_ENDIAN_LITTLE libkvm-$(ARCH)-objs := libkvm-powerpc.o diff --git a/libkvm/config-x86_64.mak b/libkvm/config-x86_64.mak --- a/libkvm/config-x86_64.mak +++ b/libkvm/config-x86_64.mak @@ -2,5 +2,6 @@ LIBDIR := /lib64 LIBDIR := /lib64 CFLAGS += -m64 CFLAGS += -D__x86_64__ +CFLAGS += -DARCH_MMIO_ENDIAN_LITTLE libkvm-$(ARCH)-objs := libkvm-x86.o diff --git a/libkvm/libkvm.c b/libkvm/libkvm.c --- a/libkvm/libkvm.c +++ b/libkvm/libkvm.c @@ -774,21 +774,56 @@ int kvm_set_sregs(kvm_context_t kvm, int return ioctl(kvm->vcpu_fd[vcpu], KVM_SET_SREGS, sregs); } +#ifdef ARCH_MMIO_ENDIAN_BIG +static int handle_mmio_bigendian(kvm_context_t kvm, struct kvm_run *kvm_run) +{ + if (kvm_run->mmio.is_write) + return kvm->callbacks->mmio_write_be(kvm->opaque, + kvm_run->mmio.phys_addr, + kvm_run->mmio.data, + kvm_run->mmio.len); + else + return kvm->callbacks->mmio_read_be(kvm->opaque, + kvm_run->mmio.phys_addr, + kvm_run->mmio.data, + kvm_run->mmio.len); +} +#endif + +#ifdef ARCH_MMIO_ENDIAN_LITTLE +static int handle_mmio_littleendian(kvm_context_t kvm, struct kvm_run *kvm_run) +{ + if (kvm_run->mmio.is_write) + return kvm->callbacks->mmio_write_le(kvm->opaque, + kvm_run->mmio.phys_addr, + kvm_run->mmio.data, + kvm_run->mmio.len); + else + return kvm->callbacks->mmio_read_le(kvm->opaque, + kvm_run->mmio.phys_addr, + kvm_run->mmio.data, + kvm_run->mmio.len); +} +#endif + static int handle_mmio(kvm_context_t kvm, struct kvm_run *kvm_run) { unsigned long addr = kvm_run->mmio.phys_addr; - void *data = kvm_run->mmio.data; /* hack: Red Hat 7.1 generates these weird accesses. */ if ((addr > 0xa0000-4 && addr <= 0xa0000) && kvm_run->mmio.len == 3) return 0; - if (kvm_run->mmio.is_write) - return kvm->callbacks->mmio_write(kvm->opaque, addr, data, - kvm_run->mmio.len); +#if defined(ARCH_MMIO_ENDIAN_BIG) && defined(ARCH_MMIO_ENDIAN_LITTLE) + if (kvm_run->mmio.is_bigendian) + return handle_mmio_bigendian(kvm, kvm_run); else - return kvm->callbacks->mmio_read(kvm->opaque, addr, data, - kvm_run->mmio.len); + return handle_mmio_littleendian(kvm, kvm_run); +#elif ARCH_MMIO_ENDIAN_LITTLE + return handle_mmio_littleendian(kvm, kvm_run); +#elif ARCH_MMIO_ENDIAN_BIG + return handle_mmio_bigendian(kvm, kvm_run); +#endif } int handle_io_window(kvm_context_t kvm) diff --git a/libkvm/libkvm.h b/libkvm/libkvm.h --- a/libkvm/libkvm.h +++ b/libkvm/libkvm.h @@ -46,11 +46,11 @@ struct kvm_callbacks { /// For 32bit IO writes from the guest (Usually when executing 'outl') int (*outl)(void *opaque, uint16_t addr, uint32_t data); /// generic memory reads to unmapped memory (For MMIO devices) - int (*mmio_read)(void *opaque, uint64_t addr, uint8_t *data, - int len); + int (*mmio_read_be)(void *opaque, uint64_t addr, uint8_t *data, int len); + int (*mmio_read_le)(void *opaque, uint64_t addr, uint8_t *data, int len); /// generic memory writes to unmapped memory (For MMIO devices) - int (*mmio_write)(void *opaque, uint64_t addr, uint8_t *data, - int len); + int (*mmio_write_be)(void *opaque, uint64_t addr, uint8_t *data, int len); + int (*mmio_write_le)(void *opaque, uint64_t addr, uint8_t *data, int len); int (*debug)(void *opaque, int vcpu); /*! * \brief Called when the VCPU issues an 'hlt' instruction. diff --git a/qemu/qemu-kvm.c b/qemu/qemu-kvm.c --- a/qemu/qemu-kvm.c +++ b/qemu/qemu-kvm.c @@ -21,6 +21,7 @@ int kvm_irqchip = 1; #include <libkvm.h> #include <pthread.h> #include <sys/utsname.h> +#include <endian.h> extern void perror(const char *s); @@ -533,8 +534,13 @@ static struct kvm_callbacks qemu_kvm_ops .outb = kvm_outb, .outw = kvm_outw, .outl = kvm_outl, - .mmio_read = kvm_mmio_read, - .mmio_write = kvm_mmio_write, +#if __BYTE_ORDER == __LITTLE_ENDIAN + .mmio_read_le = kvm_mmio_read, + .mmio_write_le = kvm_mmio_write, +#else + .mmio_read_be = kvm_mmio_read, + .mmio_write_be = kvm_mmio_write, +#endif .halt = kvm_halt, .shutdown = kvm_shutdown, .io_window = kvm_io_window, diff --git a/user/main-ppc.c b/user/main-ppc.c --- a/user/main-ppc.c +++ b/user/main-ppc.c @@ -92,14 +92,14 @@ static int test_pre_kvm_run(void *opaque return 0; } -static int test_mem_read(void *opaque, uint64_t addr, uint8_t *data, int len) +static int test_mem_read_be(void *opaque, uint64_t addr, uint8_t *data, int len) { printf("%s: addr %"PRIx64" len %d\n", __func__, addr, len); memset(data, 0, len); return 0; } -static int test_mem_write(void *opaque, uint64_t addr, uint8_t *data, int len) +static int test_mem_write_be(void *opaque, uint64_t addr, uint8_t *data, int len) { printf("%s: addr %"PRIx64" len %d data %"PRIx64"\n", __func__, addr, len, *(uint64_t *)data); @@ -120,8 +120,8 @@ static int test_dcr_write(kvm_context_t } static struct kvm_callbacks test_callbacks = { - .mmio_read = test_mem_read, - .mmio_write = test_mem_write, + .mmio_read_be = test_mem_read_be, + .mmio_write_be = test_mem_write_be, .debug = test_debug, .halt = test_halt, .io_window = test_io_window, diff --git a/user/main.c b/user/main.c --- a/user/main.c +++ b/user/main.c @@ -389,8 +389,8 @@ static struct kvm_callbacks test_callbac .outb = test_outb, .outw = test_outw, .outl = test_outl, - .mmio_read = test_mem_read, - .mmio_write = test_mem_write, + .mmio_read_le = test_mem_read, + .mmio_write_le = test_mem_write, .debug = test_debug, .halt = test_halt, .io_window = test_io_window, -- Hollis Blanchard IBM Linux Technology Center |
From: Xu, A. <ant...@in...> - 2008-01-11 02:04:55
|
Hi all, That's a good start to consider BE. Yes, IA64 support BE and LE. I have below comments. What does is_bigendian mean? Host is runing with BE or guest is running with BE. Who will set is_bigendian? For supporing BE, We need to consider host BE and guest BE. For IA64, most OS is running with LE, and Application can run with BE or LE, For example, Qemu can run with BE or LE. IMHO, we need two flags,=20 host_is_bigendian indicates Qemu is running with BE Guest_is_bigendian indecates the guest application who is accessing MMIO Is running with LE. - Anthony Hollis Blanchard wrote: > On Thu, 2008-01-10 at 17:28 +0200, Avi Kivity wrote: >> I'll apply that patch (with a #ifdef CONFIG_PPC so other archs don't >> use it by mistake). >=20 > I don't think that's the right ifdef. For example, I believe IA64 can > run in BE mode and so will have the same issue, and there are > certainly=20 > other architectures (less relevant to the current code) that > definitely=20 > are in the same situation. >=20 > We need to plumb this through to the libkvm users anyways. Take a look > at the patch below and tell me if you think it's not the right > approach.=20 > x86 simply won't consider 'is_bigendian'. I spent a lot of time on > this,=20 > and it's by far the cleanest solution I could come up with. >=20 >=20 >=20 > diff --git a/libkvm/config-i386.mak b/libkvm/config-i386.mak > --- a/libkvm/config-i386.mak > +++ b/libkvm/config-i386.mak > @@ -2,5 +2,6 @@ LIBDIR :=3D /lib > LIBDIR :=3D /lib > CFLAGS +=3D -m32 > CFLAGS +=3D -D__i386__ > +CFLAGS +=3D -DARCH_MMIO_ENDIAN_LITTLE >=20 > libkvm-$(ARCH)-objs :=3D libkvm-x86.o > diff --git a/libkvm/config-ia64.mak b/libkvm/config-ia64.mak > --- a/libkvm/config-ia64.mak > +++ b/libkvm/config-ia64.mak > @@ -1,5 +1,6 @@ >=20 > LIBDIR :=3D /lib > CFLAGS +=3D -D__ia64__ > +CFLAGS +=3D -DARCH_MMIO_ENDIAN_LITTLE -DARCH_MMIO_ENDIAN_BIG >=20 > libkvm-$(ARCH)-objs :=3D libkvm-ia64.o > diff --git a/libkvm/config-powerpc.mak b/libkvm/config-powerpc.mak > --- a/libkvm/config-powerpc.mak > +++ b/libkvm/config-powerpc.mak > @@ -2,5 +2,6 @@ LIBDIR :=3D /lib > LIBDIR :=3D /lib > CFLAGS +=3D -m32 > CFLAGS +=3D -D__powerpc__ > +CFLAGS +=3D -DARCH_MMIO_ENDIAN_BIG -DARCH_MMIO_ENDIAN_LITTLE >=20 > libkvm-$(ARCH)-objs :=3D libkvm-powerpc.o > diff --git a/libkvm/config-x86_64.mak b/libkvm/config-x86_64.mak > --- a/libkvm/config-x86_64.mak > +++ b/libkvm/config-x86_64.mak > @@ -2,5 +2,6 @@ LIBDIR :=3D /lib64 > LIBDIR :=3D /lib64 > CFLAGS +=3D -m64 > CFLAGS +=3D -D__x86_64__ > +CFLAGS +=3D -DARCH_MMIO_ENDIAN_LITTLE >=20 > libkvm-$(ARCH)-objs :=3D libkvm-x86.o > diff --git a/libkvm/libkvm.c b/libkvm/libkvm.c > --- a/libkvm/libkvm.c > +++ b/libkvm/libkvm.c > @@ -774,21 +774,56 @@ int kvm_set_sregs(kvm_context_t kvm, int > return ioctl(kvm->vcpu_fd[vcpu], KVM_SET_SREGS, sregs); > } >=20 > +#ifdef ARCH_MMIO_ENDIAN_BIG > +static int handle_mmio_bigendian(kvm_context_t kvm, struct kvm_run > *kvm_run) +{ > + if (kvm_run->mmio.is_write) > + return kvm->callbacks->mmio_write_be(kvm->opaque, > + kvm_run->mmio.phys_addr, > + kvm_run->mmio.data, > + kvm_run->mmio.len); > + else > + return kvm->callbacks->mmio_read_be(kvm->opaque, > + kvm_run->mmio.phys_addr, > + kvm_run->mmio.data, > + kvm_run->mmio.len); > +} > +#endif > + > +#ifdef ARCH_MMIO_ENDIAN_LITTLE > +static int handle_mmio_littleendian(kvm_context_t kvm, struct > kvm_run *kvm_run) +{ > + if (kvm_run->mmio.is_write) > + return kvm->callbacks->mmio_write_le(kvm->opaque, > + kvm_run->mmio.phys_addr, > + kvm_run->mmio.data, > + kvm_run->mmio.len); > + else > + return kvm->callbacks->mmio_read_le(kvm->opaque, > + kvm_run->mmio.phys_addr, > + kvm_run->mmio.data, > + kvm_run->mmio.len); > +} > +#endif > + > static int handle_mmio(kvm_context_t kvm, struct kvm_run *kvm_run) > { > unsigned long addr =3D kvm_run->mmio.phys_addr; > - void *data =3D kvm_run->mmio.data; >=20 > /* hack: Red Hat 7.1 generates these weird accesses. */ > if ((addr > 0xa0000-4 && addr <=3D 0xa0000) && kvm_run->mmio.len =3D=3D 3) > return 0; >=20 > - if (kvm_run->mmio.is_write) > - return kvm->callbacks->mmio_write(kvm->opaque, addr, data, > - kvm_run->mmio.len); > +#if defined(ARCH_MMIO_ENDIAN_BIG) && defined(ARCH_MMIO_ENDIAN_LITTLE) > + if (kvm_run->mmio.is_bigendian) > + return handle_mmio_bigendian(kvm, kvm_run); > else > - return kvm->callbacks->mmio_read(kvm->opaque, addr, data, > - kvm_run->mmio.len); > + return handle_mmio_littleendian(kvm, kvm_run); > +#elif ARCH_MMIO_ENDIAN_LITTLE > + return handle_mmio_littleendian(kvm, kvm_run); > +#elif ARCH_MMIO_ENDIAN_BIG > + return handle_mmio_bigendian(kvm, kvm_run); > +#endif > } >=20 > int handle_io_window(kvm_context_t kvm) > diff --git a/libkvm/libkvm.h b/libkvm/libkvm.h > --- a/libkvm/libkvm.h > +++ b/libkvm/libkvm.h > @@ -46,11 +46,11 @@ struct kvm_callbacks { > /// For 32bit IO writes from the guest (Usually when executing > 'outl') int (*outl)(void *opaque, uint16_t addr, uint32_t data); > /// generic memory reads to unmapped memory (For MMIO devices) > - int (*mmio_read)(void *opaque, uint64_t addr, uint8_t *data, > - int len); > + int (*mmio_read_be)(void *opaque, uint64_t addr, uint8_t *data, > int len); + int (*mmio_read_le)(void *opaque, uint64_t addr, > uint8_t *data, int len); /// generic memory writes to unmapped > memory (For MMIO devices)=20 > - int (*mmio_write)(void *opaque, uint64_t addr, uint8_t *data, > - int len); > + int (*mmio_write_be)(void *opaque, uint64_t addr, uint8_t *data, > int len); + int (*mmio_write_le)(void *opaque, uint64_t addr, > uint8_t *data, int len); int (*debug)(void *opaque, int vcpu); > /*! > * \brief Called when the VCPU issues an 'hlt' instruction. > diff --git a/qemu/qemu-kvm.c b/qemu/qemu-kvm.c > --- a/qemu/qemu-kvm.c > +++ b/qemu/qemu-kvm.c > @@ -21,6 +21,7 @@ int kvm_irqchip =3D 1; > #include <libkvm.h> > #include <pthread.h> > #include <sys/utsname.h> > +#include <endian.h> >=20 > extern void perror(const char *s); >=20 > @@ -533,8 +534,13 @@ static struct kvm_callbacks qemu_kvm_ops > .outb =3D kvm_outb, > .outw =3D kvm_outw, > .outl =3D kvm_outl, > - .mmio_read =3D kvm_mmio_read, > - .mmio_write =3D kvm_mmio_write, > +#if __BYTE_ORDER =3D=3D __LITTLE_ENDIAN > + .mmio_read_le =3D kvm_mmio_read, > + .mmio_write_le =3D kvm_mmio_write, > +#else > + .mmio_read_be =3D kvm_mmio_read, > + .mmio_write_be =3D kvm_mmio_write, > +#endif > .halt =3D kvm_halt, > .shutdown =3D kvm_shutdown, > .io_window =3D kvm_io_window, > diff --git a/user/main-ppc.c b/user/main-ppc.c > --- a/user/main-ppc.c > +++ b/user/main-ppc.c > @@ -92,14 +92,14 @@ static int test_pre_kvm_run(void *opaque > return 0; > } >=20 > -static int test_mem_read(void *opaque, uint64_t addr, uint8_t *data, > int len) +static int test_mem_read_be(void *opaque, uint64_t addr, > uint8_t *data, int len) { > printf("%s: addr %"PRIx64" len %d\n", __func__, addr, len); > memset(data, 0, len); > return 0; > } >=20 > -static int test_mem_write(void *opaque, uint64_t addr, uint8_t > *data, int len) +static int test_mem_write_be(void *opaque, uint64_t > addr, uint8_t *data, int len) { > printf("%s: addr %"PRIx64" len %d data %"PRIx64"\n", > __func__, addr, len, *(uint64_t *)data); > @@ -120,8 +120,8 @@ static int test_dcr_write(kvm_context_t > } >=20 > static struct kvm_callbacks test_callbacks =3D { > - .mmio_read =3D test_mem_read, > - .mmio_write =3D test_mem_write, > + .mmio_read_be =3D test_mem_read_be, > + .mmio_write_be =3D test_mem_write_be, > .debug =3D test_debug, > .halt =3D test_halt, > .io_window =3D test_io_window, > diff --git a/user/main.c b/user/main.c > --- a/user/main.c > +++ b/user/main.c > @@ -389,8 +389,8 @@ static struct kvm_callbacks test_callbac > .outb =3D test_outb, > .outw =3D test_outw, > .outl =3D test_outl, > - .mmio_read =3D test_mem_read, > - .mmio_write =3D test_mem_write, > + .mmio_read_le =3D test_mem_read, > + .mmio_write_le =3D test_mem_write, > .debug =3D test_debug, > .halt =3D test_halt, > .io_window =3D test_io_window, |
From: Hollis B. <ho...@us...> - 2008-01-11 14:56:55
|
We're just talking about a flag in the kvm_run.mmio structure, so it does not represent the state of any software, guest or host, other than that single MMIO access. This flag is only used for communication between host kernel and host userland, so the host kernel is always responsible for setting it. "is_bigendian" is just one more necessary piece of information for MMIO emulation. kvm_run already tells you that you are loading 4 bytes from address 0. What you don't know today is if byte 0 is the least significant or most significant byte. If "is_bigendian" is set, you know that byte 0 is the MSB and byte 3 is the LSB. If not, the opposite is true. In the simplest case, IA64's in-kernel MMIO emulation code could look something like: kvm_run->mmio.phys_addr = addr; kvm_run->mmio.len = len; ... kvm_run->mmio.is_bigendian = vcpu->arch.some_register & BIGENDIAN; If IA64 has reverse-endian load/store instructions like PowerPC, then you would also need to consider the particular instruction used as well as the guest state. -- Hollis Blanchard IBM Linux Technology Center On Fri, 2008-01-11 at 10:02 +0800, Xu, Anthony wrote: > Hi all, > > That's a good start to consider BE. > Yes, IA64 support BE and LE. > > I have below comments. > > What does is_bigendian mean? > Host is runing with BE or guest is running with BE. > Who will set is_bigendian? > > For supporing BE, > We need to consider host BE and guest BE. > For IA64, most OS is running with LE, and > Application can run with BE or LE, > For example, Qemu can run with BE or LE. > > IMHO, we need two flags, > host_is_bigendian indicates Qemu is running with BE > Guest_is_bigendian indecates the guest application who is accessing MMIO > > Is running with LE. > > > - Anthony > > > > > > > Hollis Blanchard wrote: > > On Thu, 2008-01-10 at 17:28 +0200, Avi Kivity wrote: > >> I'll apply that patch (with a #ifdef CONFIG_PPC so other archs don't > >> use it by mistake). > > > > I don't think that's the right ifdef. For example, I believe IA64 can > > run in BE mode and so will have the same issue, and there are > > certainly > > other architectures (less relevant to the current code) that > > definitely > > are in the same situation. > > > > We need to plumb this through to the libkvm users anyways. Take a look > > at the patch below and tell me if you think it's not the right > > approach. > > x86 simply won't consider 'is_bigendian'. I spent a lot of time on > > this, > > and it's by far the cleanest solution I could come up with. > > > > > > > > diff --git a/libkvm/config-i386.mak b/libkvm/config-i386.mak > > --- a/libkvm/config-i386.mak > > +++ b/libkvm/config-i386.mak > > @@ -2,5 +2,6 @@ LIBDIR := /lib > > LIBDIR := /lib > > CFLAGS += -m32 > > CFLAGS += -D__i386__ > > +CFLAGS += -DARCH_MMIO_ENDIAN_LITTLE > > > > libkvm-$(ARCH)-objs := libkvm-x86.o > > diff --git a/libkvm/config-ia64.mak b/libkvm/config-ia64.mak > > --- a/libkvm/config-ia64.mak > > +++ b/libkvm/config-ia64.mak > > @@ -1,5 +1,6 @@ > > > > LIBDIR := /lib > > CFLAGS += -D__ia64__ > > +CFLAGS += -DARCH_MMIO_ENDIAN_LITTLE -DARCH_MMIO_ENDIAN_BIG > > > > libkvm-$(ARCH)-objs := libkvm-ia64.o > > diff --git a/libkvm/config-powerpc.mak b/libkvm/config-powerpc.mak > > --- a/libkvm/config-powerpc.mak > > +++ b/libkvm/config-powerpc.mak > > @@ -2,5 +2,6 @@ LIBDIR := /lib > > LIBDIR := /lib > > CFLAGS += -m32 > > CFLAGS += -D__powerpc__ > > +CFLAGS += -DARCH_MMIO_ENDIAN_BIG -DARCH_MMIO_ENDIAN_LITTLE > > > > libkvm-$(ARCH)-objs := libkvm-powerpc.o > > diff --git a/libkvm/config-x86_64.mak b/libkvm/config-x86_64.mak > > --- a/libkvm/config-x86_64.mak > > +++ b/libkvm/config-x86_64.mak > > @@ -2,5 +2,6 @@ LIBDIR := /lib64 > > LIBDIR := /lib64 > > CFLAGS += -m64 > > CFLAGS += -D__x86_64__ > > +CFLAGS += -DARCH_MMIO_ENDIAN_LITTLE > > > > libkvm-$(ARCH)-objs := libkvm-x86.o > > diff --git a/libkvm/libkvm.c b/libkvm/libkvm.c > > --- a/libkvm/libkvm.c > > +++ b/libkvm/libkvm.c > > @@ -774,21 +774,56 @@ int kvm_set_sregs(kvm_context_t kvm, int > > return ioctl(kvm->vcpu_fd[vcpu], KVM_SET_SREGS, sregs); > > } > > > > +#ifdef ARCH_MMIO_ENDIAN_BIG > > +static int handle_mmio_bigendian(kvm_context_t kvm, struct kvm_run > > *kvm_run) +{ > > + if (kvm_run->mmio.is_write) > > + return kvm->callbacks->mmio_write_be(kvm->opaque, > > + > kvm_run->mmio.phys_addr, > > + kvm_run->mmio.data, > > + kvm_run->mmio.len); > > + else > > + return kvm->callbacks->mmio_read_be(kvm->opaque, > > + > kvm_run->mmio.phys_addr, > > + kvm_run->mmio.data, > > + kvm_run->mmio.len); > > +} > > +#endif > > + > > +#ifdef ARCH_MMIO_ENDIAN_LITTLE > > +static int handle_mmio_littleendian(kvm_context_t kvm, struct > > kvm_run *kvm_run) +{ > > + if (kvm_run->mmio.is_write) > > + return kvm->callbacks->mmio_write_le(kvm->opaque, > > + > kvm_run->mmio.phys_addr, > > + kvm_run->mmio.data, > > + kvm_run->mmio.len); > > + else > > + return kvm->callbacks->mmio_read_le(kvm->opaque, > > + > kvm_run->mmio.phys_addr, > > + kvm_run->mmio.data, > > + kvm_run->mmio.len); > > +} > > +#endif > > + > > static int handle_mmio(kvm_context_t kvm, struct kvm_run *kvm_run) > > { > > unsigned long addr = kvm_run->mmio.phys_addr; > > - void *data = kvm_run->mmio.data; > > > > /* hack: Red Hat 7.1 generates these weird accesses. */ > > if ((addr > 0xa0000-4 && addr <= 0xa0000) && kvm_run->mmio.len > == 3) > > return 0; > > > > - if (kvm_run->mmio.is_write) > > - return kvm->callbacks->mmio_write(kvm->opaque, addr, > data, > > - kvm_run->mmio.len); > > +#if defined(ARCH_MMIO_ENDIAN_BIG) && defined(ARCH_MMIO_ENDIAN_LITTLE) > > + if (kvm_run->mmio.is_bigendian) > > + return handle_mmio_bigendian(kvm, kvm_run); > > else > > - return kvm->callbacks->mmio_read(kvm->opaque, addr, > data, > > - kvm_run->mmio.len); > > + return handle_mmio_littleendian(kvm, kvm_run); > > +#elif ARCH_MMIO_ENDIAN_LITTLE > > + return handle_mmio_littleendian(kvm, kvm_run); > > +#elif ARCH_MMIO_ENDIAN_BIG > > + return handle_mmio_bigendian(kvm, kvm_run); > > +#endif > > } > > > > int handle_io_window(kvm_context_t kvm) > > diff --git a/libkvm/libkvm.h b/libkvm/libkvm.h > > --- a/libkvm/libkvm.h > > +++ b/libkvm/libkvm.h > > @@ -46,11 +46,11 @@ struct kvm_callbacks { > > /// For 32bit IO writes from the guest (Usually when executing > > 'outl') int (*outl)(void *opaque, uint16_t addr, uint32_t data); > > /// generic memory reads to unmapped memory (For MMIO devices) > > - int (*mmio_read)(void *opaque, uint64_t addr, uint8_t *data, > > - int len); > > + int (*mmio_read_be)(void *opaque, uint64_t addr, uint8_t *data, > > int len); + int (*mmio_read_le)(void *opaque, uint64_t addr, > > uint8_t *data, int len); /// generic memory writes to unmapped > > memory (For MMIO devices) > > - int (*mmio_write)(void *opaque, uint64_t addr, uint8_t *data, > > - int len); > > + int (*mmio_write_be)(void *opaque, uint64_t addr, uint8_t *data, > > int len); + int (*mmio_write_le)(void *opaque, uint64_t addr, > > uint8_t *data, int len); int (*debug)(void *opaque, int vcpu); > > /*! > > * \brief Called when the VCPU issues an 'hlt' instruction. > > diff --git a/qemu/qemu-kvm.c b/qemu/qemu-kvm.c > > --- a/qemu/qemu-kvm.c > > +++ b/qemu/qemu-kvm.c > > @@ -21,6 +21,7 @@ int kvm_irqchip = 1; > > #include <libkvm.h> > > #include <pthread.h> > > #include <sys/utsname.h> > > +#include <endian.h> > > > > extern void perror(const char *s); > > > > @@ -533,8 +534,13 @@ static struct kvm_callbacks qemu_kvm_ops > > .outb = kvm_outb, > > .outw = kvm_outw, > > .outl = kvm_outl, > > - .mmio_read = kvm_mmio_read, > > - .mmio_write = kvm_mmio_write, > > +#if __BYTE_ORDER == __LITTLE_ENDIAN > > + .mmio_read_le = kvm_mmio_read, > > + .mmio_write_le = kvm_mmio_write, > > +#else > > + .mmio_read_be = kvm_mmio_read, > > + .mmio_write_be = kvm_mmio_write, > > +#endif > > .halt = kvm_halt, > > .shutdown = kvm_shutdown, > > .io_window = kvm_io_window, > > diff --git a/user/main-ppc.c b/user/main-ppc.c > > --- a/user/main-ppc.c > > +++ b/user/main-ppc.c > > @@ -92,14 +92,14 @@ static int test_pre_kvm_run(void *opaque > > return 0; > > } > > > > -static int test_mem_read(void *opaque, uint64_t addr, uint8_t *data, > > int len) +static int test_mem_read_be(void *opaque, uint64_t addr, > > uint8_t *data, int len) { > > printf("%s: addr %"PRIx64" len %d\n", __func__, addr, len); > > memset(data, 0, len); > > return 0; > > } > > > > -static int test_mem_write(void *opaque, uint64_t addr, uint8_t > > *data, int len) +static int test_mem_write_be(void *opaque, uint64_t > > addr, uint8_t *data, int len) { > > printf("%s: addr %"PRIx64" len %d data %"PRIx64"\n", > > __func__, addr, len, *(uint64_t *)data); > > @@ -120,8 +120,8 @@ static int test_dcr_write(kvm_context_t > > } > > > > static struct kvm_callbacks test_callbacks = { > > - .mmio_read = test_mem_read, > > - .mmio_write = test_mem_write, > > + .mmio_read_be = test_mem_read_be, > > + .mmio_write_be = test_mem_write_be, > > .debug = test_debug, > > .halt = test_halt, > > .io_window = test_io_window, > > diff --git a/user/main.c b/user/main.c > > --- a/user/main.c > > +++ b/user/main.c > > @@ -389,8 +389,8 @@ static struct kvm_callbacks test_callbac > > .outb = test_outb, > > .outw = test_outw, > > .outl = test_outl, > > - .mmio_read = test_mem_read, > > - .mmio_write = test_mem_write, > > + .mmio_read_le = test_mem_read, > > + .mmio_write_le = test_mem_write, > > .debug = test_debug, > > .halt = test_halt, > > .io_window = test_io_window, > > ------------------------------------------------------------------------- > Check out the new SourceForge.net Marketplace. > It's the best place to buy or sell services for > just about anything Open Source. > http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace > _______________________________________________ > kvm-devel mailing list > kvm...@li... > https://lists.sourceforge.net/lists/listinfo/kvm-devel |
From: Xu, A. <ant...@in...> - 2008-01-14 05:46:17
|
> kvm_run->mmio.is_bigendian =3D vcpu->arch.some_register & >From your example code, I can know is_bigendian indicate whether guest is in bigendian mode when accessing MMIO. Qemu is responsible for handling MMIO emulation, Qemus always assume it is running on littleendian mode. So if guest accesses MMIO in bigendian mode, kvm need to transform it to littleendian before delivering this MMIO request to Qemu. This works for IA64 side. Thanks, - Anthony Hollis Blanchard wrote: > We're just talking about a flag in the kvm_run.mmio structure, so it > does not represent the state of any software, guest or host, other > than that single MMIO access. >=20 > This flag is only used for communication between host kernel and host > userland, so the host kernel is always responsible for setting it. >=20 > "is_bigendian" is just one more necessary piece of information for > MMIO emulation. kvm_run already tells you that you are loading 4 > bytes from address 0. What you don't know today is if byte 0 is the > least significant or most significant byte. If "is_bigendian" is set, > you know that byte 0 is the MSB and byte 3 is the LSB. If not, the > opposite is true. >=20 > In the simplest case, IA64's in-kernel MMIO emulation code could look > something like: > kvm_run->mmio.phys_addr =3D addr; > kvm_run->mmio.len =3D len; > ... > kvm_run->mmio.is_bigendian =3D vcpu->arch.some_register & > BIGENDIAN; >=20 > If IA64 has reverse-endian load/store instructions like PowerPC, then > you would also need to consider the particular instruction used as > well as the guest state. >=20 >=20 > On Fri, 2008-01-11 at 10:02 +0800, Xu, Anthony wrote: >> Hi all, >>=20 >> That's a good start to consider BE. >> Yes, IA64 support BE and LE. >>=20 >> I have below comments. >>=20 >> What does is_bigendian mean? >> Host is runing with BE or guest is running with BE. >> Who will set is_bigendian? >>=20 >> For supporing BE, >> We need to consider host BE and guest BE. >> For IA64, most OS is running with LE, and >> Application can run with BE or LE, >> For example, Qemu can run with BE or LE. >>=20 >> IMHO, we need two flags, >> host_is_bigendian indicates Qemu is running with BE >> Guest_is_bigendian indecates the guest application who is accessing >> MMIO=20 >>=20 >> Is running with LE. >>=20 >>=20 >> - Anthony >>=20 >>=20 >>=20 >>=20 >>=20 >>=20 >> Hollis Blanchard wrote: >>> On Thu, 2008-01-10 at 17:28 +0200, Avi Kivity wrote: >>>> I'll apply that patch (with a #ifdef CONFIG_PPC so other archs >>>> don't use it by mistake). >>>=20 >>> I don't think that's the right ifdef. For example, I believe IA64 >>> can run in BE mode and so will have the same issue, and there are >>> certainly other architectures (less relevant to the current code) >>> that definitely are in the same situation. >>>=20 >>> We need to plumb this through to the libkvm users anyways. Take a >>> look at the patch below and tell me if you think it's not the right >>> approach. x86 simply won't consider 'is_bigendian'. I spent a lot >>> of time on this, and it's by far the cleanest solution I could come >>> up with.=20 >>>=20 >>>=20 >>>=20 >>> diff --git a/libkvm/config-i386.mak b/libkvm/config-i386.mak --- >>> a/libkvm/config-i386.mak +++ b/libkvm/config-i386.mak >>> @@ -2,5 +2,6 @@ LIBDIR :=3D /lib >>> LIBDIR :=3D /lib >>> CFLAGS +=3D -m32 >>> CFLAGS +=3D -D__i386__ >>> +CFLAGS +=3D -DARCH_MMIO_ENDIAN_LITTLE >>>=20 >>> libkvm-$(ARCH)-objs :=3D libkvm-x86.o >>> diff --git a/libkvm/config-ia64.mak b/libkvm/config-ia64.mak --- >>> a/libkvm/config-ia64.mak +++ b/libkvm/config-ia64.mak >>> @@ -1,5 +1,6 @@ >>>=20 >>> LIBDIR :=3D /lib >>> CFLAGS +=3D -D__ia64__ >>> +CFLAGS +=3D -DARCH_MMIO_ENDIAN_LITTLE -DARCH_MMIO_ENDIAN_BIG >>>=20 >>> libkvm-$(ARCH)-objs :=3D libkvm-ia64.o >>> diff --git a/libkvm/config-powerpc.mak b/libkvm/config-powerpc.mak >>> --- a/libkvm/config-powerpc.mak >>> +++ b/libkvm/config-powerpc.mak >>> @@ -2,5 +2,6 @@ LIBDIR :=3D /lib >>> LIBDIR :=3D /lib >>> CFLAGS +=3D -m32 >>> CFLAGS +=3D -D__powerpc__ >>> +CFLAGS +=3D -DARCH_MMIO_ENDIAN_BIG -DARCH_MMIO_ENDIAN_LITTLE >>>=20 >>> libkvm-$(ARCH)-objs :=3D libkvm-powerpc.o >>> diff --git a/libkvm/config-x86_64.mak b/libkvm/config-x86_64.mak >>> --- a/libkvm/config-x86_64.mak >>> +++ b/libkvm/config-x86_64.mak >>> @@ -2,5 +2,6 @@ LIBDIR :=3D /lib64 >>> LIBDIR :=3D /lib64 >>> CFLAGS +=3D -m64 >>> CFLAGS +=3D -D__x86_64__ >>> +CFLAGS +=3D -DARCH_MMIO_ENDIAN_LITTLE >>>=20 >>> libkvm-$(ARCH)-objs :=3D libkvm-x86.o >>> diff --git a/libkvm/libkvm.c b/libkvm/libkvm.c >>> --- a/libkvm/libkvm.c >>> +++ b/libkvm/libkvm.c >>> @@ -774,21 +774,56 @@ int kvm_set_sregs(kvm_context_t kvm, int >>> return ioctl(kvm->vcpu_fd[vcpu], KVM_SET_SREGS, sregs); } >>>=20 >>> +#ifdef ARCH_MMIO_ENDIAN_BIG >>> +static int handle_mmio_bigendian(kvm_context_t kvm, struct kvm_run >>> *kvm_run) +{ + if (kvm_run->mmio.is_write) >>> + return kvm->callbacks->mmio_write_be(kvm->opaque, >>> + >> kvm_run->mmio.phys_addr, >>> + kvm_run->mmio.data, >>> + kvm_run->mmio.len); >>> + else >>> + return kvm->callbacks->mmio_read_be(kvm->opaque, >>> + >> kvm_run->mmio.phys_addr, >>> + kvm_run->mmio.data, >>> + kvm_run->mmio.len); >>> +} >>> +#endif >>> + >>> +#ifdef ARCH_MMIO_ENDIAN_LITTLE >>> +static int handle_mmio_littleendian(kvm_context_t kvm, struct >>> kvm_run *kvm_run) +{ + if (kvm_run->mmio.is_write) >>> + return kvm->callbacks->mmio_write_le(kvm->opaque, >>> + >> kvm_run->mmio.phys_addr, >>> + kvm_run->mmio.data, >>> + kvm_run->mmio.len); >>> + else >>> + return kvm->callbacks->mmio_read_le(kvm->opaque, >>> + >> kvm_run->mmio.phys_addr, >>> + kvm_run->mmio.data, >>> + kvm_run->mmio.len); >>> +} >>> +#endif >>> + >>> static int handle_mmio(kvm_context_t kvm, struct kvm_run *kvm_run) >>> { unsigned long addr =3D kvm_run->mmio.phys_addr; >>> - void *data =3D kvm_run->mmio.data; >>>=20 >>> /* hack: Red Hat 7.1 generates these weird accesses. */ >>> if ((addr > 0xa0000-4 && addr <=3D 0xa0000) && kvm_run->mmio.len =3D=3D >>> 3) return 0;=20 >>>=20 >>> - if (kvm_run->mmio.is_write) >>> - return kvm->callbacks->mmio_write(kvm->opaque, addr, data, >>> - kvm_run->mmio.len); >>> +#if defined(ARCH_MMIO_ENDIAN_BIG) && >>> defined(ARCH_MMIO_ENDIAN_LITTLE) + if (kvm_run->mmio.is_bigendian) >>> + return handle_mmio_bigendian(kvm, kvm_run); >>> else >>> - return kvm->callbacks->mmio_read(kvm->opaque, addr, data, >>> - kvm_run->mmio.len); >>> + return handle_mmio_littleendian(kvm, kvm_run); >>> +#elif ARCH_MMIO_ENDIAN_LITTLE >>> + return handle_mmio_littleendian(kvm, kvm_run); >>> +#elif ARCH_MMIO_ENDIAN_BIG >>> + return handle_mmio_bigendian(kvm, kvm_run); >>> +#endif >>> } >>>=20 >>> int handle_io_window(kvm_context_t kvm) >>> diff --git a/libkvm/libkvm.h b/libkvm/libkvm.h >>> --- a/libkvm/libkvm.h >>> +++ b/libkvm/libkvm.h >>> @@ -46,11 +46,11 @@ struct kvm_callbacks { >>> /// For 32bit IO writes from the guest (Usually when executing >>> 'outl') int (*outl)(void *opaque, uint16_t addr, uint32_t >>> data); /// generic memory reads to unmapped memory (For MMIO >>> devices)=20 >>> - int (*mmio_read)(void *opaque, uint64_t addr, uint8_t *data, >>> - int len); >>> + int (*mmio_read_be)(void *opaque, uint64_t addr, uint8_t *data, >>> int len); + int (*mmio_read_le)(void *opaque, uint64_t addr, >>> uint8_t *data, int len); /// generic memory writes to unmapped >>> memory (For MMIO devices)=20 >>> - int (*mmio_write)(void *opaque, uint64_t addr, uint8_t *data, >>> - int len); >>> + int (*mmio_write_be)(void *opaque, uint64_t addr, uint8_t >>> *data, int len); + int (*mmio_write_le)(void *opaque, uint64_t >>> addr, uint8_t *data, int len); int (*debug)(void *opaque, int >>> vcpu); /*! * \brief Called when the VCPU issues an 'hlt' >>> instruction.=20 >>> diff --git a/qemu/qemu-kvm.c b/qemu/qemu-kvm.c >>> --- a/qemu/qemu-kvm.c >>> +++ b/qemu/qemu-kvm.c >>> @@ -21,6 +21,7 @@ int kvm_irqchip =3D 1; >>> #include <libkvm.h> >>> #include <pthread.h> >>> #include <sys/utsname.h> >>> +#include <endian.h> >>>=20 >>> extern void perror(const char *s); >>>=20 >>> @@ -533,8 +534,13 @@ static struct kvm_callbacks qemu_kvm_ops =20 >>> .outb =3D kvm_outb, .outw =3D kvm_outw, >>> .outl =3D kvm_outl, >>> - .mmio_read =3D kvm_mmio_read, >>> - .mmio_write =3D kvm_mmio_write, >>> +#if __BYTE_ORDER =3D=3D __LITTLE_ENDIAN >>> + .mmio_read_le =3D kvm_mmio_read, >>> + .mmio_write_le =3D kvm_mmio_write, >>> +#else >>> + .mmio_read_be =3D kvm_mmio_read, >>> + .mmio_write_be =3D kvm_mmio_write, >>> +#endif >>> .halt =3D kvm_halt, >>> .shutdown =3D kvm_shutdown, >>> .io_window =3D kvm_io_window, >>> diff --git a/user/main-ppc.c b/user/main-ppc.c >>> --- a/user/main-ppc.c >>> +++ b/user/main-ppc.c >>> @@ -92,14 +92,14 @@ static int test_pre_kvm_run(void *opaque=20 >>> return 0; } >>>=20 >>> -static int test_mem_read(void *opaque, uint64_t addr, uint8_t >>> *data, int len) +static int test_mem_read_be(void *opaque, uint64_t >>> addr, uint8_t *data, int len) { printf("%s: addr %"PRIx64" len >>> %d\n", __func__, addr, len); memset(data, 0, len); return 0; >>> } >>>=20 >>> -static int test_mem_write(void *opaque, uint64_t addr, uint8_t >>> *data, int len) +static int test_mem_write_be(void *opaque, uint64_t >>> addr, uint8_t *data, int len) { >>> printf("%s: addr %"PRIx64" len %d data %"PRIx64"\n", >>> __func__, addr, len, *(uint64_t *)data); >>> @@ -120,8 +120,8 @@ static int test_dcr_write(kvm_context_t } >>>=20 >>> static struct kvm_callbacks test_callbacks =3D { >>> - .mmio_read =3D test_mem_read, >>> - .mmio_write =3D test_mem_write, >>> + .mmio_read_be =3D test_mem_read_be, >>> + .mmio_write_be =3D test_mem_write_be, >>> .debug =3D test_debug, >>> .halt =3D test_halt, >>> .io_window =3D test_io_window, >>> diff --git a/user/main.c b/user/main.c >>> --- a/user/main.c >>> +++ b/user/main.c >>> @@ -389,8 +389,8 @@ static struct kvm_callbacks test_callbac=20 >>> .outb =3D test_outb, .outw =3D test_outw, >>> .outl =3D test_outl, >>> - .mmio_read =3D test_mem_read, >>> - .mmio_write =3D test_mem_write, >>> + .mmio_read_le =3D test_mem_read, >>> + .mmio_write_le =3D test_mem_write, >>> .debug =3D test_debug, >>> .halt =3D test_halt, >>> .io_window =3D test_io_window, >>=20 >> ------------------------------------------------------------------------ - >> Check out the new SourceForge.net Marketplace. >> It's the best place to buy or sell services for >> just about anything Open Source. >> http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketp lace >> _______________________________________________ >> kvm-devel mailing list >> kvm...@li... >> https://lists.sourceforge.net/lists/listinfo/kvm-devel |
From: Hollis B. <ho...@us...> - 2008-01-14 16:54:16
|
I hope I've explained in the other mail I just sent why Qemu assuming little-endian for everything is not OK. One other important clarification: kvm_run->mmio.is_bigendian is about *one* *particular* *MMIO* *access*. It has only coincidental relationship to the "endianness mode" of the guest. -- Hollis Blanchard IBM Linux Technology Center On Mon, 2008-01-14 at 13:42 +0800, Xu, Anthony wrote: > > kvm_run->mmio.is_bigendian = vcpu->arch.some_register & > From your example code, I can know is_bigendian indicate whether guest > is in bigendian mode when accessing MMIO. > > Qemu is responsible for handling MMIO emulation, Qemus always assume it > is running on littleendian mode. > So if guest accesses MMIO in bigendian mode, > kvm need to transform it to littleendian before delivering this > MMIO request to Qemu. > > > This works for IA64 side. > > > Thanks, > - Anthony > > > Hollis Blanchard wrote: > > We're just talking about a flag in the kvm_run.mmio structure, so it > > does not represent the state of any software, guest or host, other > > than that single MMIO access. > > > > This flag is only used for communication between host kernel and host > > userland, so the host kernel is always responsible for setting it. > > > > "is_bigendian" is just one more necessary piece of information for > > MMIO emulation. kvm_run already tells you that you are loading 4 > > bytes from address 0. What you don't know today is if byte 0 is the > > least significant or most significant byte. If "is_bigendian" is set, > > you know that byte 0 is the MSB and byte 3 is the LSB. If not, the > > opposite is true. > > > > In the simplest case, IA64's in-kernel MMIO emulation code could look > > something like: > > kvm_run->mmio.phys_addr = addr; > > kvm_run->mmio.len = len; > > ... > > kvm_run->mmio.is_bigendian = vcpu->arch.some_register & > > BIGENDIAN; > > > > If IA64 has reverse-endian load/store instructions like PowerPC, then > > you would also need to consider the particular instruction used as > > well as the guest state. > > > > > > On Fri, 2008-01-11 at 10:02 +0800, Xu, Anthony wrote: > >> Hi all, > >> > >> That's a good start to consider BE. > >> Yes, IA64 support BE and LE. > >> > >> I have below comments. > >> > >> What does is_bigendian mean? > >> Host is runing with BE or guest is running with BE. > >> Who will set is_bigendian? > >> > >> For supporing BE, > >> We need to consider host BE and guest BE. > >> For IA64, most OS is running with LE, and > >> Application can run with BE or LE, > >> For example, Qemu can run with BE or LE. > >> > >> IMHO, we need two flags, > >> host_is_bigendian indicates Qemu is running with BE > >> Guest_is_bigendian indecates the guest application who is accessing > >> MMIO > >> > >> Is running with LE. > >> > >> > >> - Anthony > >> > >> > >> > >> > >> > >> > >> Hollis Blanchard wrote: > >>> On Thu, 2008-01-10 at 17:28 +0200, Avi Kivity wrote: > >>>> I'll apply that patch (with a #ifdef CONFIG_PPC so other archs > >>>> don't use it by mistake). > >>> > >>> I don't think that's the right ifdef. For example, I believe IA64 > >>> can run in BE mode and so will have the same issue, and there are > >>> certainly other architectures (less relevant to the current code) > >>> that definitely are in the same situation. > >>> > >>> We need to plumb this through to the libkvm users anyways. Take a > >>> look at the patch below and tell me if you think it's not the right > >>> approach. x86 simply won't consider 'is_bigendian'. I spent a lot > >>> of time on this, and it's by far the cleanest solution I could come > >>> up with. > >>> > >>> > >>> > >>> diff --git a/libkvm/config-i386.mak b/libkvm/config-i386.mak --- > >>> a/libkvm/config-i386.mak +++ b/libkvm/config-i386.mak > >>> @@ -2,5 +2,6 @@ LIBDIR := /lib > >>> LIBDIR := /lib > >>> CFLAGS += -m32 > >>> CFLAGS += -D__i386__ > >>> +CFLAGS += -DARCH_MMIO_ENDIAN_LITTLE > >>> > >>> libkvm-$(ARCH)-objs := libkvm-x86.o > >>> diff --git a/libkvm/config-ia64.mak b/libkvm/config-ia64.mak --- > >>> a/libkvm/config-ia64.mak +++ b/libkvm/config-ia64.mak > >>> @@ -1,5 +1,6 @@ > >>> > >>> LIBDIR := /lib > >>> CFLAGS += -D__ia64__ > >>> +CFLAGS += -DARCH_MMIO_ENDIAN_LITTLE -DARCH_MMIO_ENDIAN_BIG > >>> > >>> libkvm-$(ARCH)-objs := libkvm-ia64.o > >>> diff --git a/libkvm/config-powerpc.mak b/libkvm/config-powerpc.mak > >>> --- a/libkvm/config-powerpc.mak > >>> +++ b/libkvm/config-powerpc.mak > >>> @@ -2,5 +2,6 @@ LIBDIR := /lib > >>> LIBDIR := /lib > >>> CFLAGS += -m32 > >>> CFLAGS += -D__powerpc__ > >>> +CFLAGS += -DARCH_MMIO_ENDIAN_BIG -DARCH_MMIO_ENDIAN_LITTLE > >>> > >>> libkvm-$(ARCH)-objs := libkvm-powerpc.o > >>> diff --git a/libkvm/config-x86_64.mak b/libkvm/config-x86_64.mak > >>> --- a/libkvm/config-x86_64.mak > >>> +++ b/libkvm/config-x86_64.mak > >>> @@ -2,5 +2,6 @@ LIBDIR := /lib64 > >>> LIBDIR := /lib64 > >>> CFLAGS += -m64 > >>> CFLAGS += -D__x86_64__ > >>> +CFLAGS += -DARCH_MMIO_ENDIAN_LITTLE > >>> > >>> libkvm-$(ARCH)-objs := libkvm-x86.o > >>> diff --git a/libkvm/libkvm.c b/libkvm/libkvm.c > >>> --- a/libkvm/libkvm.c > >>> +++ b/libkvm/libkvm.c > >>> @@ -774,21 +774,56 @@ int kvm_set_sregs(kvm_context_t kvm, int > >>> return ioctl(kvm->vcpu_fd[vcpu], KVM_SET_SREGS, sregs); } > >>> > >>> +#ifdef ARCH_MMIO_ENDIAN_BIG > >>> +static int handle_mmio_bigendian(kvm_context_t kvm, struct kvm_run > >>> *kvm_run) +{ + if (kvm_run->mmio.is_write) > >>> + return kvm->callbacks->mmio_write_be(kvm->opaque, > >>> + > >> kvm_run->mmio.phys_addr, > >>> + kvm_run->mmio.data, > >>> + kvm_run->mmio.len); > >>> + else > >>> + return kvm->callbacks->mmio_read_be(kvm->opaque, > >>> + > >> kvm_run->mmio.phys_addr, > >>> + kvm_run->mmio.data, > >>> + kvm_run->mmio.len); > >>> +} > >>> +#endif > >>> + > >>> +#ifdef ARCH_MMIO_ENDIAN_LITTLE > >>> +static int handle_mmio_littleendian(kvm_context_t kvm, struct > >>> kvm_run *kvm_run) +{ + if (kvm_run->mmio.is_write) > >>> + return kvm->callbacks->mmio_write_le(kvm->opaque, > >>> + > >> kvm_run->mmio.phys_addr, > >>> + kvm_run->mmio.data, > >>> + kvm_run->mmio.len); > >>> + else > >>> + return kvm->callbacks->mmio_read_le(kvm->opaque, > >>> + > >> kvm_run->mmio.phys_addr, > >>> + kvm_run->mmio.data, > >>> + kvm_run->mmio.len); > >>> +} > >>> +#endif > >>> + > >>> static int handle_mmio(kvm_context_t kvm, struct kvm_run *kvm_run) > >>> { unsigned long addr = kvm_run->mmio.phys_addr; > >>> - void *data = kvm_run->mmio.data; > >>> > >>> /* hack: Red Hat 7.1 generates these weird accesses. */ > >>> if ((addr > 0xa0000-4 && addr <= 0xa0000) && kvm_run->mmio.len > == > >>> 3) return 0; > >>> > >>> - if (kvm_run->mmio.is_write) > >>> - return kvm->callbacks->mmio_write(kvm->opaque, addr, > data, > >>> - kvm_run->mmio.len); > >>> +#if defined(ARCH_MMIO_ENDIAN_BIG) && > >>> defined(ARCH_MMIO_ENDIAN_LITTLE) + if (kvm_run->mmio.is_bigendian) > >>> + return handle_mmio_bigendian(kvm, kvm_run); > >>> else > >>> - return kvm->callbacks->mmio_read(kvm->opaque, addr, > data, > >>> - kvm_run->mmio.len); > >>> + return handle_mmio_littleendian(kvm, kvm_run); > >>> +#elif ARCH_MMIO_ENDIAN_LITTLE > >>> + return handle_mmio_littleendian(kvm, kvm_run); > >>> +#elif ARCH_MMIO_ENDIAN_BIG > >>> + return handle_mmio_bigendian(kvm, kvm_run); > >>> +#endif > >>> } > >>> > >>> int handle_io_window(kvm_context_t kvm) > >>> diff --git a/libkvm/libkvm.h b/libkvm/libkvm.h > >>> --- a/libkvm/libkvm.h > >>> +++ b/libkvm/libkvm.h > >>> @@ -46,11 +46,11 @@ struct kvm_callbacks { > >>> /// For 32bit IO writes from the guest (Usually when executing > >>> 'outl') int (*outl)(void *opaque, uint16_t addr, uint32_t > >>> data); /// generic memory reads to unmapped memory (For MMIO > >>> devices) > >>> - int (*mmio_read)(void *opaque, uint64_t addr, uint8_t *data, > >>> - int len); > >>> + int (*mmio_read_be)(void *opaque, uint64_t addr, uint8_t *data, > >>> int len); + int (*mmio_read_le)(void *opaque, uint64_t addr, > >>> uint8_t *data, int len); /// generic memory writes to unmapped > >>> memory (For MMIO devices) > >>> - int (*mmio_write)(void *opaque, uint64_t addr, uint8_t *data, > >>> - int len); > >>> + int (*mmio_write_be)(void *opaque, uint64_t addr, uint8_t > >>> *data, int len); + int (*mmio_write_le)(void *opaque, uint64_t > >>> addr, uint8_t *data, int len); int (*debug)(void *opaque, int > >>> vcpu); /*! * \brief Called when the VCPU issues an > 'hlt' > >>> instruction. > >>> diff --git a/qemu/qemu-kvm.c b/qemu/qemu-kvm.c > >>> --- a/qemu/qemu-kvm.c > >>> +++ b/qemu/qemu-kvm.c > >>> @@ -21,6 +21,7 @@ int kvm_irqchip = 1; > >>> #include <libkvm.h> > >>> #include <pthread.h> > >>> #include <sys/utsname.h> > >>> +#include <endian.h> > >>> > >>> extern void perror(const char *s); > >>> > >>> @@ -533,8 +534,13 @@ static struct kvm_callbacks qemu_kvm_ops > >>> .outb = kvm_outb, .outw = kvm_outw, > >>> .outl = kvm_outl, > >>> - .mmio_read = kvm_mmio_read, > >>> - .mmio_write = kvm_mmio_write, > >>> +#if __BYTE_ORDER == __LITTLE_ENDIAN > >>> + .mmio_read_le = kvm_mmio_read, > >>> + .mmio_write_le = kvm_mmio_write, > >>> +#else > >>> + .mmio_read_be = kvm_mmio_read, > >>> + .mmio_write_be = kvm_mmio_write, > >>> +#endif > >>> .halt = kvm_halt, > >>> .shutdown = kvm_shutdown, > >>> .io_window = kvm_io_window, > >>> diff --git a/user/main-ppc.c b/user/main-ppc.c > >>> --- a/user/main-ppc.c > >>> +++ b/user/main-ppc.c > >>> @@ -92,14 +92,14 @@ static int test_pre_kvm_run(void *opaque > >>> return 0; } > >>> > >>> -static int test_mem_read(void *opaque, uint64_t addr, uint8_t > >>> *data, int len) +static int test_mem_read_be(void *opaque, uint64_t > >>> addr, uint8_t *data, int len) { printf("%s: addr %"PRIx64" len > >>> %d\n", __func__, addr, len); memset(data, 0, len); return 0; > >>> } > >>> > >>> -static int test_mem_write(void *opaque, uint64_t addr, uint8_t > >>> *data, int len) +static int test_mem_write_be(void *opaque, uint64_t > >>> addr, uint8_t *data, int len) { > >>> printf("%s: addr %"PRIx64" len %d data %"PRIx64"\n", > >>> __func__, addr, len, *(uint64_t *)data); > >>> @@ -120,8 +120,8 @@ static int test_dcr_write(kvm_context_t } > >>> > >>> static struct kvm_callbacks test_callbacks = { > >>> - .mmio_read = test_mem_read, > >>> - .mmio_write = test_mem_write, > >>> + .mmio_read_be = test_mem_read_be, > >>> + .mmio_write_be = test_mem_write_be, > >>> .debug = test_debug, > >>> .halt = test_halt, > >>> .io_window = test_io_window, > >>> diff --git a/user/main.c b/user/main.c > >>> --- a/user/main.c > >>> +++ b/user/main.c > >>> @@ -389,8 +389,8 @@ static struct kvm_callbacks test_callbac > >>> .outb = test_outb, .outw = test_outw, > >>> .outl = test_outl, > >>> - .mmio_read = test_mem_read, > >>> - .mmio_write = test_mem_write, > >>> + .mmio_read_le = test_mem_read, > >>> + .mmio_write_le = test_mem_write, > >>> .debug = test_debug, > >>> .halt = test_halt, > >>> .io_window = test_io_window, > >> > >> > ------------------------------------------------------------------------ > - > >> Check out the new SourceForge.net Marketplace. > >> It's the best place to buy or sell services for > >> just about anything Open Source. > >> > http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketp > lace > >> _______________________________________________ > >> kvm-devel mailing list > >> kvm...@li... > >> https://lists.sourceforge.net/lists/listinfo/kvm-devel |
From: Hollis B. <ho...@us...> - 2008-01-14 16:53:16
|
On Sun, 2008-01-13 at 11:42 +0200, Avi Kivity wrote: > Do we really need to propagate endianness all the way to the user? > Perhaps libkvm could call the regular mmio functions and do the > transformation itself. > > Or maybe even the kernel can do this by itself? The kernel *already* does this by itself, and I'm attempting to explain why that is not sufficient. My point is precisely that the endianness information must be propagated to the user, otherwise the user may not have all the information it needs to emulate it. Here is the concrete example: * guest writes to MMIO * KVM passes MMIO information (physical address, number of bytes, value) to qemu * Qemu knows from the address that this access is for a passthough device, a special case the administrator has pre-configured * Qemu does mmap(/dev/mem), and writes "length" bytes of "value" at offset "address". Now here's the catch: what endianness does qemu use when doing the write? If qemu only does BE, then a LE access from the guest will be byte-reversed when presented to the real hardware. -- Hollis Blanchard IBM Linux Technology Center |
From: Hollis B. <ho...@us...> - 2008-01-14 20:44:18
|
On Mon, 2008-01-14 at 19:30 +0200, Avi Kivity wrote: > Hollis Blanchard wrote: > > On Sun, 2008-01-13 at 11:42 +0200, Avi Kivity wrote: > > > > > >> Do we really need to propagate endianness all the way to the user? > >> Perhaps libkvm could call the regular mmio functions and do the > >> transformation itself. > >> > >> Or maybe even the kernel can do this by itself? > >> > > > > The kernel *already* does this by itself, and I'm attempting to explain > > why that is not sufficient. > > > > My point is precisely that the endianness information must be propagated > > to the user, otherwise the user may not have all the information it > > needs to emulate it. > > > > Here is the concrete example: > > * guest writes to MMIO > > * KVM passes MMIO information (physical address, number of bytes, > > value) to qemu > > * Qemu knows from the address that this access is for a passthough > > device, a special case the administrator has pre-configured > > * Qemu does mmap(/dev/mem), and writes "length" bytes of "value" > > at offset "address". > > > > Now here's the catch: what endianness does qemu use when doing the > > write? If qemu only does BE, then a LE access from the guest will be > > byte-reversed when presented to the real hardware. > > > > Right, I forgot that bit when replying. > > So, the data is always in natural endianness? If so, we can add an > endianness indicator to the mmio callbacks, which qemu will mostly > ignore, and only examine it on passthrough. "Natural endianness" is not a well-defined term, since many architectures can do both, and change on the fly. The data will be the contents of a guest register, so I guess you could say it is in "guest endianness" (as determined by a guest control register). Of course, there's no necessary connection between guest endianness and qemu endianness. We always need to know if the data requires byteswapping or not, regardless of passthrough. MMIO accesses can be issued in either endianness to an emulated device too. > btw, isn't passthrough better handled through the tlb? i.e. actually > let the guest access the specially-configured memory? You can have qemu > mmap /dev/mem and install it as a memslot, and things should work, no? > (well, you might need to set some cachablility flag or other). Hmm, yes you're right. Of course, qemu offers greater flexibility than MMUs (which are limited to page-sized granularity, for example), so it might still be useful to have qemu intercede. Since we're defining a stable ABI, I'd rather have the information present than miss it in the future... -- Hollis Blanchard IBM Linux Technology Center |
From: Avi K. <av...@qu...> - 2008-01-10 15:28:09
|
Hollis Blanchard wrote: > On Thu, 2008-01-10 at 08:56 +0200, Avi Kivity wrote: > >> IIRC endianness is a per-page attribute on ppc, no? Otherwise you'd >> have a global attribute instead of per-access. >> > > The MMU in some PowerPC can have per-page endianness, but not all. On a > processor that supports this attribute, I expect that when an MMIO trap > occurs we'll need to inspect the guest MMU state in order to set the > is_bigendian flag correctly. > > The real issue I'm looking at right now is byte-reversed loads and > stores. For example, "lwzx" (Load Word and Zero Indexed) does a > big-endian 4-byte load, while "lwbrx" (Load Word Byte-Reverse Indexed) > does a little-endian 4-byte load. These instructions exist on all > PowerPC, and they can be issued at any time and do not depend on MMU > mappings. > Okay, so it's at instruction granularity _and_ page granularity. I'll apply that patch (with a #ifdef CONFIG_PPC so other archs don't use it by mistake). -- error compiling committee.c: too many arguments to function |
From: Avi K. <av...@qu...> - 2008-01-13 09:42:32
|
Hollis Blanchard wrote: > On Thu, 2008-01-10 at 17:28 +0200, Avi Kivity wrote: > >> I'll apply that patch (with a #ifdef CONFIG_PPC so other archs don't >> use it by mistake). >> > > I don't think that's the right ifdef. For example, I believe IA64 can > run in BE mode and so will have the same issue, and there are certainly > other architectures (less relevant to the current code) that definitely > are in the same situation. > > We need to plumb this through to the libkvm users anyways. Take a look > at the patch below and tell me if you think it's not the right approach. > x86 simply won't consider 'is_bigendian'. I spent a lot of time on this, > and it's by far the cleanest solution I could come up with. > > > > > +#ifdef ARCH_MMIO_ENDIAN_BIG > +static int handle_mmio_bigendian(kvm_context_t kvm, struct kvm_run *kvm_run) > +{ > + if (kvm_run->mmio.is_write) > + return kvm->callbacks->mmio_write_be(kvm->opaque, > + kvm_run->mmio.phys_addr, > + kvm_run->mmio.data, > + kvm_run->mmio.len); > + else > + return kvm->callbacks->mmio_read_be(kvm->opaque, > + kvm_run->mmio.phys_addr, > + kvm_run->mmio.data, > + kvm_run->mmio.len); > +} > +#endif > Do we really need to propagate endianness all the way to the user? Perhaps libkvm could call the regular mmio functions and do the transformation itself. Or maybe even the kernel can do this by itself? -- error compiling committee.c: too many arguments to function |
From: Avi K. <av...@qu...> - 2008-01-14 17:30:40
|
Hollis Blanchard wrote: > On Sun, 2008-01-13 at 11:42 +0200, Avi Kivity wrote: > > >> Do we really need to propagate endianness all the way to the user? >> Perhaps libkvm could call the regular mmio functions and do the >> transformation itself. >> >> Or maybe even the kernel can do this by itself? >> > > The kernel *already* does this by itself, and I'm attempting to explain > why that is not sufficient. > > My point is precisely that the endianness information must be propagated > to the user, otherwise the user may not have all the information it > needs to emulate it. > > Here is the concrete example: > * guest writes to MMIO > * KVM passes MMIO information (physical address, number of bytes, > value) to qemu > * Qemu knows from the address that this access is for a passthough > device, a special case the administrator has pre-configured > * Qemu does mmap(/dev/mem), and writes "length" bytes of "value" > at offset "address". > > Now here's the catch: what endianness does qemu use when doing the > write? If qemu only does BE, then a LE access from the guest will be > byte-reversed when presented to the real hardware. > Right, I forgot that bit when replying. So, the data is always in natural endianness? If so, we can add an endianness indicator to the mmio callbacks, which qemu will mostly ignore, and only examine it on passthrough. btw, isn't passthrough better handled through the tlb? i.e. actually let the guest access the specially-configured memory? You can have qemu mmap /dev/mem and install it as a memslot, and things should work, no? (well, you might need to set some cachablility flag or other). -- error compiling committee.c: too many arguments to function |
From: Xu, A. <ant...@in...> - 2008-01-15 02:45:22
|
> Here is the concrete example: > * guest writes to MMIO > * KVM passes MMIO information (physical address, number of > bytes, value) to qemu The value is saved in memory, is it bigendian or littleendian? > * Qemu knows from the address that this access is for a > passthough device, a special case the administrator has > pre-configured * Qemu does mmap(/dev/mem), and writes "length" When qemu writes value, Can qemu know what mode(bigendian/littleendian it is running)? Qemu can run on bigendian in IA64. > bytes of "value" at offset "address". >=20 Hollis Blanchard wrote: > On Sun, 2008-01-13 at 11:42 +0200, Avi Kivity wrote: >=20 >> Do we really need to propagate endianness all the way to the user? >> Perhaps libkvm could call the regular mmio functions and do the >> transformation itself.=20 >>=20 >> Or maybe even the kernel can do this by itself? >=20 > The kernel *already* does this by itself, and I'm attempting to > explain why that is not sufficient. >=20 > My point is precisely that the endianness information must be > propagated to the user, otherwise the user may not have all the > information it needs to emulate it. >=20 > Here is the concrete example: > * guest writes to MMIO > * KVM passes MMIO information (physical address, number of > bytes, value) to qemu > * Qemu knows from the address that this access is for a > passthough device, a special case the administrator has > pre-configured * Qemu does mmap(/dev/mem), and writes "length" > bytes of "value" at offset "address". >=20 > Now here's the catch: what endianness does qemu use when doing the > write? If qemu only does BE, then a LE access from the guest will be > byte-reversed when presented to the real hardware. |
From: Hollis B. <ho...@us...> - 2008-01-15 03:54:55
|
On Tue, 2008-01-15 at 10:43 +0800, Xu, Anthony wrote: > > Here is the concrete example: > > * guest writes to MMIO > > * KVM passes MMIO information (physical address, number of > > bytes, value) to qemu > The value is saved in memory, is it bigendian or > littleendian? The value in memory is copied from the value in the register when the guest was executing, so its format is probably dependent on the state of a control register. > > * Qemu knows from the address that this access is for a > > passthough device, a special case the administrator has > > pre-configured * Qemu does mmap(/dev/mem), and writes "length" > > When qemu writes value, Can qemu know what > mode(bigendian/littleendian it is running)? > Qemu can run on bigendian in IA64. /usr/include/endian.h will #define __BYTE_ORDER as either __LITTLE_ENDIAN or __BIG_ENDIAN. I have no idea if this is defined in a standard or is glibc-specific. You could also test at runtime with a construct like: union { int i; char c[4]; } u; u.i = 1; if (u.c[0] == 1) { ... } else { ... } -- Hollis Blanchard IBM Linux Technology Center |
From: Avi K. <av...@qu...> - 2008-01-15 14:57:46
|
Hollis Blanchard wrote: >> btw, isn't passthrough better handled through the tlb? i.e. actually >> let the guest access the specially-configured memory? You can have qemu >> mmap /dev/mem and install it as a memslot, and things should work, no? >> (well, you might need to set some cachablility flag or other). >> > > Hmm, yes you're right. Of course, qemu offers greater flexibility than > MMUs (which are limited to page-sized granularity, for example), so it > might still be useful to have qemu intercede. > > With the endian-aware instructions that doesn't matter, since you set the endianness on a per-instruction granularity. And with guest tlb controlled endianness, surely you get page granularity as well? > Since we're defining a stable ABI, I'd rather have the information > present than miss it in the future... > > So now the question is, do we see the need for qemu to intercept writes to pass-through devices? IMO the answer is no. If it doesn't understand anything about the device, it would be better off doing a real pass through. If it does understand the device, it should know which endianness it likes. -- error compiling committee.c: too many arguments to function |
From: Hollis B. <ho...@us...> - 2008-01-15 16:23:21
|
On Tue, 2008-01-15 at 16:57 +0200, Avi Kivity wrote: > Hollis Blanchard wrote: > >> btw, isn't passthrough better handled through the tlb? i.e. actually > >> let the guest access the specially-configured memory? You can have qemu > >> mmap /dev/mem and install it as a memslot, and things should work, no? > >> (well, you might need to set some cachablility flag or other). > >> > > > > Hmm, yes you're right. Of course, qemu offers greater flexibility than > > MMUs (which are limited to page-sized granularity, for example), so it > > might still be useful to have qemu intercede. > > > > > > With the endian-aware instructions that doesn't matter, since you set > the endianness on a per-instruction granularity. And with guest tlb > controlled endianness, surely you get page granularity as well? > > > > Since we're defining a stable ABI, I'd rather have the information > > present than miss it in the future... > > So now the question is, do we see the need for qemu to intercept writes > to pass-through devices? IMO the answer is no. If it doesn't > understand anything about the device, it would be better off doing a > real pass through. If it does understand the device, it should know > which endianness it likes. OK, I'm willing to go along with this, and hope that we don't run into another use case for an endianness flag in the future. -- Hollis Blanchard IBM Linux Technology Center |