From: Jan L. <jl...@la...> - 2008-05-04 22:39:09
On Fri, 2008-05-02 at 13:12 +0300, Avi Kivity wrote:
> > Call Trace:
> > [<c0146d5b>] kmem_cache_create+0x15e/0x410
> > Code: c3 57 56 53 89 c6 9c 5f fa 8b 08 83 39 00 74 12 c7 41 0c 01 00 00
> > 00 8b 01
> > 48 89 01 8b 5c 81 10 eb 07 e8 a5 fb ff ff 89 c3 57 9d <0f> 0d 0b 90 85
> > db 74 1b
> > 8b 56 10 31 c0 89 d1 c1 e9 02 89 df f3
> > EIP: [<c01467be>] kmem_cache_zalloc+0x2a/0x53 SS:ESP 0068:c030ff80
> > <0>Kernel panic - not syncing: Attempted to kill the idle task!
> >
> 0f 0d 0b prefetchw (%ebx)
>
> This is an AMD 3Dnow! instruction, which is not supported on Intel
> processors. I guess the 3Dnow! cpuid bit leaked in via the qemu merge.
>
> I guess two fixes are needed:
> - remove the 3Dnow! bit
> - add emulation for prefetchw (easy, as it doesn't need to do anything)
> to support live migration from AMD to Intel

This problem still occurs with kvm-68. Which CPUs will be affected by
this (is it only the Core Duo)?

I'm currently delaying the upload of a new kvm package to Debian because
of this.

Thanks,
Jan
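The two fixes Avi suggests can be sketched roughly as follows. This is only an illustration under stated assumptions, not the change that went into kvm: the helper names mask_3dnow() and is_prefetchw() are hypothetical. The bit positions (3DNow! is EDX bit 31 and extended 3DNow! is EDX bit 30 of CPUID leaf 0x80000001) and the encoding 0F 0D /1 for prefetchw are taken from the AMD architecture documentation; the bytes "0f 0d 0b" from the oops decode that way.

```c
#include <stdint.h>

/* CPUID leaf 0x80000001, EDX: AMD-defined 3DNow! feature bits. */
#define CPUID_EXT2_3DNOWEXT (1u << 30)
#define CPUID_EXT2_3DNOW    (1u << 31)

/* Hypothetical helper: clear the 3DNow! bits from the EDX value that
 * would be reported to the guest, so a guest on an Intel host does not
 * patch in 3DNow! instructions such as prefetchw. */
uint32_t mask_3dnow(uint32_t ext_edx)
{
    return ext_edx & ~(CPUID_EXT2_3DNOW | CPUID_EXT2_3DNOWEXT);
}

/* Hypothetical helper: recognise "prefetchw m8" (opcode 0F 0D, ModRM
 * reg field == 1).  An emulator could treat a match as a no-op and just
 * skip the instruction.  Only the first three bytes are inspected. */
int is_prefetchw(const uint8_t *insn)
{
    return insn[0] == 0x0f && insn[1] == 0x0d &&
           ((insn[2] >> 3) & 7) == 1;
}
```

For the bytes in the trace above, is_prefetchw() would match: ModRM 0x0b has mod=0, reg=1, rm=3, i.e. prefetchw (%ebx).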
From: Dor L. <dor...@qu...> - 2008-05-04 22:16:37
On Sun, 2008-05-04 at 15:21 -0500, Anthony Liguori wrote:
> Normally, tap always reads packets and simply lets the client drop them if it
> cannot receive them. For virtio-net, this results in massive packet loss and
> about an 80% performance loss in TCP throughput.
>
> This patch modifies qemu_send_packet() to only deliver a packet to a VLAN
> client if it doesn't have a fd_can_read method or the fd_can_read method
> indicates that it can receive packets. We also return a status of whether
> any clients were able to receive the packet.
>
> If no clients were able to receive a packet, we buffer the packet until a
> client indicates that it can receive packets again.
>
> This patch also modifies the tap code to only read from the tap fd if at least
> one client on the VLAN is able to receive a packet.
>
> Finally, this patch changes the tap code to drain all possible packets from
> the tap device when the tap fd is readable.
>
> Signed-off-by: Anthony Liguori <ali...@us...>

Patchset looks good and reduces some nasty hacks.
It probably also improves other devices like e1000 et al.

Cheers,
Dor
From: Andrea A. <an...@qu...> - 2008-05-04 22:08:29
On Sun, May 04, 2008 at 02:13:45PM -0500, Robin Holt wrote:
> > diff --git a/mm/Kconfig b/mm/Kconfig
> > --- a/mm/Kconfig
> > +++ b/mm/Kconfig
> > @@ -205,3 +205,6 @@ config VIRT_TO_BUS
> > config VIRT_TO_BUS
> > def_bool y
> > depends on !ARCH_NO_VIRT_TO_BUS
> > +
> > +config MMU_NOTIFIER
> > + bool
>
> Without some text following the bool keyword, I am not even asked for
> this config setting on my ia64 build.

Yes, this was explicitly asked for by Andrew after his review. This is the
explanation pasted from the changelog.

3) It'd be a waste to add branches in the VM if nobody could possibly
run KVM/GRU/XPMEM on the kernel, so mmu notifiers will only be enabled
if CONFIG_KVM=m/y. In the current kernel kvm won't yet take
advantage of mmu notifiers, but this already allows compiling a
KVM external module against a kernel with mmu notifiers enabled and
from the next pull from kvm.git we'll start using them. And
GRU/XPMEM will also be able to continue the development by enabling
KVM=m in their config, until they submit all GRU/XPMEM GPLv2 code
to the mainline kernel. Then they can also enable MMU_NOTIFIERS in
the same way KVM does it (even if KVM=n). This guarantees nobody
selects MMU_NOTIFIER=y if KVM and GRU and XPMEM are all =n.
From: Anthony L. <ali...@us...> - 2008-05-04 20:24:07
While it has served us well, it is long overdue that we eliminate the
virtio-net tap hack. It turns out that zero-copy has very little impact on
performance. The tap hack was gaining such a significant performance boost
not because of zero-copy, but because it avoided dropping packets on receive
which is apparently a significant problem with the tap implementation in QEMU.

Patches 3 and 4 in this series address the packet dropping issue and the net
result is a 25% boost in RX performance even in the absence of zero-copy.

Also worth mentioning is that this makes merging virtio into upstream QEMU
significantly easier.

Since v1, we're just rebasing on the new io thread patch set. This series
depends on my IO thread series.

Signed-off-by: Anthony Liguori <ali...@us...>

diff --git a/qemu/hw/pc.h b/qemu/hw/pc.h
index 57d2123..f5157bd 100644
--- a/qemu/hw/pc.h
+++ b/qemu/hw/pc.h
@@ -154,7 +154,6 @@ void isa_ne2000_init(int base, qemu_irq irq, NICInfo *nd);
 
 /* virtio-net.c */
 PCIDevice *virtio_net_init(PCIBus *bus, NICInfo *nd, int devfn);
-void virtio_net_poll(void);
 
 /* virtio-blk.h */
 void *virtio_blk_init(PCIBus *bus, uint16_t vendor, uint16_t device,
diff --git a/qemu/hw/virtio-net.c b/qemu/hw/virtio-net.c
index f727b14..8d26832 100644
--- a/qemu/hw/virtio-net.c
+++ b/qemu/hw/virtio-net.c
@@ -13,7 +13,6 @@
 
 #include "virtio.h"
 #include "net.h"
-#include "pc.h"
 #include "qemu-timer.h"
 
 /* from Linux's virtio_net.h */
@@ -62,15 +61,10 @@ typedef struct VirtIONet
     VirtQueue *tx_vq;
     VLANClientState *vc;
     int can_receive;
-    int tap_fd;
-    struct VirtIONet *next;
-    int do_notify;
     QEMUTimer *tx_timer;
     int tx_timer_active;
 } VirtIONet;
 
-static VirtIONet *VirtIONetHead = NULL;
-
 static VirtIONet *to_virtio_net(VirtIODevice *vdev)
 {
     return (VirtIONet *)vdev;
@@ -105,7 +99,6 @@ static int virtio_net_can_receive(void *opaque)
     return (n->vdev.status & VIRTIO_CONFIG_S_DRIVER_OK) && n->can_receive;
 }
 
-/* -net user receive function */
 static void virtio_net_receive(void *opaque, const uint8_t *buf, int size)
 {
     VirtIONet *n = opaque;
@@ -144,87 +137,6 @@ static void virtio_net_receive(void *opaque, const uint8_t *buf, int size)
     virtio_notify(&n->vdev, n->rx_vq);
 }
 
-/* -net tap receive handler */
-void virtio_net_poll(void)
-{
-    VirtIONet *vnet;
-    int len;
-    fd_set rfds;
-    struct timeval tv;
-    int max_fd = -1;
-    VirtQueueElement elem;
-    struct virtio_net_hdr *hdr;
-    int did_notify;
-
-    FD_ZERO(&rfds);
-    tv.tv_sec = 0;
-    tv.tv_usec = 0;
-
-    while (1) {
-
-        // Prepare the set of device to select from
-        for (vnet = VirtIONetHead; vnet; vnet = vnet->next) {
-
-            if (vnet->tap_fd == -1)
-                continue;
-
-            vnet->do_notify = 0;
-            //first check if the driver is ok
-            if (!virtio_net_can_receive(vnet))
-                continue;
-
-            /* FIXME: the drivers really need to set their status better */
-            if (vnet->rx_vq->vring.avail == NULL) {
-                vnet->can_receive = 0;
-                continue;
-            }
-
-            FD_SET(vnet->tap_fd, &rfds);
-            if (max_fd < vnet->tap_fd) max_fd = vnet->tap_fd;
-        }
-
-        if (select(max_fd + 1, &rfds, NULL, NULL, &tv) <= 0)
-            break;
-
-        // Now check who has data pending in the tap
-        for (vnet = VirtIONetHead; vnet; vnet = vnet->next) {
-
-            if (!FD_ISSET(vnet->tap_fd, &rfds))
-                continue;
-
-            if (virtqueue_pop(vnet->rx_vq, &elem) == 0) {
-                vnet->can_receive = 0;
-                continue;
-            }
-
-            hdr = (void *)elem.in_sg[0].iov_base;
-            hdr->flags = 0;
-            hdr->gso_type = VIRTIO_NET_HDR_GSO_NONE;
-again:
-            len = readv(vnet->tap_fd, &elem.in_sg[1], elem.in_num - 1);
-            if (len == -1) {
-                if (errno == EINTR || errno == EAGAIN)
-                    goto again;
-                else
-                    fprintf(stderr, "reading network error %d", len);
-            }
-            virtqueue_push(vnet->rx_vq, &elem, sizeof(*hdr) + len);
-            vnet->do_notify = 1;
-        }
-
-        /* signal other side */
-        did_notify = 0;
-        for (vnet = VirtIONetHead; vnet; vnet = vnet->next)
-            if (vnet->do_notify) {
-                virtio_notify(&vnet->vdev, vnet->rx_vq);
-                did_notify++;
-            }
-        if (!did_notify)
-            break;
-    }
-
-}
-
 /* TX */
 static void virtio_net_flush_tx(VirtIONet *n, VirtQueue *vq)
 {
@@ -303,12 +215,6 @@ PCIDevice *virtio_net_init(PCIBus *bus, NICInfo *nd, int devfn)
     memcpy(n->mac, nd->macaddr, 6);
     n->vc = qemu_new_vlan_client(nd->vlan, virtio_net_receive,
                                  virtio_net_can_receive, n);
-    n->tap_fd = hack_around_tap(n->vc->vlan->first_client);
-    if (n->tap_fd != -1) {
-        n->next = VirtIONetHead;
-        //push the device on top of the list
-        VirtIONetHead = n;
-    }
     n->tx_timer = qemu_new_timer(vm_clock, virtio_net_tx_timer, n);
     n->tx_timer_active = 0;
 
diff --git a/qemu/vl.c b/qemu/vl.c
index bcf893f..b8ce485 100644
--- a/qemu/vl.c
+++ b/qemu/vl.c
@@ -3966,15 +3966,8 @@ typedef struct TAPState {
     VLANClientState *vc;
     int fd;
     char down_script[1024];
-    int no_poll;
 } TAPState;
 
-static int tap_read_poll(void *opaque)
-{
-    TAPState *s = opaque;
-    return (!s->no_poll);
-}
-
 static void tap_receive(void *opaque, const uint8_t *buf, int size)
 {
     TAPState *s = opaque;
@@ -4008,22 +4001,6 @@ static void tap_send(void *opaque)
     }
 }
 
-int hack_around_tap(void *opaque)
-{
-    VLANClientState *vc = opaque;
-    TAPState *ts = vc->opaque;
-
-    if (vc->fd_read != tap_receive)
-        return -1;
-
-    if (ts) {
-        ts->no_poll = 1;
-        return ts->fd;
-    }
-
-    return -1;
-}
-
 /* fd support */
 
 static TAPState *net_tap_fd_init(VLANState *vlan, int fd)
@@ -4034,10 +4011,8 @@ static TAPState *net_tap_fd_init(VLANState *vlan, int fd)
     if (!s)
         return NULL;
     s->fd = fd;
-    s->no_poll = 0;
-    enable_sigio_timer(fd);
     s->vc = qemu_new_vlan_client(vlan, tap_receive, NULL, s);
-    qemu_set_fd_handler2(s->fd, tap_read_poll, tap_send, NULL, s);
+    qemu_set_fd_handler2(s->fd, NULL, tap_send, NULL, s);
     snprintf(s->vc->info_str, sizeof(s->vc->info_str), "tap: fd=%d", fd);
     return s;
 }
@@ -7972,10 +7947,7 @@ void main_loop_wait(int timeout)
         slirp_select_poll(&rfds, &wfds, &xfds);
     }
 #endif
-    virtio_net_poll();
-
     qemu_aio_poll();
-
     if (vm_running) {
         qemu_run_timers(&active_timers[QEMU_TIMER_VIRTUAL],
                         qemu_get_clock(vm_clock));
From: Anthony L. <ali...@us...> - 2008-05-04 20:22:22
In the final patch of this series, we rely on a VLAN client's fd_can_read method
to avoid dropping packets. Unfortunately, virtio's fd_can_read method is not
very accurate at the moment. This patch addresses this.

It also generates a notification to the IO thread when more RX packets become
available. If we say we can't receive a packet because no RX buffers are
available, this may result in the tap file descriptor not being select()'d.
Without notifying the IO thread, we may have to wait until the select() times
out before we can receive a packet (even if there is one pending).

This particular change makes RX performance very consistent.

Signed-off-by: Anthony Liguori <ali...@us...>

diff --git a/qemu/hw/virtio-net.c b/qemu/hw/virtio-net.c
index 8d26832..5538979 100644
--- a/qemu/hw/virtio-net.c
+++ b/qemu/hw/virtio-net.c
@@ -14,6 +14,7 @@
 #include "virtio.h"
 #include "net.h"
 #include "qemu-timer.h"
+#include "qemu-kvm.h"
 
 /* from Linux's virtio_net.h */
 
@@ -60,11 +61,14 @@ typedef struct VirtIONet
     VirtQueue *rx_vq;
     VirtQueue *tx_vq;
     VLANClientState *vc;
-    int can_receive;
     QEMUTimer *tx_timer;
     int tx_timer_active;
 } VirtIONet;
 
+/* TODO
+ * - we could suppress RX interrupt if we were so inclined.
+ */
+
 static VirtIONet *to_virtio_net(VirtIODevice *vdev)
 {
     return (VirtIONet *)vdev;
@@ -88,15 +92,24 @@ static uint32_t virtio_net_get_features(VirtIODevice *vdev)
 
 static void virtio_net_handle_rx(VirtIODevice *vdev, VirtQueue *vq)
 {
-    VirtIONet *n = to_virtio_net(vdev);
-    n->can_receive = 1;
+    /* We now have RX buffers, signal to the IO thread to break out of the
+       select to re-poll the tap file descriptor */
+    if (kvm_enabled())
+        qemu_kvm_notify_work();
 }
 
 static int virtio_net_can_receive(void *opaque)
 {
     VirtIONet *n = opaque;
 
-    return (n->vdev.status & VIRTIO_CONFIG_S_DRIVER_OK) && n->can_receive;
+    if (n->rx_vq->vring.avail == NULL ||
+        !(n->vdev.status & VIRTIO_CONFIG_S_DRIVER_OK))
+        return 0;
+
+    if (n->rx_vq->vring.avail->idx == n->rx_vq->last_avail_idx)
+        return 0;
+
+    return 1;
 }
 
 static void virtio_net_receive(void *opaque, const uint8_t *buf, int size)
@@ -106,15 +119,8 @@ static void virtio_net_receive(void *opaque, const uint8_t *buf, int size)
     struct virtio_net_hdr *hdr;
     int offset, i;
 
-    /* FIXME: the drivers really need to set their status better */
-    if (n->rx_vq->vring.avail == NULL) {
-        n->can_receive = 0;
-        return;
-    }
-
     if (virtqueue_pop(n->rx_vq, &elem) == 0) {
-        /* wait until the guest adds some rx bufs */
-        n->can_receive = 0;
+        fprintf(stderr, "virtio_net: this should not happen\n");
         return;
     }
 
@@ -209,9 +215,8 @@ PCIDevice *virtio_net_init(PCIBus *bus, NICInfo *nd, int devfn)
     n->vdev.update_config = virtio_net_update_config;
     n->vdev.get_features = virtio_net_get_features;
 
-    n->rx_vq = virtio_add_queue(&n->vdev, 512, virtio_net_handle_rx);
+    n->rx_vq = virtio_add_queue(&n->vdev, 128, virtio_net_handle_rx);
     n->tx_vq = virtio_add_queue(&n->vdev, 128, virtio_net_handle_tx);
-    n->can_receive = 0;
     memcpy(n->mac, nd->macaddr, 6);
     n->vc = qemu_new_vlan_client(nd->vlan, virtio_net_receive,
                                  virtio_net_can_receive, n);
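The new virtio_net_can_receive() test boils down to comparing the index the guest has published with the index the device has consumed. A stripped-down sketch of that check, outside of QEMU, might look like this. The demo_* types are simplified stand-ins invented for illustration; only the field names avail, idx and last_avail_idx follow the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the guest-visible "available" ring header. */
struct demo_avail {
    uint16_t flags;
    uint16_t idx;            /* next slot the guest will fill */
};

/* Simplified stand-in for the device-side queue state. */
struct demo_vq {
    struct demo_avail *avail; /* NULL until the guest sets up the ring */
    uint16_t last_avail_idx;  /* next slot the device will consume */
};

/* Returns 1 if at least one RX buffer is available, 0 otherwise.
 * driver_ok models the VIRTIO_CONFIG_S_DRIVER_OK status bit. */
int demo_can_receive(const struct demo_vq *vq, int driver_ok)
{
    if (vq->avail == NULL || !driver_ok)
        return 0;

    /* Both indices are free-running 16-bit counters, so comparing for
     * equality stays correct when they wrap around. */
    if (vq->avail->idx == vq->last_avail_idx)
        return 0;

    return 1;
}
```

Deriving the answer from the ring itself, rather than from a cached can_receive flag, is what lets the tap code trust the result before it reads a packet.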
From: Anthony L. <ali...@us...> - 2008-05-04 20:21:47
The select() in the IO thread may wait a long time before rebuilding the
fd set. Whenever we do something that changes the fd set, we should interrupt
the IO thread.

Signed-off-by: Anthony Liguori <ali...@us...>

diff --git a/qemu/vl.c b/qemu/vl.c
index 1192759..e9f0ca4 100644
--- a/qemu/vl.c
+++ b/qemu/vl.c
@@ -260,6 +260,16 @@ static int event_pending = 1;
 
 #define TFR(expr) do { if ((expr) != -1) break; } while (errno == EINTR)
 
+/* KVM runs the main loop in a separate thread. If we update one of the lists
+ * that are polled before or after select(), we need to make sure to break out
+ * of the select() to ensure the new item is serviced.
+ */
+static void main_loop_break(void)
+{
+    if (kvm_enabled())
+        qemu_kvm_notify_work();
+}
+
 void decorate_application_name(char *appname, int max_len)
 {
     if (kvm_enabled())
@@ -5680,6 +5690,7 @@ int qemu_set_fd_handler2(int fd,
         ioh->opaque = opaque;
         ioh->deleted = 0;
     }
+    main_loop_break();
     return 0;
 }
 
@@ -7606,8 +7617,7 @@ void qemu_bh_schedule(QEMUBH *bh)
     if (env) {
         cpu_interrupt(env, CPU_INTERRUPT_EXIT);
     }
-    if (kvm_enabled())
-        qemu_kvm_notify_work();
+    main_loop_break();
 }
 
 void qemu_bh_cancel(QEMUBH *bh)
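The general technique behind main_loop_break() is to include a wakeup descriptor in the set that select() watches and to write to it whenever the set changes. The sketch below shows the idea with a plain pipe(); it is a standalone illustration, not QEMU code, and every demo_* name in it is made up.

```c
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* wake_fds[0] is watched by the waiting thread, wake_fds[1] is written
 * by whoever wants to interrupt the wait. */
int wake_fds[2] = { -1, -1 };

int demo_wakeup_init(void)
{
    return pipe(wake_fds);
}

/* Analogue of main_loop_break(): make the pending select() return. */
void demo_main_loop_break(void)
{
    char byte = 1;
    ssize_t len;

    do {
        len = write(wake_fds[1], &byte, 1);
    } while (len == -1 && errno == EINTR);
}

/* Wait for up to timeout_ms.  Returns 1 if woken by a notification,
 * 0 on timeout and -1 on error. */
int demo_wait(int timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    char byte;
    int ret;

    FD_ZERO(&rfds);
    FD_SET(wake_fds[0], &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    ret = select(wake_fds[0] + 1, &rfds, NULL, NULL, &tv);
    if (ret <= 0)
        return ret;

    /* Consume the notification so the next wait blocks again. */
    if (read(wake_fds[0], &byte, 1) != 1)
        return -1;
    return 1;
}
```

A real main loop would add its other descriptors to the same fd_set; the wakeup descriptor only guarantees that a long timeout cannot delay servicing a newly registered handler.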
From: Anthony L. <ali...@us...> - 2008-05-04 20:21:37
QEMU is rather aggressive about exhausting the wait period when selecting.
This is fine when the wait period is low and when there are significant delays
in-between selects as it improves IO throughput.

With the IO thread, there is a very small delay between selects and our wait
period for select is very large. This patch changes main_loop_wait to only
select once before doing the various other things in the main loop. This
generally improves responsiveness of things like SDL but also improves
individual file descriptor throughput quite dramatically.

Signed-off-by: Anthony Liguori <ali...@us...>

diff --git a/qemu/qemu-kvm.c b/qemu/qemu-kvm.c
index e16b261..6a90e68 100644
--- a/qemu/qemu-kvm.c
+++ b/qemu/qemu-kvm.c
@@ -423,24 +423,6 @@ void qemu_kvm_notify_work(void)
         fprintf(stderr, "failed to notify io thread\n");
 }
 
-static int received_signal;
-
-/* QEMU relies on periodically breaking out of select via EINTR to poll for IO
-   and timer signals. Since we're now using a file descriptor to handle
-   signals, select() won't be interrupted by a signal. We need to forcefully
-   break the select() loop when a signal is received hence
-   kvm_check_received_signal(). */
-
-int kvm_check_received_signal(void)
-{
-    if (received_signal) {
-        received_signal = 0;
-        return 1;
-    }
-
-    return 0;
-}
-
 /* If we have signalfd, we mask out the signals we want to handle and then
  * use signalfd to listen for them. We rely on whatever the current signal
  * handler is to dispatch the signals when we receive them.
@@ -474,8 +456,6 @@ static void sigfd_handler(void *opaque)
             pthread_cond_signal(&qemu_aio_cond);
         }
     }
-
-    received_signal = 1;
 }
 
 /* Used to break IO thread out of select */
@@ -497,8 +477,6 @@ static void io_thread_wakeup(void *opaque)
 
         offset += len;
     }
-
-    received_signal = 1;
 }
 
 int kvm_main_loop(void)
diff --git a/qemu/qemu-kvm.h b/qemu/qemu-kvm.h
index e1e461a..34aabd2 100644
--- a/qemu/qemu-kvm.h
+++ b/qemu/qemu-kvm.h
@@ -114,15 +114,6 @@ static inline void kvm_sleep_end(void)
         kvm_mutex_lock();
 }
 
-int kvm_check_received_signal(void);
-
-static inline int kvm_received_signal(void)
-{
-    if (kvm_enabled())
-        return kvm_check_received_signal();
-    return 0;
-}
-
 #if !defined(SYS_signalfd)
 struct signalfd_siginfo {
     uint32_t ssi_signo;
diff --git a/qemu/vl.c b/qemu/vl.c
index e9f0ca4..6935a82 100644
--- a/qemu/vl.c
+++ b/qemu/vl.c
@@ -7946,23 +7946,18 @@ void main_loop_wait(int timeout)
         slirp_select_fill(&nfds, &rfds, &wfds, &xfds);
     }
 #endif
- moreio:
     ret = qemu_select(nfds + 1, &rfds, &wfds, &xfds, &tv);
     if (ret > 0) {
         IOHandlerRecord **pioh;
-        int more = 0;
 
         for(ioh = first_io_handler; ioh != NULL; ioh = ioh->next) {
             if (!ioh->deleted && ioh->fd_read && FD_ISSET(ioh->fd, &rfds)) {
                 ioh->fd_read(ioh->opaque);
-                if (!ioh->fd_read_poll || ioh->fd_read_poll(ioh->opaque))
-                    more = 1;
-                else
+                if (!(ioh->fd_read_poll && ioh->fd_read_poll(ioh->opaque)))
                     FD_CLR(ioh->fd, &rfds);
             }
             if (!ioh->deleted && ioh->fd_write && FD_ISSET(ioh->fd, &wfds)) {
                 ioh->fd_write(ioh->opaque);
-                more = 1;
             }
         }
 
@@ -7976,8 +7971,6 @@ void main_loop_wait(int timeout)
             } else
                 pioh = &ioh->next;
         }
-        if (more && !kvm_received_signal())
-            goto moreio;
     }
 #if defined(CONFIG_SLIRP)
     if (slirp_inited) {
From: Anthony L. <ali...@us...> - 2008-05-04 20:21:23
Normally, tap always reads packets and simply lets the client drop them if it
cannot receive them. For virtio-net, this results in massive packet loss and
about an 80% performance loss in TCP throughput.

This patch modifies qemu_send_packet() to only deliver a packet to a VLAN
client if it doesn't have a fd_can_read method or the fd_can_read method
indicates that it can receive packets. We also return a status of whether
any clients were able to receive the packet.

If no clients were able to receive a packet, we buffer the packet until a
client indicates that it can receive packets again.

This patch also modifies the tap code to only read from the tap fd if at least
one client on the VLAN is able to receive a packet.

Finally, this patch changes the tap code to drain all possible packets from
the tap device when the tap fd is readable.

Signed-off-by: Anthony Liguori <ali...@us...>

diff --git a/qemu/net.h b/qemu/net.h
index 13daa27..dfdf9af 100644
--- a/qemu/net.h
+++ b/qemu/net.h
@@ -29,7 +29,7 @@ VLANClientState *qemu_new_vlan_client(VLANState *vlan,
                                       IOCanRWHandler *fd_can_read,
                                       void *opaque);
 int qemu_can_send_packet(VLANClientState *vc);
-void qemu_send_packet(VLANClientState *vc, const uint8_t *buf, int size);
+int qemu_send_packet(VLANClientState *vc, const uint8_t *buf, int size);
 void qemu_handler_true(void *opaque);
 
 void do_info_network(void);
diff --git a/qemu/vl.c b/qemu/vl.c
index c51d704..f4abea3 100644
--- a/qemu/vl.c
+++ b/qemu/vl.c
@@ -3760,10 +3760,11 @@ int qemu_can_send_packet(VLANClientState *vc1)
     return 0;
 }
 
-void qemu_send_packet(VLANClientState *vc1, const uint8_t *buf, int size)
+int qemu_send_packet(VLANClientState *vc1, const uint8_t *buf, int size)
 {
     VLANState *vlan = vc1->vlan;
     VLANClientState *vc;
+    int ret = -EAGAIN;
 
 #if 0
     printf("vlan %d send:\n", vlan->id);
@@ -3771,9 +3772,14 @@ void qemu_send_packet(VLANClientState *vc1, const uint8_t *buf, int size)
 #endif
     for(vc = vlan->first_client; vc != NULL; vc = vc->next) {
         if (vc != vc1) {
-            vc->fd_read(vc->opaque, buf, size);
+            if (!vc->fd_can_read || vc->fd_can_read(vc->opaque)) {
+                vc->fd_read(vc->opaque, buf, size);
+                ret = 0;
+            }
         }
     }
+
+    return ret;
 }
 
 #if defined(CONFIG_SLIRP)
@@ -3976,6 +3982,8 @@ typedef struct TAPState {
     VLANClientState *vc;
     int fd;
     char down_script[1024];
+    char buf[4096];
+    int size;
 } TAPState;
 
 static void tap_receive(void *opaque, const uint8_t *buf, int size)
@@ -3991,24 +3999,70 @@ static void tap_receive(void *opaque, const uint8_t *buf, int size)
     }
 }
 
+static int tap_can_send(void *opaque)
+{
+    TAPState *s = opaque;
+    VLANClientState *vc;
+    int can_receive = 0;
+
+    /* Check to see if any of our clients can receive a packet */
+    for (vc = s->vc->vlan->first_client; vc; vc = vc->next) {
+        /* Skip ourselves */
+        if (vc == s->vc)
+            continue;
+
+        if (!vc->fd_can_read) {
+            /* no fd_can_read handler, they always can receive */
+            can_receive = 1;
+        } else
+            can_receive = vc->fd_can_read(vc->opaque);
+
+        /* Once someone can receive, we try to send a packet */
+        if (can_receive)
+            break;
+    }
+
+    return can_receive;
+}
+
 static void tap_send(void *opaque)
 {
     TAPState *s = opaque;
-    uint8_t buf[4096];
-    int size;
 
+    /* First try to send any buffered packet */
+    if (s->size > 0) {
+        int err;
+
+        /* If noone can receive the packet, buffer it */
+        err = qemu_send_packet(s->vc, s->buf, s->size);
+        if (err == -EAGAIN)
+            return;
+    }
+
+    /* Read packets until we hit EAGAIN */
+    do {
 #ifdef __sun__
-    struct strbuf sbuf;
-    int f = 0;
-    sbuf.maxlen = sizeof(buf);
-    sbuf.buf = buf;
-    size = getmsg(s->fd, NULL, &sbuf, &f) >=0 ? sbuf.len : -1;
+        struct strbuf sbuf;
+        int f = 0;
+        sbuf.maxlen = sizeof(s->buf);
+        sbuf.buf = s->buf;
+        s->size = getmsg(s->fd, NULL, &sbuf, &f) >=0 ? sbuf.len : -1;
 #else
-    size = read(s->fd, buf, sizeof(buf));
+        s->size = read(s->fd, s->buf, sizeof(s->buf));
 #endif
-    if (size > 0) {
-        qemu_send_packet(s->vc, buf, size);
-    }
+
+        if (s->size == -1 && errno == EINTR)
+            continue;
+
+        if (s->size > 0) {
+            int err;
+
+            /* If noone can receive the packet, buffer it */
+            err = qemu_send_packet(s->vc, s->buf, s->size);
+            if (err == -EAGAIN)
+                break;
+        }
+    } while (s->size > 0);
 }
 
 /* fd support */
@@ -4022,7 +4076,7 @@ static TAPState *net_tap_fd_init(VLANState *vlan, int fd)
         return NULL;
     s->fd = fd;
     s->vc = qemu_new_vlan_client(vlan, tap_receive, NULL, s);
-    qemu_set_fd_handler2(s->fd, NULL, tap_send, NULL, s);
+    qemu_set_fd_handler2(s->fd, tap_can_send, tap_send, NULL, s);
     snprintf(s->vc->info_str, sizeof(s->vc->info_str), "tap: fd=%d", fd);
     return s;
 }
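The delivery rule and the one-packet buffer described in this patch can be modelled in isolation. The sketch below is a simplified illustration with invented demo_* names and types, not the QEMU structures. One deliberate difference from the patch: demo_flush() clears the pending size after a successful resend, whereas the patch leaves s->size to be overwritten by the next read().

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a VLAN client. */
struct demo_client {
    int (*can_read)(void *opaque);   /* may be NULL: always ready */
    void (*read)(void *opaque, const uint8_t *buf, int size);
    void *opaque;
    struct demo_client *next;
};

/* Deliver to every client except the sender that is ready to receive.
 * Returns 0 if at least one client took the packet, -EAGAIN if none. */
int demo_send_packet(struct demo_client *first, struct demo_client *sender,
                     const uint8_t *buf, int size)
{
    struct demo_client *c;
    int ret = -EAGAIN;

    for (c = first; c != NULL; c = c->next) {
        if (c == sender)
            continue;
        if (!c->can_read || c->can_read(c->opaque)) {
            c->read(c->opaque, buf, size);
            ret = 0;
        }
    }
    return ret;
}

/* A source that keeps at most one undelivered packet. */
struct demo_source {
    uint8_t buf[4096];
    int size;                        /* > 0 while a packet is pending */
};

/* Retry the pending packet, if any.  Returns -EAGAIN while it still
 * cannot be delivered, so the caller knows not to read more input. */
int demo_flush(struct demo_source *s, struct demo_client *first,
               struct demo_client *self)
{
    if (s->size > 0) {
        if (demo_send_packet(first, self, s->buf, s->size) == -EAGAIN)
            return -EAGAIN;
        s->size = 0;
    }
    return 0;
}

/* Example receiver used only to exercise the sketch. */
struct demo_state { int ready; int received; };

int demo_state_can_read(void *opaque)
{
    return ((struct demo_state *)opaque)->ready;
}

void demo_state_read(void *opaque, const uint8_t *buf, int size)
{
    (void)buf;
    (void)size;
    ((struct demo_state *)opaque)->received++;
}
```

Holding the packet in the source instead of dropping it is what shifts back-pressure onto the tap device: while nothing can receive, the source simply stops reading.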
From: Anthony L. <ali...@us...> - 2008-05-04 20:20:52
This patch reworks the IO thread to use signalfd() instead of sigtimedwait(). This will eliminate the need to use SIGIO everywhere. In this version of the patch, we use signalfd() when it's available. When it isn't available, we create a separate thread and use sigwaitinfo() to simulate signalfd(). We cannot handle thread-specific signals with signalfd() emulation so also replace SIGUSR1 notifications to the io-thread with an eventfd. Since eventfd isn't always available, use pipe() to emulate eventfd. I've tested Windows and Linux guests with SMP without seeing an obvious regressions. Signed-off-by: Anthony Liguori <ali...@us...> diff --git a/qemu/Makefile.target b/qemu/Makefile.target index 2316c92..db6912e 100644 --- a/qemu/Makefile.target +++ b/qemu/Makefile.target @@ -203,7 +203,7 @@ CPPFLAGS+=-I$(SRC_PATH)/tcg/sparc endif ifeq ($(USE_KVM), 1) -LIBOBJS+=qemu-kvm.o +LIBOBJS+=qemu-kvm.o kvm-compatfd.o endif ifdef CONFIG_SOFTFLOAT LIBOBJS+=fpu/softfloat.o diff --git a/qemu/kvm-compatfd.c b/qemu/kvm-compatfd.c new file mode 100644 index 0000000..3c2be28 --- /dev/null +++ b/qemu/kvm-compatfd.c @@ -0,0 +1,127 @@ +/* + * signalfd/eventfd compatibility + * + * Copyright IBM, Corp. 2008 + * + * Authors: + * Anthony Liguori <ali...@us...> + * + * This work is licensed under the terms of the GNU GPL, version 2. See + * the COPYING file in the top-level directory. 
+ * + */ + +#include "qemu-common.h" +#include "qemu-kvm.h" + +#include <sys/syscall.h> +#include <pthread.h> + +struct sigfd_compat_info +{ + sigset_t mask; + int fd; +}; + +static void *sigwait_compat(void *opaque) +{ + struct sigfd_compat_info *info = opaque; + int err; + + sigprocmask(SIG_BLOCK, &info->mask, NULL); + + do { + siginfo_t siginfo; + + kvm_sleep_begin(); + err = sigwaitinfo(&info->mask, &siginfo); + kvm_sleep_end(); + + if (err == -1 && errno == EINTR) + continue; + + if (err > 0) { + char buffer[128]; + size_t offset = 0; + + memcpy(buffer, &err, sizeof(err)); + while (offset < sizeof(buffer)) { + ssize_t len; + + len = write(info->fd, buffer + offset, + sizeof(buffer) - offset); + if (len == -1 && errno == EINTR) + continue; + + if (len <= 0) { + err = -1; + break; + } + + offset += len; + } + } + } while (err >= 0); + + return NULL; +} + +static int kvm_signalfd_compat(const sigset_t *mask) +{ + pthread_attr_t attr; + pthread_t tid; + struct sigfd_compat_info *info; + int fds[2]; + + info = malloc(sizeof(*info)); + if (info == NULL) { + errno = ENOMEM; + return -1; + } + + if (pipe(fds) == -1) { + free(info); + return -1; + } + + memcpy(&info->mask, mask, sizeof(*mask)); + info->fd = fds[1]; + + pthread_attr_init(&attr); + pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED); + + pthread_create(&tid, &attr, sigwait_compat, info); + + pthread_attr_destroy(&attr); + + return fds[0]; +} + +int kvm_signalfd(const sigset_t *mask) +{ +#if defined(SYS_signalfd) + int ret; + + ret = syscall(SYS_signalfd, -1, mask, _NSIG / 8); + if (!(ret == -1 && errno == ENOSYS)) + return ret; +#endif + + return kvm_signalfd_compat(mask); +} + +int kvm_eventfd(int *fds) +{ +#if defined(SYS_eventfd) + int ret; + + ret = syscall(SYS_eventfd, 0); + if (ret >= 0) { + fds[0] = fds[1] = ret; + return 0; + } else if (!(ret == -1 && errno == ENOSYS)) + return ret; +#endif + + return pipe(fds); +} diff --git a/qemu/qemu-kvm.c b/qemu/qemu-kvm.c index 9a9bf59..e16b261 
100644 --- a/qemu/qemu-kvm.c +++ b/qemu/qemu-kvm.c @@ -12,6 +12,9 @@ int kvm_allowed = 1; int kvm_irqchip = 1; int kvm_pit = 1; +#include "qemu-common.h" +#include "console.h" + #include <string.h> #include "hw/hw.h" #include "sysemu.h" @@ -38,14 +41,6 @@ __thread struct vcpu_info *vcpu; static int qemu_system_ready; -struct qemu_kvm_signal_table { - sigset_t sigset; - sigset_t negsigset; -}; - -static struct qemu_kvm_signal_table io_signal_table; -static struct qemu_kvm_signal_table vcpu_signal_table; - #define SIG_IPI (SIGRTMIN+4) struct vcpu_info { @@ -61,6 +56,7 @@ struct vcpu_info { } vcpu_info[256]; pthread_t io_thread; +static int io_thread_fd = -1; static inline unsigned long kvm_get_thread_id(void) { @@ -169,37 +165,23 @@ static int has_work(CPUState *env) return kvm_arch_has_work(env); } -static int kvm_process_signal(int si_signo) -{ - struct sigaction sa; - - switch (si_signo) { - case SIGUSR2: - pthread_cond_signal(&qemu_aio_cond); - break; - case SIGALRM: - case SIGIO: - sigaction(si_signo, NULL, &sa); - sa.sa_handler(si_signo); - break; - } - - return 1; -} - -static int kvm_eat_signal(struct qemu_kvm_signal_table *waitset, CPUState *env, - int timeout) +static int kvm_eat_signal(CPUState *env, int timeout) { struct timespec ts; int r, e, ret = 0; siginfo_t siginfo; + sigset_t waitset; ts.tv_sec = timeout / 1000; ts.tv_nsec = (timeout % 1000) * 1000000; - r = sigtimedwait(&waitset->sigset, &siginfo, &ts); + sigemptyset(&waitset); + sigaddset(&waitset, SIG_IPI); + + r = sigtimedwait(&waitset, &siginfo, &ts); if (r == -1 && (errno == EAGAIN || errno == EINTR) && !timeout) return 0; e = errno; + pthread_mutex_lock(&qemu_mutex); if (env && vcpu) cpu_single_env = vcpu->env; @@ -208,12 +190,12 @@ static int kvm_eat_signal(struct qemu_kvm_signal_table *waitset, CPUState *env, exit(1); } if (r != -1) - ret = kvm_process_signal(siginfo.si_signo); + ret = 1; if (env && vcpu_info[env->cpu_index].stop) { vcpu_info[env->cpu_index].stop = 0; 
vcpu_info[env->cpu_index].stopped = 1; - pthread_kill(io_thread, SIGUSR1); + qemu_kvm_notify_work(); } pthread_mutex_unlock(&qemu_mutex); @@ -224,14 +206,13 @@ static int kvm_eat_signal(struct qemu_kvm_signal_table *waitset, CPUState *env, static void kvm_eat_signals(CPUState *env, int timeout) { int r = 0; - struct qemu_kvm_signal_table *waitset = &vcpu_signal_table; - while (kvm_eat_signal(waitset, env, 0)) + while (kvm_eat_signal(env, 0)) r = 1; if (!r && timeout) { - r = kvm_eat_signal(waitset, env, timeout); + r = kvm_eat_signal(env, timeout); if (r) - while (kvm_eat_signal(waitset, env, 0)) + while (kvm_eat_signal(env, 0)) ; } } @@ -264,9 +245,7 @@ static void pause_all_threads(void) pthread_kill(vcpu_info[i].thread, SIG_IPI); } while (!all_threads_paused()) { - pthread_mutex_unlock(&qemu_mutex); - kvm_eat_signal(&io_signal_table, NULL, 1000); - pthread_mutex_lock(&qemu_mutex); + main_loop_wait(1000); cpu_single_env = NULL; } } @@ -307,6 +286,12 @@ static void setup_kernel_sigmask(CPUState *env) { sigset_t set; + sigemptyset(&set); + sigaddset(&set, SIGUSR2); + sigaddset(&set, SIGIO); + sigaddset(&set, SIGALRM); + sigprocmask(SIG_BLOCK, &set, NULL); + sigprocmask(SIG_BLOCK, NULL, &set); sigdelset(&set, SIG_IPI); @@ -343,7 +328,7 @@ static int kvm_main_loop_cpu(CPUState *env) cpu_single_env = env; while (1) { while (!has_work(env)) - kvm_main_loop_wait(env, 10); + kvm_main_loop_wait(env, 1000); if (env->interrupt_request & CPU_INTERRUPT_HARD) env->hflags &= ~HF_HALTED_MASK; if (!kvm_irqchip_in_kernel(kvm_context) && info->sipi_needed) @@ -391,18 +376,6 @@ static void *ap_main_loop(void *_env) return NULL; } -static void qemu_kvm_init_signal_table(struct qemu_kvm_signal_table *sigtab) -{ - sigemptyset(&sigtab->sigset); - sigfillset(&sigtab->negsigset); -} - -static void kvm_add_signal(struct qemu_kvm_signal_table *sigtab, int signum) -{ - sigaddset(&sigtab->sigset, signum); - sigdelset(&sigtab->negsigset, signum); -} - void kvm_init_new_ap(int cpu, CPUState 
*env) { pthread_create(&vcpu_info[cpu].thread, NULL, ap_main_loop, env); @@ -411,28 +384,12 @@ void kvm_init_new_ap(int cpu, CPUState *env) pthread_cond_wait(&qemu_vcpu_cond, &qemu_mutex); } -static void qemu_kvm_init_signal_tables(void) -{ - qemu_kvm_init_signal_table(&io_signal_table); - qemu_kvm_init_signal_table(&vcpu_signal_table); - - kvm_add_signal(&io_signal_table, SIGIO); - kvm_add_signal(&io_signal_table, SIGALRM); - kvm_add_signal(&io_signal_table, SIGUSR1); - kvm_add_signal(&io_signal_table, SIGUSR2); - - kvm_add_signal(&vcpu_signal_table, SIG_IPI); - - sigprocmask(SIG_BLOCK, &io_signal_table.sigset, NULL); -} - int kvm_init_ap(void) { #ifdef TARGET_I386 kvm_tpr_opt_setup(); #endif qemu_add_vm_change_state_handler(kvm_vm_state_change_handler, NULL); - qemu_kvm_init_signal_tables(); signal(SIG_IPI, sig_ipi_handler); return 0; @@ -440,29 +397,152 @@ int kvm_init_ap(void) void qemu_kvm_notify_work(void) { - if (io_thread) - pthread_kill(io_thread, SIGUSR1); + uint64_t value = 1; + char buffer[8]; + size_t offset = 0; + + if (io_thread_fd == -1) + return; + + memcpy(buffer, &value, sizeof(value)); + + while (offset < 8) { + ssize_t len; + + len = write(io_thread_fd, buffer + offset, 8 - offset); + if (len == -1 && errno == EINTR) + continue; + + if (len <= 0) + break; + + offset += len; + } + + if (offset != 8) + fprintf(stderr, "failed to notify io thread\n"); +} + +static int received_signal; + +/* QEMU relies on periodically breaking out of select via EINTR to poll for IO + and timer signals. Since we're now using a file descriptor to handle + signals, select() won't be interrupted by a signal. We need to forcefully + break the select() loop when a signal is received hence + kvm_check_received_signal(). 
*/ + +int kvm_check_received_signal(void) +{ + if (received_signal) { + received_signal = 0; + return 1; + } + + return 0; } -/* - * The IO thread has all signals that inform machine events - * blocked (io_signal_table), so it won't get interrupted - * while processing in main_loop_wait(). +/* If we have signalfd, we mask out the signals we want to handle and then + * use signalfd to listen for them. We rely on whatever the current signal + * handler is to dispatch the signals when we receive them. */ +static void sigfd_handler(void *opaque) +{ + int fd = (unsigned long)opaque; + struct signalfd_siginfo info; + struct sigaction action; + ssize_t len; + + while (1) { + do { + len = read(fd, &info, sizeof(info)); + } while (len == -1 && errno == EINTR); + + if (len == -1 && errno == EAGAIN) + break; + + if (len != sizeof(info)) { + printf("read from sigfd returned %ld: %m\n", len); + return; + } + + sigaction(info.ssi_signo, NULL, &action); + if (action.sa_handler) + action.sa_handler(info.ssi_signo); + + if (info.ssi_signo == SIGUSR2) { + pthread_cond_signal(&qemu_aio_cond); + } + } + + received_signal = 1; +} + +/* Used to break IO thread out of select */ +static void io_thread_wakeup(void *opaque) +{ + int fd = (unsigned long)opaque; + char buffer[8]; + size_t offset = 0; + + while (offset < 8) { + ssize_t len; + + len = read(fd, buffer + offset, 8 - offset); + if (len == -1 && errno == EINTR) + continue; + + if (len <= 0) + break; + + offset += len; + } + + received_signal = 1; +} + int kvm_main_loop(void) { + int fds[2]; + sigset_t mask; + int sigfd; + io_thread = pthread_self(); qemu_system_ready = 1; - pthread_mutex_unlock(&qemu_mutex); + + if (kvm_eventfd(fds) == -1) { + fprintf(stderr, "failed to create eventfd\n"); + return -errno; + } + + qemu_set_fd_handler2(fds[0], NULL, io_thread_wakeup, NULL, + (void *)(unsigned long)fds[0]); + + io_thread_fd = fds[1]; + + sigemptyset(&mask); + sigaddset(&mask, SIGIO); + sigaddset(&mask, SIGALRM); + sigaddset(&mask, 
SIGUSR2); + sigprocmask(SIG_BLOCK, &mask, NULL); + + sigfd = kvm_signalfd(&mask); + if (sigfd == -1) { + fprintf(stderr, "failed to create signalfd\n"); + return -errno; + } + + fcntl(sigfd, F_SETFL, O_NONBLOCK); + + qemu_set_fd_handler2(sigfd, NULL, sigfd_handler, NULL, + (void *)(unsigned long)sigfd); pthread_cond_broadcast(&qemu_system_cond); + cpu_single_env = NULL; + while (1) { - kvm_eat_signal(&io_signal_table, NULL, 1000); - pthread_mutex_lock(&qemu_mutex); - cpu_single_env = NULL; - main_loop_wait(0); + main_loop_wait(1000); if (qemu_shutdown_requested()) break; else if (qemu_powerdown_requested()) @@ -471,7 +551,6 @@ int kvm_main_loop(void) pthread_kill(vcpu_info[0].thread, SIG_IPI); qemu_kvm_reset_requested = 1; } - pthread_mutex_unlock(&qemu_mutex); } pause_all_threads(); @@ -834,10 +913,7 @@ void qemu_kvm_aio_wait(void) CPUState *cpu_single = cpu_single_env; if (!cpu_single_env) { - pthread_mutex_unlock(&qemu_mutex); - kvm_eat_signal(&io_signal_table, NULL, 1000); - pthread_mutex_lock(&qemu_mutex); - cpu_single_env = NULL; + main_loop_wait(1000); } else { pthread_cond_wait(&qemu_aio_cond, &qemu_mutex); cpu_single_env = cpu_single; @@ -864,3 +940,14 @@ void kvm_cpu_destroy_phys_mem(target_phys_addr_t start_addr, { kvm_destroy_phys_mem(kvm_context, start_addr, size); } + +void kvm_mutex_unlock(void) +{ + pthread_mutex_unlock(&qemu_mutex); +} + +void kvm_mutex_lock(void) +{ + pthread_mutex_lock(&qemu_mutex); + cpu_single_env = NULL; +} diff --git a/qemu/qemu-kvm.h b/qemu/qemu-kvm.h index 024a653..e1e461a 100644 --- a/qemu/qemu-kvm.h +++ b/qemu/qemu-kvm.h @@ -10,6 +10,8 @@ #include "cpu.h" +#include <signal.h> + int kvm_main_loop(void); int kvm_qemu_init(void); int kvm_qemu_create_context(void); @@ -97,4 +99,40 @@ extern kvm_context_t kvm_context; #define qemu_kvm_pit_in_kernel() (0) #endif +void kvm_mutex_unlock(void); +void kvm_mutex_lock(void); + +static inline void kvm_sleep_begin(void) +{ + if (kvm_enabled()) + kvm_mutex_unlock(); +} + +static inline 
void kvm_sleep_end(void) +{ + if (kvm_enabled()) + kvm_mutex_lock(); +} + +int kvm_check_received_signal(void); + +static inline int kvm_received_signal(void) +{ + if (kvm_enabled()) + return kvm_check_received_signal(); + return 0; +} + +#if !defined(SYS_signalfd) +struct signalfd_siginfo { + uint32_t ssi_signo; + uint8_t pad[124]; +}; +#else +#include <linux/signalfd.h> +#endif + +int kvm_signalfd(const sigset_t *mask); +int kvm_eventfd(int *fds); + #endif diff --git a/qemu/vl.c b/qemu/vl.c index 74be059..1192759 100644 --- a/qemu/vl.c +++ b/qemu/vl.c @@ -7836,6 +7836,23 @@ void qemu_system_powerdown_request(void) cpu_interrupt(cpu_single_env, CPU_INTERRUPT_EXIT); } +static int qemu_select(int max_fd, fd_set *rfds, fd_set *wfds, fd_set *xfds, + struct timeval *tv) +{ + int ret; + + /* KVM holds a mutex while QEMU code is running, we need hooks to + release the mutex whenever QEMU code sleeps. */ + + kvm_sleep_begin(); + + ret = select(max_fd, rfds, wfds, xfds, tv); + + kvm_sleep_end(); + + return ret; +} + void main_loop_wait(int timeout) { IOHandlerRecord *ioh; @@ -7907,11 +7924,12 @@ void main_loop_wait(int timeout) } } - tv.tv_sec = 0; #ifdef _WIN32 + tv.tv_sec = 0; tv.tv_usec = 0; #else - tv.tv_usec = timeout * 1000; + tv.tv_sec = timeout / 1000; + tv.tv_usec = (timeout % 1000) * 1000; #endif #if defined(CONFIG_SLIRP) if (slirp_inited) { @@ -7919,7 +7937,7 @@ void main_loop_wait(int timeout) } #endif moreio: - ret = select(nfds + 1, &rfds, &wfds, &xfds, &tv); + ret = qemu_select(nfds + 1, &rfds, &wfds, &xfds, &tv); if (ret > 0) { IOHandlerRecord **pioh; int more = 0; @@ -7948,7 +7966,7 @@ void main_loop_wait(int timeout) } else pioh = &ioh->next; } - if (more) + if (more && !kvm_received_signal()) goto moreio; } #if defined(CONFIG_SLIRP) |
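The main_loop_wait() hunk above changes how the millisecond timeout becomes a struct timeval. The old code stored timeout * 1000 in tv_usec, which leaves the valid microsecond range once the timeout reaches 1000 ms, as it does with main_loop_wait(1000). A minimal sketch of the split, assuming a non-negative timeout; ms_to_timeval is a hypothetical helper name and does not appear in the patch:

```c
#include <sys/time.h>

/* Hypothetical helper (not part of the patch): split a non-negative
 * millisecond timeout into whole seconds plus leftover microseconds,
 * so that tv_usec always stays below 1000000. */
static struct timeval ms_to_timeval(int timeout_ms)
{
    struct timeval tv;

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    return tv;
}
```

With this split, a 1000 ms timeout becomes one second and zero microseconds rather than an out-of-range tv_usec.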
From: Heiko C. <hei...@de...> - 2008-05-04 19:25:41
|
On Sat, May 03, 2008 at 08:47:17PM +0300, Adrian Bunk wrote: > Commit c45a6816c19dee67b8f725e6646d428901a6dc24 > (virtio: explicit advertisement of driver features) > and commit e976a2b997fc4ad70ccc53acfe62811c4aaec851 > (s390: KVM guest: virtio device support, and kvm hypercalls) > don't like each other: > > <-- snip --> > > ... > CC drivers/s390/kvm/kvm_virtio.o > /home/bunk/linux/kernel-2.6/git/linux-2.6/drivers/s390/kvm/kvm_virtio.c:224: error: unknown field 'feature' specified in initializer > /home/bunk/linux/kernel-2.6/git/linux-2.6/drivers/s390/kvm/kvm_virtio.c:224: warning: initialization from incompatible pointer type > make[3]: *** [drivers/s390/kvm/kvm_virtio.o] Error 1 > > <-- snip --> Hmm... this should help: --- drivers/s390/kvm/kvm_virtio.c | 40 +++++++++++++++++++++++----------------- 1 file changed, 23 insertions(+), 17 deletions(-) Index: linux-2.6/drivers/s390/kvm/kvm_virtio.c =================================================================== --- linux-2.6.orig/drivers/s390/kvm/kvm_virtio.c +++ linux-2.6/drivers/s390/kvm/kvm_virtio.c @@ -78,27 +78,32 @@ static unsigned desc_size(const struct k + desc->config_len; } -/* - * This tests (and acknowleges) a feature bit. - */ -static bool kvm_feature(struct virtio_device *vdev, unsigned fbit) +/* This gets the device's feature bits. 
*/ +static u32 kvm_get_features(struct virtio_device *vdev) { + unsigned int i; + u32 features = 0; struct kvm_device_desc *desc = to_kvmdev(vdev)->desc; - u8 *features; + u8 *in_features = kvm_vq_features(desc); - if (fbit / 8 > desc->feature_len) - return false; + for (i = 0; i < min(desc->feature_len * 8, 32); i++) + if (in_features[i / 8] & (1 << (i % 8))) + features |= (1 << i); + return features; +} - features = kvm_vq_features(desc); - if (!(features[fbit / 8] & (1 << (fbit % 8)))) - return false; +static void kvm_set_features(struct virtio_device *vdev, u32 features) +{ + unsigned int i; + struct kvm_device_desc *desc = to_kvmdev(vdev)->desc; + /* Second half of bitmap is features we accept. */ + u8 *out_features = kvm_vq_features(desc) + desc->feature_len; - /* - * We set the matching bit in the other half of the bitmap to tell the - * Host we want to use this feature. - */ - features[desc->feature_len + fbit / 8] |= (1 << (fbit % 8)); - return true; + memset(out_features, 0, desc->feature_len); + for (i = 0; i < min(desc->feature_len * 8, 32); i++) { + if (features & (1 << i)) + out_features[i / 8] |= (1 << (i % 8)); + } } /* @@ -221,7 +226,8 @@ static void kvm_del_vq(struct virtqueue * The config ops structure as defined by virtio config */ static struct virtio_config_ops kvm_vq_configspace_ops = { - .feature = kvm_feature, + .get_features = kvm_get_features, + .set_features = kvm_set_features, .get = kvm_get, .set = kvm_set, .get_status = kvm_get_status, |
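The bit packing done by the two new callbacks can be tried outside the kernel. The sketch below is only an illustration under stated assumptions: it works on a plain byte array instead of a struct kvm_device_desc, and the names bitmap_to_features and features_to_bitmap are hypothetical stand-ins for kvm_get_features() and kvm_set_features().

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for kvm_get_features(): collect up to 32 feature
 * bits from a byte bitmap in which bit i lives in byte i / 8, bit i % 8. */
static uint32_t bitmap_to_features(const uint8_t *in, unsigned int len)
{
    unsigned int i;
    unsigned int nbits = len * 8 < 32 ? len * 8 : 32;
    uint32_t features = 0;

    for (i = 0; i < nbits; i++)
        if (in[i / 8] & (1u << (i % 8)))
            features |= UINT32_C(1) << i;
    return features;
}

/* Hypothetical stand-in for kvm_set_features(): clear the output bitmap,
 * then set one bit for every accepted feature. */
static void features_to_bitmap(uint32_t features, uint8_t *out,
                               unsigned int len)
{
    unsigned int i;
    unsigned int nbits = len * 8 < 32 ? len * 8 : 32;

    memset(out, 0, len);
    for (i = 0; i < nbits; i++)
        if (features & (UINT32_C(1) << i))
            out[i / 8] |= (uint8_t)(1u << (i % 8));
}
```

The sketch shifts an unsigned constant so that bit 31 is handled without signed overflow; that is a choice made here and not a statement about the patch.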
From: Robin H. <ho...@sg...> - 2008-05-04 19:13:43
|
> diff --git a/mm/Kconfig b/mm/Kconfig > --- a/mm/Kconfig > +++ b/mm/Kconfig > @@ -205,3 +205,6 @@ config VIRT_TO_BUS > config VIRT_TO_BUS > def_bool y > depends on !ARCH_NO_VIRT_TO_BUS > + > +config MMU_NOTIFIER > + bool Without some text following the bool keyword, I am not even asked for this config setting on my ia64 build. Thanks, Robin |
From: Bob M. <bm...@md...> - 2008-05-04 19:04:24
|
kvm-62 good to go and fast!!! Thanks Bob -----Original Message----- From: Izik Eidus [mailto:iz...@qu...] Sent: Sunday, May 04, 2008 3:31 AM To: Avi Kivity Cc: Bob Moran; kvm...@li... Subject: Re: [kvm-devel] Widescreen video in KVM Avi Kivity wrote: > Bob Moran wrote: >> The http://kvm.qumranet.com/kvmwiki/FAQ section3 Q13 RE. widescreen >> resolution in KVM, refers me to: >> >> >> http://thread.gmane.org/gmane.comp.emulators.kvm.devel/13557 which >> describes a patch to be applied. >> >> I am not familiar with patch application and unsure where to find the >> file to patch. Any help would be appreciated. >> >> > > The patch is included in kvm-62, so if you use that (or any more recent > release) you should have the functionality included. > > In case you still encounter problems, let us know. > Doesn't it only work with -std-vga? -- woof. |
From: Avi K. <av...@qu...> - 2008-05-04 15:40:42
|
[Resurrecting post from the dead] Marcelo Tosatti wrote: > Forcing clustered APIC mode works only on SMP, and there was high CPU > consumption on Windows SMP guests due to C3 state being reported (fixed > in kvm-30 something). > > So perhaps: > - Faking clustered APIC on SMP > - Faking C3 on UP > > And turning off the TSC bit (for 32-bit guests). > > Is the way to go? > > Avi, do you understand why C3 was causing the Windows SMP problems? > > It's probably inb()ing on the port in a loop. It's not SMP causing the problems, but the ACPI HAL. I'll check this. > /* Common C-state entry for C2, C3, .. */ > static void acpi_cstate_enter(struct acpi_processor_cx *cstate) > { > if (cstate->space_id == ACPI_CSTATE_FFH) { > /* Call into architectural FFH based C-state */ > acpi_processor_ffh_cstate_enter(cstate); > } else { > int unused; > /* IO port based C-state */ > inb(cstate->address); > /* Dummy wait op - must do something useless after P_LVL2 read > because chipsets cannot guarantee that STPCLK# signal > gets asserted in time to freeze execution properly. */ > unused = inl(acpi_gbl_FADT.xpm_timer_block.address); > } > } > > Clearly that inb() won't actually idle under QEMU. So the question is, > if C3 state is reported, that port read should be emulated... But how? > We can now use the KVM_SET_MPSTATE ioctl to halt the vcpu if we see the port read. Since not all hosts support setting mpstate, the bios should only report C3 if the host supports it. -- error compiling committee.c: too many arguments to function |
From: Avi K. <av...@qu...> - 2008-05-04 15:32:45
|
No new architecture support today, but instead there is support for Intel's Extended Page Tables, which increase virtualization performance dramatically on Intel's Nehalem processors. This release also fixes a host oops on AMD when running some 16-bit loaders and applications.

Changes from kvm-67:
- Intel EPT support (Sheng Yang)
- Code cleanups (Harvey Harrison)
- Fix task switch busy bit setting (Izik Eidus)
- Reduce guest idle cpu usage on ppc (Hollis Blanchard)
- Support floating point instructions on ppc (Christian Ehrhardt)
- Fix lmsw emulation
  - fixes host oops on AMD
- Add PIT mode 4 support (Marcelo Tosatti)
  - fixes DragonflyBSD
- Avoid spurious exceptions on state reload (Jan Kiszka)
- Add SVM kvmtrace support (Joerg Roedel)
- Avoid schedule-in-atomic on 2.6.26 hosts (Andrea Arcangeli)
- Handle vma regions with no backing page (Anthony Liguori)
  - yet another step on the way to pci device assignment
- move external module compatibility code into .c file
- build: sync non-x86 kvm headers
- avoid using kernel headers; use installed libc headers instead
- don't exit iothread before all vcpus are stopped (Dor Laor)
- libkvm uninitialized variable fix (Marcelo Tosatti)
- remove old user/config.mak (Jerone Young)
- fix vcpu startup race (Anthony Liguori)
- dump all libkvm errors to stderr (Jan Kiszka)
- fix cross-compilation (Jerone Young)
- fix kvm_show_code() to work on ROM and real-mode (Jan Kiszka)
- allow qemu -kernel option with extboot (Mark McLoughlin)

Notes: If you use the modules bundled with kvm-68, you can use any version of Linux from 2.6.17 upwards. If you use the modules bundled with Linux 2.6.20, you need to use kvm-12. If you use the modules bundled with Linux 2.6.21, you need to use kvm-17. Modules from Linux 2.6.22 and up will work with any kvm version from kvm-22. Some features may only be available in newer releases. For best performance, use Linux 2.6.23-rc2 or later as the host.

http://kvm.qumranet.com |
From: Anthony L. <an...@co...> - 2008-05-04 14:46:11
|
Avi Kivity wrote: > Anthony Liguori wrote: >>> We can keep the signals blocked, but run the signalfd emulation in a >>> separate thread (where it can dequeue signals using sigwait as an >>> added bonus). This will reduce the differences between the two >>> modes at the expense of increased signalfd() emulation complexity, >>> which I think is a good tradeoff. >>> >> >> signalfd() can't be emulated transparently with a separate thread >> because you won't be able to wait on signals destined for the >> specific thread (only signals sent to the process). We deliver >> signals directly to the IO thread (specifically, SIGUSR1) so this >> could get nasty. We could just not block SIGUSR1 and rely on the >> fact that it will break us out of select() but I think that makes things a >> bit more subtle than I'd like. >> > > We can completely kill off SIGUSR1 and replace it with its own pipe. > There's hardly any point in asking the kernel to signal a task, then > having the kernel convert this to a fd write. > > (Or maybe use eventfd()) That's a really good idea. I'll update the patch. Regards, Anthony Liguori |
From: Avi K. <av...@qu...> - 2008-05-04 14:39:35
|
Anthony Liguori wrote: >> We can keep the signals blocked, but run the signalfd emulation in a >> separate thread (where it can dequeue signals using sigwait as an >> added bonus). This will reduce the differences between the two modes >> at the expense of increased signalfd() emulation complexity, which I >> think is a good tradeoff. >> > > signalfd() can't be emulated transparently with a separate thread > because you won't be able to wait on signals destined for the specific > thread (only signals sent to the process). We deliver signals > directly to the IO thread (specifically, SIGUSR1) so this could get > nasty. We could just not block SIGUSR1 and rely on the fact that it > will break us out of select() but I think that makes things a bit more > subtle than I'd like. > We can completely kill off SIGUSR1 and replace it with its own pipe. There's hardly any point in asking the kernel to signal a task, then having the kernel convert this to a fd write. (Or maybe use eventfd()) -- error compiling committee.c: too many arguments to function |
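The idea of replacing SIGUSR1 with a descriptor can be illustrated with a plain pipe. This is a rough sketch and not the code that was merged: it assumes a blocking pipe() pair where the message mentions eventfd() as an alternative, and wakeup_notify and wakeup_wait are hypothetical names chosen for the example.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical notifier: write one 8-byte token to the wakeup fd,
 * retrying on EINTR. Returns 0 on success, -1 on failure. */
static int wakeup_notify(int wfd)
{
    uint64_t value = 1;
    const char *p = (const char *)&value;
    size_t offset = 0;

    while (offset < sizeof(value)) {
        ssize_t len = write(wfd, p + offset, sizeof(value) - offset);

        if (len == -1 && errno == EINTR)
            continue;
        if (len <= 0)
            return -1;
        offset += (size_t)len;
    }
    return 0;
}

/* Hypothetical waiter: sleep in select() for up to timeout_ms.
 * Returns 1 if a token arrived (and consumes it), 0 on timeout,
 * -1 on error (including an interrupted select()). */
static int wakeup_wait(int rfd, int timeout_ms)
{
    struct timeval tv;
    fd_set rfds;
    char buf[8];
    size_t offset = 0;
    int ret;

    FD_ZERO(&rfds);
    FD_SET(rfd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    ret = select(rfd + 1, &rfds, NULL, NULL, &tv);
    if (ret <= 0)
        return ret;

    /* Drain exactly one token so the next select() can sleep again. */
    while (offset < sizeof(buf)) {
        ssize_t len = read(rfd, buf + offset, sizeof(buf) - offset);

        if (len == -1 && errno == EINTR)
            continue;
        if (len <= 0)
            return -1;
        offset += (size_t)len;
    }
    return 1;
}
```

A caller would create the pair with pipe(fds), hand fds[1] to the threads that need to wake the IO thread, and have the IO thread wait on fds[0]. The point of the design is that the wakeup shows up as ordinary readable data in select() and does not rely on a signal interrupting the call.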
From: Anthony L. <an...@co...> - 2008-05-04 14:21:21
|
Avi Kivity wrote: > Please split the signalfd() emulation into a separate (preparatory) > patch. Also, we need to detect signalfd() at run time as well as > compile time, since qemu may be compiled on a different machine than it > is run on. > Ok. > We can keep the signals blocked, but run the signalfd emulation in a > separate thread (where it can dequeue signals using sigwait as an added > bonus). This will reduce the differences between the two modes at the > expense of increased signalfd() emulation complexity, which I think is a > good tradeoff. > signalfd() can't be emulated transparently with a separate thread because you won't be able to wait on signals destined for the specific thread (only signals sent to the process). We deliver signals directly to the IO thread (specifically, SIGUSR1) so this could get nasty. We could just not block SIGUSR1 and rely on the fact that it will break us out of select() but I think that makes things a bit more subtle than I'd like. I personally prefer using pipe() within the same thread although I'm willing to also do the separate thread. Regards, Anthony Liguori > We can move signalfd emulation into a separate file in order to improve > readability. > > |
From: Avi K. <av...@qu...> - 2008-05-04 13:27:50
|
Aurelien Jarno wrote: > The in-kernel PIT emulation ignores pending timers if operating > under mode 3, which for example Hurd uses. > > This mode should output a square wave, high for (N+1)/2 counts and low > for (N-1)/2 counts. As we only care about the resulting interrupts, the > period is N, and mode 3 is the same as mode 2 with regard to > interrupts. > > Applied, thanks. -- error compiling committee.c: too many arguments to function |
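The mode 3 arithmetic quoted above can be checked with a few lines of C. This is an illustrative sketch only: pit_mode3_split and its struct are hypothetical names, n is assumed to be at least 1, and the even-count case (both halves N/2) is an assumption taken from the 8254's documented behaviour and is not stated in the message, which gives the odd-count split.

```c
/* Hypothetical helper: counts spent high and low during one mode 3
 * period for a reload value n (assumed >= 1). */
struct pit_mode3_phases {
    unsigned int high;
    unsigned int low;
};

static struct pit_mode3_phases pit_mode3_split(unsigned int n)
{
    struct pit_mode3_phases p;

    p.high = (n + 1) / 2;   /* (N+1)/2 for odd N, N/2 for even N */
    p.low = n - p.high;     /* (N-1)/2 for odd N, N/2 for even N */
    return p;
}
```

In both cases high + low equals n, which is why the interrupt period in mode 3 is the same N counts as in mode 2.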
From: Avi K. <av...@qu...> - 2008-05-04 13:16:47
|
iMil wrote: > Hi, > > Since I upgraded my ubuntu machine to 8.04, > /usr/local/bin/qemu-system-x86_64 segfaults when starting with -net > tap,ifname=tap0 flags. Of course, it's been recompiled. > > $ sudo /usr/local/bin/qemu-system-x86_64 /data/virt/netbsd.img -net > nic,macaddr=00:56:01:02:03:04 -net tap,ifname=tap0,script=/etc/qemu-ifup > Segmentation fault > Please generate a core and post a stacktrace. You'll probably need to set 'ulimit -c unlimited' in order to get a core. -- error compiling committee.c: too many arguments to function |
From: Avi K. <av...@qu...> - 2008-05-04 13:03:53
|
Andrea Arcangeli wrote: > On Fri, May 02, 2008 at 12:28:32PM +0300, Avi Kivity wrote: > >> Applied, thanks. Dynamic allocation for the fpu state was introduced in >> 2.6.26-rc, right? >> > > It seems very recent, hit mainline on 30 Apr. > > Also we may want to think if there's something cheaper than fx_save to > trigger a math exception that doesn't alter the fpu state, I didn't > think much about it given it's such a slow path that's probably not > worth changing with something more complicated anyway. And bringing in > a few l1 exclusive cachelines in the cpu should allow the second > instruction to repeat faster than the first. > Oh, it's hardly performance critical. I think it is fine as is. -- error compiling committee.c: too many arguments to function |
From: Avi K. <av...@qu...> - 2008-05-04 13:03:21
|
Anthony Liguori wrote: > QEMU is rather aggressive about exhausting the wait period when selecting. > This is fine when the wait period is low and when there are significant delays > in-between selects as it improves IO throughput. > > With the IO thread, there is a very small delay between selects and our wait > period for select is very large. This patch changes main_loop_wait to only > select once before doing the various other things in the main loop. This > generally improves responsiveness of things like SDL but also improves > individual file descriptor throughput quite dramatically. > > This patch relies on my io-thread-timerfd.patch. > (did you mean signalfd?) Patchset looks good; but as it depends on previous patches I can't apply it yet. -- error compiling committee.c: too many arguments to function |
From: Javier G. G. <ja...@gu...> - 2008-05-04 12:39:46
|
On Friday 02 May 2008, Anthony Liguori wrote: > What we really need is a global configuration file so that individual > users can select these defaults according to what makes sense for them. I favor the idea of writing parameters into the boot image itself. -- Javier |
From: Avi K. <av...@qu...> - 2008-05-04 12:06:57
|
Linus, please pull from repo and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm.git kvm-updates-2.6.26 In addition to a few random fixes this update adds support for Intel Extended Page Tables (EPT), a feature which greatly improves virtualization performance on the new Nehalem processors. The implementation builds on the same code paths as AMD NPT (and real-mode shadow) and is therefore small and safe for inclusion. Andrea Arcangeli (1): KVM: avoid fx_init() schedule in atomic Avi Kivity (2): KVM: x86 emulator: disable writeback on lmsw KVM: MMU: Allow more than PAGES_PER_HPAGE write protections per large page Christian Ehrhardt (1): KVM: ppc: deliver INTERRUPT_FP_UNAVAIL to the guest Glauber Costa (1): x86: KVM geust: make setup_secondary_clock definition dependent on local apic Hollis Blanchard (1): KVM: ppc: Handle guest idle by emulating MSR[WE] writes Izik Eidus (1): KVM: x86: task switch: fix wrong bit setting for the busy flag Jan Kiszka (1): KVM: Avoid spurious execeptions after setting registers Marcelo Tosatti (1): KVM: PIT: support mode 4 Sheng Yang (8): KVM: VMX: EPT Feature Detection KVM: MMU: Move some definitions to a header file KVM: Add kvm_x86_ops get_tdp_level() KVM: MMU: Add EPT support KVM: MMU: Remove #ifdef CONFIG_X86_64 to support 4 level EPT KVM: Export necessary function for EPT KVM: VMX: Prepare an identity page table for EPT in real mode KVM: VMX: Enable EPT feature for KVM arch/powerpc/kvm/booke_guest.c | 6 + arch/powerpc/kvm/powerpc.c | 20 ++- arch/x86/kernel/kvmclock.c | 4 + arch/x86/kvm/i8254.c | 2 + arch/x86/kvm/mmu.c | 89 ++++------ arch/x86/kvm/mmu.h | 37 ++++- arch/x86/kvm/svm.c | 10 + arch/x86/kvm/vmx.c | 375 ++++++++++++++++++++++++++++++++++++++-- arch/x86/kvm/vmx.h | 38 ++++ arch/x86/kvm/x86.c | 22 ++- arch/x86/kvm/x86_emulate.c | 1 + include/asm-powerpc/kvm_host.h | 1 + include/asm-powerpc/kvm_ppc.h | 5 + include/asm-x86/kvm_host.h | 10 +- virt/kvm/kvm_main.c | 1 + 15 files changed, 542 insertions(+), 79 
deletions(-) -- error compiling committee.c: too many arguments to function |
From: Avi K. <av...@qu...> - 2008-05-04 11:49:01
|
Glauber Costa wrote: > since the pv_apic_ops are only present if CONFIG_X86_LOCAL_APIC is compiled > in, kvmclock failed to build without this option. This patch fixes this > > Applied, thanks. -- error compiling committee.c: too many arguments to function |