#479 netkvm.sys stops sending/receiving on Windows Server 2003 VM

open
nobody
None
5
2014-08-25
2009-09-28
Mark Weaver
No

This usually happens within an hour or two of starting the interface. It can be cured temporarily by disabling and re-enabling the adapter within Windows. I've run the Windows interface with the log level set to 2: when traffic stops, it still logs outgoing traffic as normal, but ParaNdis_ProcessRxPath stops being logged. I suspect this is related to the traffic content or timing, as I cannot reproduce it with iperf, only with external traffic to a website hosted on the machine.

What further steps can I take to debug this issue?

Host details:

2 x quad core xeons:

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Xeon(R) CPU E5410 @ 2.33GHz
stepping : 6
cpu MHz : 2327.685
cache size : 6144 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 lahf_lm tpr_shadow vnmi flexpriority
bogomips : 4655.37
clflush size : 64
cache_alignment : 64
address sizes : 38 bits physical, 48 bits virtual
power management:

kernel is 2.6.31 from kernel.org, userspace is debian lenny, all 64-bit
qemu is qemu-kvm-0.10.6

Guest details:
Windows Server 2003 32-bit

qemu is started as:
qemu-system-x86_64 \
  -boot c \
  -drive file="/data/vms/stooge/boot.raw",if=virtio,boot=on,cache=off \
  -m 3072 \
  -smp 1 \
  -vnc 10.80.80.89:2 \
  -k en-gb \
  -net nic,model=virtio,macaddr=DE:AD:BE:EF:11:29 \
  -net tap,ifname=tap0 \
  -localtime \
  -usb -usbdevice tablet \
  -mem-path /hugepages &

Discussion

  • Yan Vugenfirer

    Yan Vugenfirer - 2009-09-29
    1. Could you please attach the log
    2. Could you be more specific on the scenario? Are you running some tests or network application?
    3. You could raise the debug level even further, to level 6 - that would give information about the rings (how much space is left, etc.)
    4. In the code you could add debug prints to ParaNdis5_MiniportISR to check if the driver even receives the interrupt.

    Thanks.

     
  • Yan Vugenfirer

    Yan Vugenfirer - 2009-09-29

    Another thing to test - could you please run the guest without the "/hugepages" option.

     
  • Mark Weaver

    Mark Weaver - 2009-09-29
    1. Could you please attach the log

    Too big for sf.net, I have put a log here:

    http://www.blushingpenguin.com/kvm/netkvm.log.bz2

    During that log, it appeared that outgoing packets were still being transmitted,
    however incoming packets were not being received. This was verified by running
    ping on the guest and using tcpdump on the host. After a while packets started
    being received again. The pattern can be seen with:

    grep received netkvm.log > foo

    Up to 2235.26074219 packets are being pulled out regularly -- generally 1-2 packets
    at a time. After that they start being pulled out irregularly and in greater numbers.
    After 2870.28808594 normal service is resumed.
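
The timestamps quoted above can also be checked mechanically rather than by eye. The following is a minimal sketch, not part of the original report: it assumes each relevant log line contains the word "received" and a decimal timestamp in seconds (such as 2235.26074219), which is a guess at the log layout since the format itself is not shown in this thread. The function name `receive_gaps` and the 5-second threshold are hypothetical choices for illustration.

```python
import re
from typing import Iterable, List, Tuple

# Assumption: the first decimal number on a line is its timestamp in seconds,
# e.g. "2235.26074219". Adjust the pattern if the real log is laid out differently.
TIMESTAMP = re.compile(r"\b(\d+\.\d+)\b")


def receive_gaps(lines: Iterable[str], threshold: float = 5.0) -> List[Tuple[float, float]]:
    """Return (start, end) timestamp pairs between consecutive 'received'
    lines that are more than `threshold` seconds apart."""
    gaps: List[Tuple[float, float]] = []
    previous = None
    for line in lines:
        # Same filter as `grep received netkvm.log`.
        if "received" not in line:
            continue
        match = TIMESTAMP.search(line)
        if match is None:
            continue  # no recognisable timestamp on this line
        current = float(match.group(1))
        if previous is not None and current - previous > threshold:
            gaps.append((previous, current))
        previous = current
    return gaps
```

Calling `receive_gaps(open("netkvm.log", errors="replace"))` on the decompressed log would list the periods during which nothing was logged as received; whether those line up with the 2235 to 2870 window depends on the assumed line format being right.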

    2. Could you be more specific on the scenario? Are you running some tests
      or network application?

    It's running websites under IIS. I tried to reproduce this issue with various
    iperf scenarios but failed to do so.

    3. You could raise debug level even more to level 6 - that would give the
      information about the rings (how much space is left and etc)

    I have raised the level to 7 (the level of the log linked to above).

    4. In the code you could add debug prints to ParaNdis5_MiniportISR to
      check if the driver even receives the interrupt.

    It appears that DEBUG_EXIT_STATUS(7, (ULONG)b); is in the function
    ParaNdis5_MiniportISR so I assume this is sufficient.

    5. Another thing to test - could you please run the guest without "/hugepages"
      option.

    The same issue occurs without hugepages.

     
  • Dor Laor

    Dor Laor - 2009-10-11

    Sorry, /me wrong, I now saw you compiled the git code. Ignore the last comment

     
