#8 82599 and irq on Xeon 7542

closed
Emil Tantilov
ixgbe (37)
standalone_driver
5
2013-07-09
2011-03-18
Alex Chukharev
No

Hi.
I have a Supermicro server with:
* Intel® 7500 (Boxboro-EX) chipset
* ICH10R
* 4 processors with 6 cores each (Intel(R) Xeon(R) CPU X7542 @ 2.67GHz)

I installed two Intel 10G NICs (82599EB) and the latest ixgbe driver (v3.2.10).

Yesterday I ran a test with pktgen and got some strange results. I generated traffic on server A and sent it through the Supermicro server (B):

1) When I spread the interrupts of B's 10G NIC across 12 cores (the 1st and 2nd processors), I got 100% utilization on all 12 cores at a rate of 5 Gbit/s (~1 mln pps).

2) When I spread the interrupts of B's 10G NIC across 8 cores (6 cores of the 1st processor and 2 cores of the 2nd), I got 100% utilization on all 8 cores at a rate of 7 Gbit/s (~1 mln pps).

3) When I spread the interrupts of B's 10G NIC across 6 cores (all 6 cores of the 1st processor), I did NOT reach 100% utilization and I could forward 9.5 Gbit/s (~1 mln pps).
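
For reference, each of the three layouts above corresponds to a different set of CPU bitmasks written to /proc/irq/<N>/smp_affinity. Below is a minimal sketch of how such a mask can be computed. It assumes (the report does not say this) that cores 0-5 belong to the 1st processor and cores 6-11 to the 2nd; the helper name cpus_to_affinity_mask is made up for illustration.

```python
def cpus_to_affinity_mask(cpus):
    """Build a hex CPU bitmask in the comma-separated 32-bit group
    format accepted by /proc/irq/<N>/smp_affinity."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu          # one bit per CPU number

    groups = []
    while True:
        groups.append("%08x" % (mask & 0xFFFFFFFF))  # lowest 32 bits first
        mask >>= 32
        if mask == 0:
            break
    # The kernel expects the most significant group first.
    return ",".join(reversed(groups))


# ASSUMPTION: cores 0-5 are on the 1st processor, 6-11 on the 2nd.
test1_mask = cpus_to_affinity_mask(range(0, 12))  # test 1: 12 cores
test2_mask = cpus_to_affinity_mask(range(0, 8))   # test 2: 6 + 2 cores
test3_mask = cpus_to_affinity_mask(range(0, 6))   # test 3: 6 cores, one processor
print(test1_mask, test2_mask, test3_mask)
```

With multi-queue NICs each queue vector is usually pinned to a single core, so in practice the function would be called with one CPU per IRQ rather than with the whole range.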

Of course I have heard that spreading interrupts across different processors can make things worse, but not this dramatically.

I ran the test on 2.6.37-gentoo-r1.

Can you tell me how to overcome this problem?

Discussion

  • vkozhevnikov
    2011-03-18

    Hi!

    Repeat test 3), but on the 2nd processor. NUMA systems are sometimes unpredictable...
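
To see which cores belong to which processor before repeating the test, Linux exposes a per-node CPU list in /sys/devices/system/node/node<N>/cpulist, in a form such as 0-5,24-29. The following is a small sketch that expands such a string into individual CPU numbers. The function name parse_cpulist and the sample strings are illustrative only and are not taken from this server.

```python
def parse_cpulist(text):
    """Expand a kernel cpulist string such as '0-5,24-29' into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue                      # tolerate an empty list
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))  # ranges are inclusive
        else:
            cpus.append(int(part))
    return cpus


# Hypothetical sample: the cores of a second 6-core processor.
second_cpu_cores = parse_cpulist("6-11\n")
print(second_cpu_cores)
```

The resulting list can then be used to pin the NIC's interrupts to the cores of one processor only.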

     
  • Alex Chukharev
    2011-03-23

    to vkozhevnikov:

    Last night I spread 5 Gbit/s across two 10G NICs (it behaved like test 3, on the 2nd and 3rd processors), and the server rebooted. Unfortunately I forgot to start netconsole, so I do not have the error output. You wrote about NUMA: can you give some references about its "unpredictable" behavior?

     
  • Alex Chukharev
    2011-03-28

    The server rebooted last night. Here is the output from netconsole:

    [155017.529651] netconsole: remote ethernet address 00:1b:0d:ed:5b:c0
    [155017.529885] console [netcon0] enabled
    [155017.529886] netconsole: network logging started
    [354104.480686] BUG: unable to handle kernel NULL pointer dereference at (null)
    [354104.480760] IP: [<ffffffff814ed1e9>] leaf_info_free_rcu+0x9/0x20
    [354104.480806] PGD 45dd1b067 PUD 45bcbf067 PMD 0
    [354104.480846] Oops: 0002 [#1] SMP
    [354104.480880] last sysfs file: /sys/devices/virtual/net/eth2.235/type
    [354104.480914] CPU 6
    [354104.480923] Modules linked in: netconsole configfs 8021q garp stp ipt_NETFLOW xt_statistic xt_limit xt_NOTRACK xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_state iptable_raw iptable_filter ipt_addrtype xt_dscp xt_string xt_owner xt_multiport xt_iprange xt_hashlimit xt_conntrack xt_DSCP xt_NFQUEUE xt_mark xt_connmark nf_conntrack ip_tables x_tables ixgbe igb psmouse pcspkr serio_raw tpm_tis i2c_i801 tpm iTCO_wdt dca iTCO_vendor_support tpm_bios joydev iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi tg3 e1000 xfs exportfs nfs auth_rpcgss nfs_acl lockd sunrpc jfs reiserfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq raid1 raid0 dm_crypt scsi_wait_scan sl811_hcd usbhid usb_storage hid aic94xx libsas lpfc qla2xxx megaraid_sas megaraid_mbox megaraid_mm megaraid aacraid sx8 DAC960 cciss 3w_9xxx 3w_xxxx mptsas scsi_transport_sas mptfc scsi_transport_fc scsi_tgt mptspi mptscsih mptbase atp870u dc395x qla1280 imm parport dmx3191d gdth advansys initio BusLogic arcmsr aic7xxx aic79xx sata_inic162x sata_mv ahci libahci sata_qstor sata_vsc sata_uli sata_sis sata_sx4 sata_nv sata_via sata_svw sata_sil24 sata_sil sata_promise pata_sl82c105 pata_cs5530 pata_cs5520 pata_via pata_jmicron pata_marvell pata_netcell pata_sc1200 pata_pdc202xx_old pata_triflex pata_atiixp pata_opti pata_amd pata_ali pata_it8213 pata_pcmcia pcmcia pcmcia_core pata_ns87415 pata_ns87410 pata_serverworks pata_cypress pata_oldpiix pata_artop pata_it821x pata_optidma pata_hpt3x2n pata_hpt3x3 pata_hpt37x pata_hpt366 pata_cmd64x pata_efar pata_rz1000 pata_sil680 pata_radisys pata_pdc2027x pata_mpiix [last unloaded: netconsole]
    [354104.483324]
    [354104.483349] Pid: 0, comm: kworker/0:1 Not tainted 2.6.37-gentoo-r1 #1 Supermicro X8QB6/X8QB6
    [354104.483422] RIP: 0010:[<ffffffff814ed1e9>] [<ffffffff814ed1e9>] leaf_info_free_rcu+0x9/0x20
    [354104.483480] RSP: 0018:ffff88007eec3e70 EFLAGS: 00010286
    [354104.483510] RAX: 0000000000000000 RBX: ffff88007eecfce0 RCX: 0000000000000000
    [354104.483561] RDX: ffff8804229a7890 RSI: 0000000000000000 RDI: ffff8804229a7890
    [354104.483611] RBP: ffff88007eec3e70 R08: 0000000000000000 R09: 000142e195063f00
    [354104.483661] R10: ffffffff8158f04d R11: 0000000000000010 R12: ffffffff81750600
    [354104.483711] R13: ffff880427e11da0 R14: ffff88041e3b4680 R15: ffff88007eecfd10
    [354104.483762] FS: 0000000000000000(0000) GS:ffff88007eec0000(0000) knlGS:0000000000000000
    [354104.483814] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    [354104.483846] CR2: 0000000000000000 CR3: 000000045e75e000 CR4: 00000000000006e0
    [354104.483897] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [354104.483946] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    [354104.483997] Process kworker/0:1 (pid: 0, threadinfo ffff88045f97a000, task ffff88045f94c990)
    [354104.484049] Stack:
    [354104.484073] ffff88007eec3ec0 ffffffff810beb6a 0000000000000000 ffff880400000000
    [354104.484138] ffff88007eec3e90 0000000000000048 0000000000000001 0000000000000100
    [354104.484202] 0000000000000009 ffff88045f97bfd8 ffff88007eec3ed0 ffffffff810bedc8
    [354104.484267] Call Trace:
    [354104.484292] <IRQ>
    [354104.484325] [<ffffffff810beb6a>] rcu_process_callbacks+0x11a/0x350
    [354104.484360] [<ffffffff810bedc8>] rcu_process_callbacks+0x28/0x50
    [354104.484397] [<ffffffff81060c47>] do_softirq+0xb7/0x230
    [354104.484434] [<ffffffff8108a4b6>] ? tick_program_event+0x26/0x30
    [354104.484470] [<ffffffff8100cf5c>] call_softirq+0x1c/0x30
    [354104.484502] [<ffffffff8100e995>] do_softirq+0x65/0xa0
    [354104.484533] [<ffffffff81060b05>] irq_exit+0x85/0x90
    [354104.484570] [<ffffffff8157ccc0>] smp_apic_timer_interrupt+0x70/0x9b
    [354104.484604] [<ffffffff8100ca13>] apic_timer_interrupt+0x13/0x20
    [354104.484636] <EOI>
    [354104.484666] [<ffffffff81014b55>] ? mwait_idle+0x85/0xf0
    [354104.484699] [<ffffffff8157903a>] ? atomic_notifier_call_chain+0x1a/0x20
    [354104.484734] [<ffffffff8100ae5b>] cpu_idle+0xbb/0x140
    [354104.484768] [<ffffffff8156c340>] start_secondary+0x1e6/0x1ed
    [354104.484800] Code: 24 e9 3f fd ff ff e8 87 cf b6 ff 0f 1f 80 00 00 00 00 55 48 89 e5 0f 1f 44 00 00 e8 82 52 c4 ff c9 c3 55 48 89 e5 0f 1f 44 00 18 <00> 00 00 04 88 ff ff c4 ff c9 c3 66 66 66 2e 0f 1f 84 00 00 00
    [354104.485112] RIP [<ffffffff814ed1e9>] leaf_info_free_rcu+0x9/0x20
    [354104.485149] RSP <ffff88007eec3e70>
    [354104.485176] CR2: 0000000000000000
    [354104.485455] ---[ end trace 11e00c035bfa014a ]---
    [354104.485517] Kernel panic - not syncing: Fatal exception in interrupt
    [354104.485583] Pid: 0, comm: kworker/0:1 Tainted: G D 2.6.37-gentoo-r1 #1
    [354104.485667] Call Trace:
    [354104.485723] <IRQ> [<ffffffff815723a1>] panic+0x91/0x1a1
    [354104.485827] [<ffffffff8105b135>] ? kmsg_dump+0x145/0x160
    [354104.485891] [<ffffffff815763ba>] oops_end+0xea/0xf0
    [354104.485959] [<ffffffff8103b95b>] no_context+0xfb/0x260
    [354104.486023] [<ffffffff8103bbe5>] bad_area_nosemaphore+0x125/0x1e0
    [354104.486091] [<ffffffff8112f547>] ? add_partial+0x57/0x90
    [354104.486155] [<ffffffff8103bcb3>] bad_area_nosemaphore+0x13/0x20
    [354104.486220] [<ffffffff81578dc7>] do_page_fault+0x337/0x4f0
    [354104.486288] [<ffffffff81477bf7>] ? kfree_skb+0x47/0xa0
    [354104.486351] [<ffffffff81477c8b>] ? consume_skb+0x3b/0x80
    [354104.486421] [<ffffffffa09058bb>] ? ixgbe_poll+0x116b/0x1580 [ixgbe]
    [354104.486489] [<ffffffff81575695>] page_fault+0x25/0x30
    [354104.486553] [<ffffffff814ed1e9>] ? leaf_info_free_rcu+0x9/0x20
    [354104.486620] [<ffffffff810beb6a>] rcu_process_callbacks+0x11a/0x350
    [354104.486686] [<ffffffff810bedc8>] rcu_process_callbacks+0x28/0x50
    [354104.486751] [<ffffffff81060c47>] do_softirq+0xb7/0x230
    [354104.486815] [<ffffffff8108a4b6>] ? tick_program_event+0x26/0x30
    [354104.486880] [<ffffffff8100cf5c>] call_softirq+0x1c/0x30
    [354104.486942] [<ffffffff8100e995>] do_softirq+0x65/0xa0
    [354104.487006] [<ffffffff81060b05>] irq_exit+0x85/0x90
    [354104.487069] [<ffffffff8157ccc0>] smp_apic_timer_interrupt+0x70/0x9b
    [354104.487136] [<ffffffff8100ca13>] apic_timer_interrupt+0x13/0x20
    [354104.487201] <EOI> [<ffffffff81014b55>] ? mwait_idle+0x85/0xf0
    [354104.487302] [<ffffffff8157903a>] ? atomic_notifier_call_chain+0x1a/0x20
    [354104.487368] [<ffffffff8100ae5b>] cpu_idle+0xbb/0x140
    [354104.487431] [<ffffffff8156c340>] start_secondary+0x1e6/0x1ed
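
As a reading aid for the trace above: the number in the "Oops: 0002" line is the x86 page-fault error code, printed in hex. The sketch below decodes its standard bits; the function name decode_page_fault_code is made up for illustration and this is not a kernel API.

```python
def decode_page_fault_code(code):
    """Decode the x86 page-fault error code shown on the 'Oops:' line."""
    return {
        # Bit 0: 1 = protection violation, 0 = page not present.
        "protection_violation": bool(code & 0x01),
        # Bit 1: 1 = the faulting access was a write, 0 = a read.
        "write": bool(code & 0x02),
        # Bit 2: 1 = fault happened in user mode, 0 = kernel mode.
        "user_mode": bool(code & 0x04),
        # Bit 3: 1 = a reserved bit was set in a page-table entry.
        "reserved_bit": bool(code & 0x08),
        # Bit 4: 1 = the fault was caused by an instruction fetch.
        "instruction_fetch": bool(code & 0x10),
    }


# "0002" is copied from the "Oops: 0002 [#1] SMP" line of the log.
info = decode_page_fault_code(int("0002", 16))
print(info)
```

For code 0002 only the write bit is set, i.e. a kernel-mode write to a page that is not present, which matches the NULL pointer dereference reported with CR2 = 0.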

     

    Related

    Bugs: #1

  • assigned to Emil.

     
  • Jacob Keller
    2012-05-04

    Due to the age of this issue, can you please respond within 60 days if this is still a problem? Otherwise this bug will be automatically closed.

     
  • Todd Fujinaka
    2013-07-09

    • status: pending --> closed
     
  • Todd Fujinaka
    2013-07-09

    Closed due to inactivity.