From: John J. <jo...@jo...> - 2011-01-31 18:45:00
As the "goal" is to crash the Linux kernel, I altered the setup a little:

A Dell with a Broadcom 57xx NIC and 2x VMs, both with E1000 "cards". One VM runs gPXE, the other runs OpenFiler, x64, with IET. One additional VM (32-bit Windows XP) runs Wireshark on a pcnet32 NIC.

This should rule out disk, memory or the Realtek NICs on the original mini-ITX board as the cause, unless both sets of hardware have the same issue. Updating the Broadcom driver is not yet done.

Test 1:
IET: 0.17.4-of (the current standard Openfiler install CD, no patches or upgrades)
Result: Crash @ 30% of the install of Windows 7, after a number of "unable to find iscsi task" and "invalid data len"

Test 2:
IET: 1.4.19-of (Openfiler patched to the latest and greatest)
Result: Success! with the install of Windows 7, after a number of "unable to find iscsi task" and "invalid data len". WT? It did not fail?!? I should re-run tests on this version for confirmation.

Test 3:
IET: trunk, 396
Result: Crash @ 3% with the install of Windows 7, after a number of "unable to find iscsi task" and "invalid data len". Sorry, no Wireshark capture of this one. Would have been ideal.
Output:
openfilerdev kernel: [ 1206.553342] ------------[ cut here ]------------
openfilerdev kernel: [ 1206.554125] invalid opcode: 0000 [#1] SMP
openfilerdev kernel: [ 1206.554484] last sysfs file: /sys/devices/virtual/misc/autofs/dev
Jan 31 19:59:33 openfilerdev kernel: [ 1206.537658] iscsi_trgt: BUG at /usr/src/iscsitarget/trunk/kernel/iscsi.c:1144 assert(list_empty(&scsi_cmnd->pdu_list))
Jan 31 19:59:33 openfilerdev kernel: [ 1206.545360] Pid: 4249, comm: istd1 Not tainted 2.6.29.6-0.24.smp.gcc3.4.x86_64 #1
Jan 31 19:59:33 openfilerdev kernel: [ 1206.547082] Call Trace:
Jan 31 19:59:33 openfilerdev kernel: [ 1206.547792] [<ffffffffa04127ec>] cmnd_rx_end+0x268/0x2c8 [iscsi_trgt]
Jan 31 19:59:33 openfilerdev kernel: [ 1206.548582] [<ffffffffa041324b>] istd+0x578/0x1188 [iscsi_trgt]
Jan 31 19:59:33 openfilerdev kernel: [ 1206.549524] [<ffffffff80229e0e>] ? native_load_tls+0x9/0x39
Jan 31 19:59:33 openfilerdev kernel: [ 1206.549989] [<ffffffff8021002f>] ? __switch_to+0xb4/0x361
Jan 31 19:59:33 openfilerdev kernel: [ 1206.550394] [<ffffffff804d252e>] ? tcp_sendpage+0x0/0x5c5
Jan 31 19:59:33 openfilerdev kernel: [ 1206.550785] [<ffffffffa0412cd3>] ? istd+0x0/0x1188 [iscsi_trgt]
Jan 31 19:59:33 openfilerdev kernel: [ 1206.551222] [<ffffffff8025d312>] kthread+0x49/0x72
Jan 31 19:59:33 openfilerdev kernel: [ 1206.551591] [<ffffffff802126ca>] child_rip+0xa/0x20
Jan 31 19:59:33 openfilerdev kernel: [ 1206.551972] [<ffffffff80211fe8>] ? restore_args+0x0/0x30
Jan 31 19:59:33 openfilerdev kernel: [ 1206.552353] [<ffffffff8025d2c9>] ? kthread+0x0/0x72
Jan 31 19:59:33 openfilerdev kernel: [ 1206.552720] [<ffffffff802126c0>] ? child_rip+0x0/0x20
Jan 31 19:59:33 openfilerdev kernel: [ 1206.553342] ------------[ cut here ]------------
Jan 31 19:59:33 openfilerdev kernel: [ 1206.553702] kernel BUG at /usr/src/iscsitarget/trunk/kernel/iscsi.c:1144!
Jan 31 19:59:33 openfilerdev kernel: [ 1206.554125] invalid opcode: 0000 [#1] SMP
Jan 31 19:59:33 openfilerdev kernel: [ 1206.554484] last sysfs file: /sys/devices/virtual/misc/autofs/dev
Jan 31 19:59:33 openfilerdev kernel: [ 1206.554885] CPU 0
Jan 31 19:59:33 openfilerdev kernel: [ 1206.555162] Modules linked in: iscsi_trgt xt_tcpudp nfsd lockd nfs_acl auth_rpcgss iptable_filter ip_tables x_tables autofs4 ipv6 sunrpc iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi aoe binfmt_misc xfs exportfs dm_multipath scsi_dh dm_snapshot dm_mod sbs sbshc battery e1000 sg vmci floppy i2c_piix4 i2c_core ac shpchp button scsi_wait_scan ext3 jbd mptspi scsi_transport_spi mptscsih mptbase raid10 raid456 async_memcpy async_xor xor async_tx ehci_hcd uhci_hcd sata_vsc sata_via sata_uli sata_sx4 sata_svw sata_sis pata_sis sata_sil sata_sil24 sata_qstor sata_promise sata_nv sata_mv sata_inic162x ata_piix ahci libata sd_mod crc_t10dif scsi_mod rtc_cmos rtc_core rtc_lib [last unloaded: iscsi_trgt]
Jan 31 19:59:33 openfilerdev kernel: [ 1206.557150] Pid: 4249, comm: istd1 Not tainted 2.6.29.6-0.24.smp.gcc3.4.x86_64 #1 VMware Virtual Platform
Jan 31 19:59:33 openfilerdev kernel: [ 1206.557150] RIP: 0010:[<ffffffffa04127ec>] [<ffffffffa04127ec>] cmnd_rx_end+0x268/0x2c8 [iscsi_trgt]
Jan 31 19:59:33 openfilerdev kernel: [ 1206.557150] RSP: 0018:ffff88000310ddd0 EFLAGS: 00010282
Jan 31 19:59:33 openfilerdev kernel: [ 1206.557150] RAX: ffff880017a19f40 RBX: ffff8800120663a8 RCX: 0000000000000000
Jan 31 19:59:33 openfilerdev kernel: [ 1206.557150] RDX: 0000000000000000 RSI: ffff8800000b9f40 RDI: ffff88000310df58
Jan 31 19:59:33 openfilerdev kernel: [ 1206.557150] RBP: ffff88000310ddf0 R08: 0000000000000000 R09: ffff8800000ba6c0
Jan 31 19:59:33 openfilerdev kernel: [ 1206.557150] R10: 0000000000008de0 R11: ffff88000108c180 R12: ffff880012066888
Jan 31 19:59:33 openfilerdev kernel: [ 1206.557150] R13: ffff8800120668c0 R14: ffff8800110b2000 R15: 0000000000000000
Jan 31 19:59:33 openfilerdev kernel: [ 1206.557150] FS: 0000000000000000(0000) GS:ffffffff80895080(0000) knlGS:0000000000000000
Jan 31 19:59:33 openfilerdev kernel: [ 1206.557150] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
openfilerdev kernel: [ 1206.557150] Stack:
openfilerdev kernel: [ 1206.557150] ffff88000310df10 ffffffffa041324b ffff8800010133f0 ffffffff80229e0e
openfilerdev kernel: [ 1206.557150] ffff88000310de70 ffffffff8021002f 0000000011285bc0 0000003000000000
openfilerdev kernel: [ 1206.557150] Call Trace:
openfilerdev kernel: [ 1206.557150] [<ffffffffa041324b>] istd+0x578/0x1188 [iscsi_trgt]
openfilerdev kernel: [ 1206.557150] [<ffffffff80229e0e>] ? native_load_tls+0x9/0x39
openfilerdev kernel: [ 1206.557150] [<ffffffff8021002f>] ? __switch_to+0xb4/0x361
openfilerdev kernel: [ 1206.557150] [<ffffffff804d252e>] ? tcp_sendpage+0x0/0x5c5
openfilerdev kernel: [ 1206.557150] [<ffffffffa0412cd3>] ? istd+0x0/0x1188 [iscsi_trgt]
openfilerdev kernel: [ 1206.557150] [<ffffffff8025d312>] kthread+0x49/0x72
openfilerdev kernel: [ 1206.557150] [<ffffffff802126ca>] child_rip+0xa/0x20
openfilerdev kernel: [ 1206.557150] [<ffffffff80211fe8>] ? restore_args+0x0/0x30
openfilerdev kernel: [ 1206.557150] [<ffffffff8025d2c9>] ? kthread+0x0/0x72
openfilerdev kernel: [ 1206.557150] [<ffffffff802126c0>] ? child_rip+0x0/0x20
openfilerdev kernel: [ 1206.557150] Code: 39 43 78 74 2a 48 c7 c1 57 b5 41 a0 ba 78 04 00 00 48 c7 c6 26 aa 41 a0 48 c7 c7 50 aa 41 a0 31 c0 e8 92 78 e3 df e8 22 26 e0 df <0f> 0b eb fe 48 89 df e8 99 eb ff ff 4c 89 e7 e8 93 d2 ff ff eb

Test 4:
IET: trunk, 396
Result: Crash @ 52% with the install of Windows 7, after a number of "unable to find iscsi task" and "invalid data len".

Wireshark capture created. It is however 1GB in size. It compresses to around 400MB...
Wireshark filter: "(src 192.168.1.100 or dst 192.168.1.100) and port 3260"
Any suggestions on what capture filter to apply to get the size down?

Test 5:
IET: trunk, 396
Currently running at 30% completed.
Capture files are at 500MB.... Not good....

Btw, gPXE version 1.0.1 appears to exist, but the web site, etherboot.org, is currently flaky, so I have been unable to download it to test. However, I would think it would be beneficial for IET not to crash, regardless of how good / bad the initiator is.

JJ

> 2011/1/31 John Jore <jo...@jo...>:
>> Hi, thanks for the suggestion. I'll run a memtest overnight and report
>> back the results.
>>
>> I run 2 ESXi servers (around 10-12 VMs) w/mini-itx boards with Intel
>> NIC's) against the same IET and no kernel crashes until this gPXE
>> experiment. If this was a memory problem, I would have expected this to
>> happen more often.
>
> Hmm, I suspect there might be an interop problem with the gpxe iscsi
> initiator and IET and the particular settings you use - hence my
> inquiry for the network capture.
> In particular the InitialR2T and ImmediateData settings in your config
> are unusual, as the exact opposite settings usually provide better
> performance and are thus probably in use in most cases / setups.
>
> Anyway, I think the network capture will tell us more.
>
> Arne
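
For comparing the oops output across repeated runs, it may help to reduce each log to the assertion site plus the frames the kernel considers reliable (those not marked with "?"). The following is only a minimal sketch in Python, assuming syslog lines in the same format as the Test 3 output above. The function name summarize_oops and both regular expressions are illustrative assumptions, not part of IET or any kernel tooling.

```python
import re

# One call-trace frame, for example:
#   [<ffffffffa04127ec>] cmnd_rx_end+0x268/0x2c8 [iscsi_trgt]
# A "? " before the symbol marks a frame the kernel was unsure about.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]{16})>\]\s+(?P<unsure>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)

# The assertion line printed by the target module before the oops.
BUG_RE = re.compile(
    r"BUG at (?P<file>\S+):(?P<line>\d+) assert\((?P<expr>.*)\)\s*$"
)


def summarize_oops(log_text):
    """Return (bug, frames) extracted from syslog text.

    bug is (source_file, line_number, asserted_expression) or None.
    frames is a list of (symbol, module) for frames not marked "?",
    in order of first appearance, with duplicates dropped.
    """
    bug = None
    frames = []
    for line in log_text.splitlines():
        match = BUG_RE.search(line)
        if match and bug is None:
            bug = (match.group("file"), int(match.group("line")),
                   match.group("expr"))
        match = FRAME_RE.search(line)
        if match and not match.group("unsure"):
            frame = (match.group("symbol"), match.group("module"))
            if frame not in frames:
                frames.append(frame)
    return bug, frames


# A few lines copied from the Test 3 output, one log entry per line.
SAMPLE = """\
Jan 31 19:59:33 openfilerdev kernel: [ 1206.537658] iscsi_trgt: BUG at /usr/src/iscsitarget/trunk/kernel/iscsi.c:1144 assert(list_empty(&scsi_cmnd->pdu_list))
Jan 31 19:59:33 openfilerdev kernel: [ 1206.547792] [<ffffffffa04127ec>] cmnd_rx_end+0x268/0x2c8 [iscsi_trgt]
Jan 31 19:59:33 openfilerdev kernel: [ 1206.548582] [<ffffffffa041324b>] istd+0x578/0x1188 [iscsi_trgt]
Jan 31 19:59:33 openfilerdev kernel: [ 1206.549524] [<ffffffff80229e0e>] ? native_load_tls+0x9/0x39
Jan 31 19:59:33 openfilerdev kernel: [ 1206.551222] [<ffffffff8025d312>] kthread+0x49/0x72
"""

if __name__ == "__main__":
    bug, frames = summarize_oops(SAMPLE)
    print(bug)
    for symbol, module in frames:
        # Frames without a module tag belong to the kernel image itself.
        print(symbol, module or "(built-in)")
```

Running the same function over the logs from Test 1, Test 3 and Test 4 would show whether every crash hits the same assertion through the same path, which is easier to compare than the raw dumps.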