linux-iscsi / Support Requests / #32 Kernel Panic when doing sg IO >4k over iscsi

Status: open

Owner: nobody

Labels: None

Priority: 5

Updated: 2006-03-01

Created: 2006-03-01

Creator: glozada

Private: No

I am running FC4 (2.6.11-1.1369_FC4) and have compiled
the linux-iscsi.4.0.2.2 driver for that kernel.

I know I should probably be using open-iscsi, but at
this point in the game it is not an option, or at least
not one that I really want to take.

The target is an Overland REO 4000 virtual tape
device. It has to be a tape device; the panic does not
happen with disk devices. When I use the sg_utils
sg_test_rwbuf command to issue a write/read greater than a
page (4k), the system panics. I can do <=4k sg I/O all
day without issue. This only happens when the iSCSI
connection is across a gigE link; it does not happen
when the connection is across a 10/100 link. The NICs, both
10/100 and gigE, are Intel MB embedded adapters using
the e100 and e1000 drivers respectively. Attached is a capture
of the call trace. This one has me twisting.....ugh!
Any help would be greatly appreciated.

Regards
Godfrey

sg_utils call:

./sg_test_rwbuf --size=4096 /dev/sg0 count=5 => this works
./sg_test_rwbuf --size=8192 /dev/sg0 count=5 => this panics

Discussion

glozada - 2006-03-01

Call trace of Kernel Panic

linux-iscsi-panic.txt

Mike Christie - 2006-03-01

Logged In: YES
user_id=827494

What version of sg_utils is this with?

Nobody/Anonymous - 2006-03-02

Logged In: NO

Does this bug only occur with sg_test_rwbuf? Are other
programs like sg_dd ok?

Nobody/Anonymous - 2006-07-21

Logged In: NO

Where can I get linux-iscsi.4.0.2.2?
Thanks

Scott M. Ferris - 2006-07-21

Logged In: YES
user_id=40524

I'm pretty sure open-iscsi will have the same problem. I
tracked down a similar issue in open-iscsi, which turned
out to be a (bad) interaction between the sg driver using
higher-order kmalloc calls for indirect I/O buffers (to get
physically contiguous RAM), and the tcp_sendpage code doing
get_page/put_page. It may be caused by the scsi_lib/block
code not doing the right thing when ripping apart
scatter/gather lists and rebuilding them 1 page at a time,
since for higher-order allocations certain fields are only
set on the first struct page of the allocation.
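
To make the suspected failure concrete, here is a toy model in Python. It is not kernel code; the Page class and the alloc_pages, get_page, put_page and sendpage helpers are simplified stand-ins named after the kernel functions mentioned above. It assumes, as described, that a higher-order allocation only sets the reference count on its first page, and that the send path takes and drops a reference on each page it is handed individually.

```python
class Page:
    """Toy stand-in for struct page: a reference count and a 'freed' marker."""

    def __init__(self):
        self.count = 0
        self.freed = False


def alloc_pages(order):
    """Model a higher-order allocation: only the first page gets count = 1."""
    pages = [Page() for _ in range(1 << order)]
    pages[0].count = 1
    return pages


def get_page(page):
    page.count += 1


def put_page(page):
    page.count -= 1
    if page.count == 0:
        # In this model the page goes back to the allocator here,
        # even if the original owner still believes it holds the buffer.
        page.freed = True


def sendpage(page):
    """Model the send path: pin the page while queued, unpin when done."""
    get_page(page)
    put_page(page)


def simulate(order):
    """Send every page of one allocation separately; report which got freed."""
    pages = alloc_pages(order)
    for page in pages:
        sendpage(page)
    return [page.freed for page in pages]


if __name__ == "__main__":
    print("4k buffer (order 0):", simulate(0))
    print("8k buffer (order 1):", simulate(1))
```

In this model a 4k buffer (order 0) survives, because its only page starts with a count of 1. With an 8k buffer (order 1) the second page starts at 0, goes to 1 and back to 0 in the send path, and is treated as free while the buffer is still in use. That would be consistent with the <=4k transfers working and the 8k transfers panicking.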

I haven't looked at linux-iscsi in a long time, so I'm not
sure what workaround/fix might be easiest. Some possible
workarounds/fixes:

1) Use SG_FLAG_DIRECT_IO to avoid indirect I/O.
2) Somehow disable the use of sendpage in the iscsi kernel
module (and add a data copy).
3) Change the SG driver to not use higher-order allocations
when the SCSI host has clustering disabled.
4) Figure out if the SCSI/block code is doing something
improper when it rebuilds scatter/gather lists, and make
sure every page has a non-zero page count before being
passed to the network stack.
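
As a rough illustration of workaround 1, the sketch below builds an SG_IO request with the direct I/O flag set, using Python and ctypes. The flag is spelled SG_FLAG_DIRECT_IO in <scsi/sg.h>, and the constants and the sg_io_hdr layout are copied from that header. The READ BUFFER command in data mode, the 20 second timeout, and the helper names build_read_buffer and run_direct are choices made for this example; they are not taken from sg_test_rwbuf. The sg driver may still fall back to indirect I/O (for instance if /proc/scsi/sg/allow_dio is 0), so the sketch checks the info field afterwards.

```python
import ctypes
import fcntl
import mmap

# Values from <scsi/sg.h>
SG_IO = 0x2285
SG_DXFER_FROM_DEV = -3
SG_FLAG_DIRECT_IO = 1
SG_INFO_DIRECT_IO_MASK = 0x6
SG_INFO_DIRECT_IO = 0x2

READ_BUFFER = 0x3C   # SCSI READ BUFFER opcode
MODE_DATA = 0x02     # READ BUFFER "data" mode


class SgIoHdr(ctypes.Structure):
    """Mirror of struct sg_io_hdr from <scsi/sg.h>."""
    _fields_ = [
        ("interface_id", ctypes.c_int),
        ("dxfer_direction", ctypes.c_int),
        ("cmd_len", ctypes.c_ubyte),
        ("mx_sb_len", ctypes.c_ubyte),
        ("iovec_count", ctypes.c_ushort),
        ("dxfer_len", ctypes.c_uint),
        ("dxferp", ctypes.c_void_p),
        ("cmdp", ctypes.c_void_p),
        ("sbp", ctypes.c_void_p),
        ("timeout", ctypes.c_uint),
        ("flags", ctypes.c_uint),
        ("pack_id", ctypes.c_int),
        ("usr_ptr", ctypes.c_void_p),
        ("status", ctypes.c_ubyte),
        ("masked_status", ctypes.c_ubyte),
        ("msg_status", ctypes.c_ubyte),
        ("sb_len_wr", ctypes.c_ubyte),
        ("host_status", ctypes.c_ushort),
        ("driver_status", ctypes.c_ushort),
        ("resid", ctypes.c_int),
        ("duration", ctypes.c_uint),
        ("info", ctypes.c_uint),
    ]


def build_read_buffer(size):
    """Return (hdr, keepalive) for a READ BUFFER of `size` bytes, direct I/O requested."""
    cdb = (ctypes.c_ubyte * 10)(
        READ_BUFFER, MODE_DATA, 0,                               # opcode, mode, buffer id
        0, 0, 0,                                                 # buffer offset
        (size >> 16) & 0xFF, (size >> 8) & 0xFF, size & 0xFF,    # allocation length
        0)                                                       # control
    sense = (ctypes.c_ubyte * 32)()
    data = mmap.mmap(-1, size)            # anonymous mapping, page aligned
    view = (ctypes.c_ubyte * size).from_buffer(data)

    hdr = SgIoHdr()
    hdr.interface_id = ord("S")
    hdr.dxfer_direction = SG_DXFER_FROM_DEV
    hdr.cmd_len = len(cdb)
    hdr.mx_sb_len = len(sense)
    hdr.dxfer_len = size
    hdr.dxferp = ctypes.addressof(view)
    hdr.cmdp = ctypes.addressof(cdb)
    hdr.sbp = ctypes.addressof(sense)
    hdr.timeout = 20000                   # milliseconds
    hdr.flags = SG_FLAG_DIRECT_IO         # ask sg to skip its own bounce buffer
    # The caller must keep these objects alive while hdr is in use.
    return hdr, (cdb, sense, data, view)


def run_direct(dev_path, size):
    """Issue the request; return (scsi_status, whether direct I/O was really used)."""
    hdr, keepalive = build_read_buffer(size)
    with open(dev_path, "rb+", buffering=0) as dev:
        fcntl.ioctl(dev, SG_IO, hdr)
    used_direct = (hdr.info & SG_INFO_DIRECT_IO_MASK) == SG_INFO_DIRECT_IO
    return hdr.status, used_direct
```

A call such as run_direct("/dev/sg0", 8192) would exercise the 8k case from the report. If the second return value is False, the driver fell back to indirect I/O and the workaround was not actually in effect. On the affected setup this should only be tried on a test machine, since a fallback to indirect I/O may hit the same panic.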
