From: Brad J. <bjo...@pr...> - 2006-05-18 22:21:31
|
The system running scst crashes when doing I/O to the target from a remote system.

Here is my setup:
My target system has 2 Intel Xeon processors (3.2 GHz) and 1 GB RAM.
It is running Linux 2.6.15.7.
It has scst-0.9.4 and qla2x00-target-26-0.9.3.8 installed.
It has a Qlogic 2312 HBA connected to a switch. This is my FC target host. (My FC Initiator is another x86 system with a Qlogic HBA also connected to the switch.)
For back-end devices it has an LSI FC949X HBA connected to a Hitachi Fibre-channel drive.

Here is my start script:
--------------------------------------------------------
modprobe -v qla2x00tgt
modprobe -v scst_disk
echo "add 2:0:3:0 0" >/proc/scsi_tgt/groups/Default/devices
echo "1" >/sys/class/scsi_host/host5/target_mode_enabled
--------------------------------------------------------

In the script, 2:0:3:0 refers to my Hitachi drive, host5 refers to my Qlogic target-mode port. Everything starts successfully (including the scsi_tgt module, since it is a dependency of scst_disk).

From my initiator system I see the one drive I have exposed. I successfully partition that drive and do mkfs. At this point everything is still fine. I then mount the file system and copy a big file to it. The copy seems to work fine, but at some point shortly after that my target system crashes. There is no oops output in the system log, so I did it again with a remote kgdb attached. Here is the gdb output:

Program received signal SIGILL, Illegal instruction.
__free_pages (page=0xc190a22c, order=0) at mm/page_alloc.c:1055
1055            if (put_page_testzero(page)) {
(gdb) bt
#0  __free_pages (page=0xc190a22c, order=0) at mm/page_alloc.c:1055
#1  0xf8d61efd in scst_release_space (cmd=0xf40a4e58)
    at /root/mid-level/scst-0.9.4/src/scst_lib.c:1430
#2  0xf8d60b2a in scst_free_cmd (cmd=0xf40a4e58, check_retry=1)
    at /root/mid-level/scst-0.9.4/src/scst_lib.c:956
#3  0xf8d599ee in scst_finish_cmd (cmd=0xf40a4e58)
    at /root/mid-level/scst-0.9.4/src/scst_targ.c:2212
#4  0xf8d5a7df in __scst_process_active_cmd (cmd=0xf40a4e58,
    context=<value optimized out>, pflags=0xc046cfb8,
    left_locked=<value optimized out>)
    at /root/mid-level/scst-0.9.4/src/scst_targ.c:2461
#5  0xf8d5aa81 in scst_do_job_active (active_cmd_list=0xf8d756d0,
    pflags=0xc046cfb8, context=268435457)
    at /root/mid-level/scst-0.9.4/src/scst_targ.c:54
#6  0xf8d5af99 in scst_cmd_tasklet (p=<value optimized out>)
    at /root/mid-level/scst-0.9.4/src/scst_targ.c:2672
#7  0xc012d905 in tasklet_action (a=<value optimized out>)
    at kernel/softirq.c:267
#8  0xc012d552 in __do_softirq () at kernel/softirq.c:95
#9  0xc010619e in do_softirq () at arch/i386/kernel/irq.c:187
#10 0xc012d689 in irq_exit () at kernel/softirq.c:169
#11 0xc010604e in do_IRQ (regs=0xc1cf4f48) at arch/i386/kernel/irq.c:110
#12 0xc010499e in common_interrupt () at thread_info.h:91
#13 0xc1cf4000 in ?? ()
#14 0x00000000 in ?? ()

I can reproduce this easily every time. Let me know if you want any further information about this.

...Brad Johnson
|
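The four steps of the start script in the original report can be wrapped so that a failed step stops the sequence. The following is only a sketch: it uses the same commands and paths as the report, while the helper names (`run`, `write_file`, `start_target`) and the `DRY_RUN` switch are hypothetical additions for illustration. With `DRY_RUN=1` the commands are printed instead of executed.

```shell
#!/bin/sh
# Sketch of the reported start script with basic error checking.
# DRY_RUN is a hypothetical switch: when it is 1, commands are only printed.

# run CMD ARGS...: execute the command, or print it in dry-run mode.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# write_file VALUE PATH: write VALUE to a /proc or /sys control file.
write_file() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "echo \"$1\" >$2"
    else
        [ -w "$2" ] || { echo "cannot write $2" >&2; return 1; }
        echo "$1" >"$2"
    fi
}

# start_target H:C:I:L LUN HOST: load the modules, export the device,
# then enable target mode on the given SCSI host.
start_target() {
    run modprobe -v qla2x00tgt || return 1
    run modprobe -v scst_disk || return 1
    write_file "add $1 $2" /proc/scsi_tgt/groups/Default/devices || return 1
    write_file 1 "/sys/class/scsi_host/$3/target_mode_enabled"
}

# Values from the report: Hitachi drive 2:0:3:0 as LUN 0, target port host5.
DRY_RUN=1
start_target 2:0:3:0 0 host5
```

In dry-run mode this prints the same four commands as the original script, which makes it easy to confirm the device address and host number before touching the running system.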
From: Ming Z. <mi...@el...> - 2006-05-18 22:36:06
|
can u try latest cvs code?

Ming
|
From: Ming Z. <min...@gm...> - 2006-05-18 22:38:37
|
1) can u try latest code from scst cvs

2) can u try to export it via scst_fileio module? see if oops in same place.

ming
|
From: Brad J. <bjo...@pr...> - 2006-05-19 14:57:50
|
I tried it with the latest code from CVS (as of 8:00 AM CDT 05/19/2006). It goes a while longer before crashing this time, and fails in a different place. This test again used scst_disk handler. I will next try using the scst_fileio module and will let you know the results.

Here is the GDB output for this failure:

Program received signal SIGILL, Illegal instruction.
__scst_process_active_cmd (cmd=0xea340d14, context=<value optimized out>,
    pflags=0xf1488fd4, left_locked=1)
    at /root/mid-level/cvs_version/src/scst_targ.c:2494
2494            BUG();
(gdb) bt
#0  __scst_process_active_cmd (cmd=0xea340d14, context=<value optimized out>,
    pflags=0xf1488fd4, left_locked=1)
    at /root/mid-level/cvs_version/src/scst_targ.c:2494
#1  0xf8d88b8c in scst_do_job_active (active_cmd_list=0xf8dab8f0,
    pflags=0xf1488fd4, context=268435459)
    at /root/mid-level/cvs_version/src/scst_targ.c:54
#2  0xf8d88f2b in scst_cmd_thread (arg=<value optimized out>)
    at /root/mid-level/cvs_version/src/scst_targ.c:2662
#3  0xc01024d9 in kernel_thread_helper () at arch/i386/kernel/process.c:298

(gdb) print *cmd
$2 = {cmd_list_entry = {next = 0xf8dab900, prev = 0xf8dab900},
  sess = 0xe92800a8, state = 9, sent_to_midlev = 0, ua_ignore = 0,
  atomic = 0, non_atomic_only = 1, internal = 0, retry = 0, blocking = 0,
  data_buf_alloced = 0, expected_values_set = 1, processible_env = 1,
  cmd_flags = 0, tgtt = 0xf8c39f40, dev = 0xf5b1a340, lun = 0,
  tgt_dev = 0xd1b2309c, scsi_req = 0x0, sn = 20668,
  search_cmd_list_entry = {next = 0xe92800d0, prev = 0xe92800d0},
  cdb = "(\000\003\uffff\000?\000\000\b\000\000\000\000\000\000",
  cdb_len = 10,
  queue_type = SCST_CMD_QUEUE_UNTAGGED, timeout = 15000, retries = 0,
  data_direction = DMA_FROM_DEVICE,
  expected_data_direction = DMA_FROM_DEVICE, expected_transfer_len = 4096,
  data_len = 4096, scst_cmd_done = 0xf8d83080 <scst_cmd_done_local>,
  sgv = 0xecc606c0, bufflen = 4096, buffer = 0xecc606d4, use_sg = 1,
  get_sg_buf_entry_num = 0, status = 0 '\0', masked_status = 0 '\0',
  msg_status = 0 '\0', host_status = 0, driver_status = 0,
  sense_buffer = '\0' <repeats 95 times>, tgt_dev_saved = 0x0,
  tgt_resp_flags = 2, resp_data_len = 4096, mgmt_cmd = 0x0,
  extra_cmd_list_entry = {next = 0x100100, prev = 0x200200}, cmd_saved = 0x0,
  tag = 18824, tgt_specific = 0xe32b09bc, tgt_dev_specific = 0x0}
(gdb)

...Brad
|
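The `print *cmd` dump in the 2006-05-19 message can be sanity-checked by hand. Reading the `cdb` string byte by byte, the first byte is `(`, which is 0x28, the READ(10) opcode, and bytes 7 and 8 are `\000` and `\b` (0x08), which READ(10) uses for the transfer length in blocks. The sketch below converts that pair to bytes; the function name `read10_xfer_bytes` is hypothetical, and the default block size of 512 is an assumption about the exported drive, not something the thread states.

```shell
#!/bin/sh
# read10_xfer_bytes HI LO [BLOCK_SIZE]
# HI and LO are CDB bytes 7 and 8 as two-digit hex values without a 0x prefix.
# BLOCK_SIZE defaults to 512 bytes (assumed, not confirmed by the thread).
read10_xfer_bytes() {
    blocks=$(( (0x$1 << 8) | 0x$2 ))
    echo $(( blocks * ${3:-512} ))
}

# Bytes 7 and 8 of the dumped CDB: \000 and \b (0x08).
read10_xfer_bytes 00 08
```

Under the 512-byte assumption this prints 4096, which agrees with `expected_transfer_len`, `data_len` and `bufflen` in the dump, so the command itself looks like an ordinary 4 KiB read.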
From: Vladislav B. <vs...@vl...> - 2006-05-22 12:18:52
Attachments:
exec_bug.diff
|
Thanks. Could you try with the attached patch, please? (On top of the latest CVS)
|
From: Brad J. <bjo...@pr...> - 2006-05-22 23:16:37
|
Vlad,

The patched version has not yet crashed after running bonnie for an hour. But now it doesn't seem to be progressing, with the following messages appearing continuously in the system log:

May 22 16:25:15 localhost kernel: [4614]: scst_set_busy:Sending BUSY status to initiator 21:01:00:e0:8b:a6:54:d1 (cmds count 1, queue_type 0, sess->init_phase 3), probably the system is overloaded
May 22 16:25:15 localhost kernel: [4615]: scst_prepare_space:Unable to allocate or build requested buffer (size 3182592), sending BUSY status

The requested buffer size varies slightly, but is always in the 3 million range.

...Brad
|
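Since the requested buffer size in the `scst_prepare_space` messages varies from one message to the next, it can help to pull every size out of the log and express it in pages. The sketch below does that with `sed` and `awk`; the function name `busy_sizes` is hypothetical, the 4096-byte page size is an assumption (the usual value on i386), and it expects each log message on a single line, as syslog would write it.

```shell
#!/bin/sh
# busy_sizes: read kernel log lines on stdin and, for every
# scst_prepare_space allocation failure, print the requested size in bytes
# and in 4096-byte pages (page size is an assumption).
busy_sizes() {
    sed -n 's/.*scst_prepare_space:Unable to allocate or build requested buffer (size \([0-9][0-9]*\)).*/\1/p' |
    awk -v page=4096 '{ printf "%d bytes = %.1f pages\n", $1, $1 / page }'
}

# The message quoted in the report, joined onto one line.
echo 'May 22 16:25:15 localhost kernel: [4615]: scst_prepare_space:Unable to allocate or build requested buffer (size 3182592), sending BUSY status' | busy_sizes
```

For the size quoted in the report this gives exactly 777 pages, a little over 3 MiB for a single command. On a live system the function would be fed from the kernel log file, whose location depends on the distribution.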
From: Brad J. <bjo...@pr...> - 2006-05-23 13:54:08
|
The initiator system log is full of these messages:

sd 1:0:0:0: timing out command, waited 150s
sd 1:0:0:0: SCSI error: return code = 0x28
end_request: I/O error, dev sda, sector 22334567
sd 1:0:0:0: timing out command, waited 150s
sd 1:0:0:0: SCSI error: return code = 0x8
end_request: I/O error, dev sda, sector 22382295

Here is the sgv file:

Name                  Hit      Total
sgv                     0          0
sgv-4K                  0          0
sgv-8K                  0          0
sgv-16K                 0          0
sgv-32K                 0          0
sgv-64K                 0          0
sgv-128K                0          0
sgv-256K                0          0
sgv-512K                0          0
sgv-1024K               0          0
sgv-2048K               0          0

sgv-clust         2381747    2383002
sgv-clust-4K          974       1040
sgv-clust-8K          459        512
sgv-clust-16K         510        543
sgv-clust-32K         746        787
sgv-clust-64K        8541       8629
sgv-clust-128K         22         32
sgv-clust-256K         61         76
sgv-clust-512K      35599      35714
sgv-clust-1024K    341439     341721
sgv-clust-2048K   1993396    1993948

sgv-dma                 0          0
sgv-dma-4K              0          0
sgv-dma-8K              0          0
sgv-dma-16K             0          0
sgv-dma-32K             0          0
sgv-dma-64K             0          0
sgv-dma-128K            0          0
sgv-dma-256K            0          0
sgv-dma-512K            0          0
sgv-dma-1024K           0          0
sgv-dma-2048K           0          0

big 7052121

...Brad

On Tue, 2006-05-23 at 11:41 +0400, Vladislav Bolkhovitin wrote:
> OK, thanks.
>
> Relating the messages. What is your initiator? Does it have something in
> its logs? 3 Mb data size in commands is a bit too much, Qlogic cards
> usually have troubles to handle it, so you see those messages. Could you
> send me the content of /proc/scsi_tgt/sgv file, please?
|
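The sgv listing in Brad's message can be summarised with a few lines of Python. This is only a sketch: it assumes that "Hit" counts allocations served from the cache and "Total" counts all allocations, which the thread does not state, and the names `parse_sgv` and `misses` are made up for illustration. The sample rows are copied from the listing.

```python
# Sketch only: the meaning of the Hit/Total columns is an assumption.
SAMPLE = """\
Name              Hit      Total
sgv-clust         2381747  2383002
sgv-clust-1024K   341439   341721
sgv-clust-2048K   1993396  1993948
big 7052121
"""


def parse_sgv(text):
    """Return {name: (hit, total)} for each three-column numeric row."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip the header, blank lines, and the two-column "big" row.
        if len(parts) != 3 or not (parts[1].isdigit() and parts[2].isdigit()):
            continue
        stats[parts[0]] = (int(parts[1]), int(parts[2]))
    return stats


def misses(stats, name):
    """Total minus Hit for one row (hypothetical helper)."""
    hit, total = stats[name]
    return total - hit


if __name__ == "__main__":
    stats = parse_sgv(SAMPLE)
    for name, (hit, total) in stats.items():
        print(f"{name}: {hit}/{total} hits, {misses(stats, name)} misses")
```

Under that reading of the columns, nearly every request in the clustered pool was a hit, and the 2048K bucket carries most of the traffic.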
From: Vladislav B. <vs...@vl...> - 2006-05-24 11:04:12
|
Those messages on the initiator are expected. It issues them when it can't complete commands because it keeps receiving BUSY for them from the target, which can't allocate the necessary memory within its limited SG entry count for such huge data sizes.

Brad, how did you get the initiator to send such commands? I have seen data sizes this big only when I sent them manually via sg for testing. Did you manually edit /sys/block/YOUR_DEVICE/queue/max_sectors_kb? What are the current values there and in /sys/block/YOUR_DEVICE/queue/max_hw_sectors_kb? You should set max_sectors_kb to something more sane, like 1024.

Vlad
|
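To put rough numbers on the sizes under discussion, here is a small sketch. It assumes 4096-byte pages (the usual i386 page size) and, as a worst case, one page per SG entry; the thread does not say how many SG entries the target can actually use, so no limit is encoded here. The function names are hypothetical.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on i386


def pages_needed(nbytes, page_size=PAGE_SIZE):
    """Pages required to hold nbytes, rounded up."""
    return -(-nbytes // page_size)  # ceiling division


def max_request_pages(max_sectors_kb, page_size=PAGE_SIZE):
    """Largest request, in whole pages, that a max_sectors_kb setting allows."""
    return (max_sectors_kb * 1024) // page_size


if __name__ == "__main__":
    # 3182592 is the buffer size from the target's "Unable to allocate" log line.
    print(pages_needed(3182592))
    print(max_request_pages(1024))
    print(max_request_pages(32767))
```

The 3182592-byte buffer from the target log works out to exactly 777 pages, whereas a max_sectors_kb of 1024 caps any single request at 256 pages.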
From: Brad J. <bjo...@pr...> - 2006-05-24 15:04:30
|
Both max_sectors_kb and max_hw_sectors_kb are set to 32767. I did not manually edit the values. I am using a Qlogic 2462 4 Gb HBA on the initiator. I don't know if those are its defaults or what, but if this happened to me it will surely happen to someone else in the future. So is the only solution to this just documenting that these values must be less than a certain size? I will lower the values to 1024 and try again. ...Brad |
From: Brad J. <bjo...@pr...> - 2006-05-24 16:26:48
|
After setting max_sectors_kb = 1024 I got the same results (unable to allocate size 1 MB). Also the same results at 512. I had to lower it to 128 to get it to work.

Isn't this a parameter that the target and initiator should negotiate? I know with iSCSI the target negotiates the maximum write buffer size it can handle in a single command.

...Brad

On Wed, 2006-05-24 at 15:03 +0400, Vladislav Bolkhovitin wrote:
> Those messages on the initiator are expected. It issues them when it
> can't perform commands constantly receiving BUSY for them from the
> target, which can't allocate necessary memory in limited SG entries
> count for such huge data sizes.
>
> Brad, how were you managed the initiator to send such commands? I have
> seen such big data sizes only when sent them manually via sg for
> testing. Did you manually edit
> /sys/block/YOUR_DEVICE/queue/max_sectors_kb? What are the current values
> there and in /sys/block/YOUR_DEVICE/queue/max_hw_sectors_kb? You should
> set max_sectors_kb in something more sane like 1024.
>
> Vlad
>
> Brad Johnson wrote:
> > The initiator system log is full of these messages:
> >
> > sd 1:0:0:0: timing out command, waited 150s
> > sd 1:0:0:0: SCSI error: return code = 0x28
> > end_request: I/O error, dev sda, sector 22334567
> > sd 1:0:0:0: timing out command, waited 150s
> > sd 1:0:0:0: SCSI error: return code = 0x8
> > end_request: I/O error, dev sda, sector 22382295
> >
> > Here is the sgv file:
> >
> > Name               Hit      Total
> > sgv                0        0
> >  sgv-4K            0        0
> >  sgv-8K            0        0
> >  sgv-16K           0        0
> >  sgv-32K           0        0
> >  sgv-64K           0        0
> >  sgv-128K          0        0
> >  sgv-256K          0        0
> >  sgv-512K          0        0
> >  sgv-1024K         0        0
> >  sgv-2048K         0        0
> >
> > sgv-clust          2381747  2383002
> >  sgv-clust-4K      974      1040
> >  sgv-clust-8K      459      512
> >  sgv-clust-16K     510      543
> >  sgv-clust-32K     746      787
> >  sgv-clust-64K     8541     8629
> >  sgv-clust-128K    22       32
> >  sgv-clust-256K    61       76
> >  sgv-clust-512K    35599    35714
> >  sgv-clust-1024K   341439   341721
> >  sgv-clust-2048K   1993396  1993948
> >
> > sgv-dma            0        0
> >  sgv-dma-4K        0        0
> >  sgv-dma-8K        0        0
> >  sgv-dma-16K       0        0
> >  sgv-dma-32K       0        0
> >  sgv-dma-64K       0        0
> >  sgv-dma-128K      0        0
> >  sgv-dma-256K      0        0
> >  sgv-dma-512K      0        0
> >  sgv-dma-1024K     0        0
> >  sgv-dma-2048K     0        0
> >
> > big 7052121
> >
> > ...Brad
> >
> > On Tue, 2006-05-23 at 11:41 +0400, Vladislav Bolkhovitin wrote:
> >
> >>OK, thanks.
> >>
> >>Relating the messages. What is your initiator? Does it have something in
> >>its logs? 3 Mb data size in commands is a bit too much, Qlogic cards
> >>usually have troubles to handle it, so you see those messages. Could you
> >>send me the content of /proc/scsi_tgt/sgv file, please?
> >>
> >>Vlad
> >>
> >>Brad Johnson wrote:
> >>
> >>>Vlad,
> >>>
> >>>The patched version has not yet crashed after running bonnie for an
> >>>hour. But now it doesn't seem to be progressing, with the following
> >>>messages appearing continuously in the system log:
> >>>
> >>>May 22 16:25:15 localhost kernel: [4614]: scst_set_busy:Sending BUSY
> >>>status to initiator 21:01:00:e0:8b:a6:54:d1 (cmds count 1, queue_type 0,
> >>>sess->init_phase 3), probably the system is overloaded
> >>>May 22 16:25:15 localhost kernel: [4615]: scst_prepare_space:Unable to
> >>>allocate or build requested buffer (size 3182592), sending BUSY status
> >>>
> >>>The requested buffer size varies slightly, but is always in the 3
> >>>million range.
> >>>
> >>>...Brad
> >>>
> >>>On Mon, 2006-05-22 at 16:08 +0400, Vladislav Bolkhovitin wrote:
> >>>
> >>>>Thanks. Could you try with the attached patch, please? (On top of the
> >>>>latest CVS)
> >>>>
> >>>>Brad Johnson wrote:
> >>>>
> >>>>>I tried it with the latest code from CVS (as of 8:00 AM CDT 05/19/2006).
> >>>>>It goes a while longer before crashing this time, and fails in a
> >>>>>different place. This test again used scst_disk handler. I will next try
> >>>>>using the scst_fileio module and will let you know the results.
> >>>>>Here is the GDB output for this failure:
> >>>>>
> >>>>>Program received signal SIGILL, Illegal instruction.
> >>>>>__scst_process_active_cmd (cmd=0xea340d14, context=<value optimized out>,
> >>>>>    pflags=0xf1488fd4, left_locked=1)
> >>>>>    at /root/mid-level/cvs_version/src/scst_targ.c:2494
> >>>>>2494 BUG();
> >>>>>(gdb) bt
> >>>>>#0 __scst_process_active_cmd (cmd=0xea340d14, context=<value optimized out>,
> >>>>>    pflags=0xf1488fd4, left_locked=1)
> >>>>>    at /root/mid-level/cvs_version/src/scst_targ.c:2494
> >>>>>#1 0xf8d88b8c in scst_do_job_active (active_cmd_list=0xf8dab8f0,
> >>>>>    pflags=0xf1488fd4, context=268435459)
> >>>>>    at /root/mid-level/cvs_version/src/scst_targ.c:54
> >>>>>#2 0xf8d88f2b in scst_cmd_thread (arg=<value optimized out>)
> >>>>>    at /root/mid-level/cvs_version/src/scst_targ.c:2662
> >>>>>#3 0xc01024d9 in kernel_thread_helper () at arch/i386/kernel/process.c:298
> >>>>>
> >>>>>(gdb) print *cmd
> >>>>>$2 = {cmd_list_entry = {next = 0xf8dab900, prev = 0xf8dab900},
> >>>>> sess = 0xe92800a8, state = 9, sent_to_midlev = 0, ua_ignore = 0,
> >>>>> atomic = 0, non_atomic_only = 1, internal = 0, retry = 0, blocking = 0,
> >>>>> data_buf_alloced = 0, expected_values_set = 1, processible_env = 1,
> >>>>> cmd_flags = 0, tgtt = 0xf8c39f40, dev = 0xf5b1a340, lun = 0,
> >>>>> tgt_dev = 0xd1b2309c, scsi_req = 0x0, sn = 20668,
> >>>>> search_cmd_list_entry = {
> >>>>>   next = 0xe92800d0, prev = 0xe92800d0},
> >>>>> cdb = "(\000\003\uffff\000?\000\000\b\000\000\000\000\000\000",
> >>>>> cdb_len = 10,
> >>>>> queue_type = SCST_CMD_QUEUE_UNTAGGED, timeout = 15000, retries = 0,
> >>>>> data_direction = DMA_FROM_DEVICE,
> >>>>> expected_data_direction = DMA_FROM_DEVICE, expected_transfer_len = 4096,
> >>>>> data_len = 4096, scst_cmd_done = 0xf8d83080 <scst_cmd_done_local>,
> >>>>> sgv = 0xecc606c0, bufflen = 4096, buffer = 0xecc606d4, use_sg = 1,
> >>>>> get_sg_buf_entry_num = 0, status = 0 '\0', masked_status = 0 '\0',
> >>>>> msg_status = 0 '\0', host_status = 0, driver_status = 0,
> >>>>> sense_buffer = '\0' <repeats 95 times>,
> >>>>> tgt_dev_saved = 0x0,
> >>>>> tgt_resp_flags = 2, resp_data_len = 4096, mgmt_cmd = 0x0,
> >>>>> extra_cmd_list_entry = {next = 0x100100, prev = 0x200200}, cmd_saved = 0x0,
> >>>>> tag = 18824, tgt_specific = 0xe32b09bc, tgt_dev_specific = 0x0}
> >>>>>(gdb)
> >>>>>
> >>>>>...Brad
> >>>>>
> >>>>>On Thu, 2006-05-18 at 18:38 -0400, Ming Zhang wrote:
> >>>>>
> >>>>>>1) can u try latest code from scst cvs
> >>>>>>
> >>>>>>2) can u try to export it via scst_fileio module? see if oops in same
> >>>>>>place.
> >>>>>>
> >>>>>>ming
|
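The workaround described in the message above (lowering max_sectors_kb on the initiator until the target can allocate the buffer) can be sketched as follows. This is a minimal sketch, not part of SCST: the helper names and the `sysfs_root` parameter are hypothetical and exist only for illustration, and the only assumption about the system is the sysfs layout named in the thread (/sys/block/YOUR_DEVICE/queue/max_sectors_kb and max_hw_sectors_kb). Writing to the real files needs root.

```python
from pathlib import Path


def read_queue_limit(device, name, sysfs_root="/sys"):
    """Return an integer queue limit (in KB), e.g. max_sectors_kb, for a block device."""
    path = Path(sysfs_root) / "block" / device / "queue" / name
    return int(path.read_text().strip())


def clamp_max_sectors_kb(device, wanted_kb, sysfs_root="/sys"):
    """Set max_sectors_kb to wanted_kb, but never above max_hw_sectors_kb.

    Returns the value that was actually written.
    """
    hw_limit = read_queue_limit(device, "max_hw_sectors_kb", sysfs_root)
    new_value = min(wanted_kb, hw_limit)
    path = Path(sysfs_root) / "block" / device / "queue" / "max_sectors_kb"
    path.write_text(f"{new_value}\n")
    return new_value
```

On the initiator from this thread, the call would look like `clamp_max_sectors_kb("sda", 128)`, where `sda` is the device name that appears in the initiator log and 128 is the value that was reported to work.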
From: Vladislav B. <vs...@vl...> - 2006-05-25 10:38:03
|
Brad Johnson wrote:
> After setting max_sectors_kb = 1024 I got the same results (unable to
> allocate size 1 MB). Also the same results at 512. I had to lower it to
> 128 to get it to work.

Which means that there are some problems in the SCST memory allocator. It must be able to allocate memory up to 1 MB more or less seamlessly. I'll check it.

> Isn't this a parameter that the target and initiator should negotiate? I
> know with iSCSI the target negotiates the maximum write buffer size it
> can handle in a single command.

Maybe, but not in the current implementation, where this limit is set statically (as in most SCSI drivers). AFAIK, among all SCSI transports only iSCSI allows such negotiation.

> ...Brad
>
> On Wed, 2006-05-24 at 15:03 +0400, Vladislav Bolkhovitin wrote:
>
>>Those messages on the initiator are expected. It issues them when it
>>can't perform commands constantly receiving BUSY for them from the
>>target, which can't allocate necessary memory in limited SG entries
>>count for such huge data sizes.
>>
>>Brad, how were you managed the initiator to send such commands? I have
>>seen such big data sizes only when sent them manually via sg for
>>testing. Did you manually edit
>>/sys/block/YOUR_DEVICE/queue/max_sectors_kb? What are the current values
>>there and in /sys/block/YOUR_DEVICE/queue/max_hw_sectors_kb? You should
>>set max_sectors_kb in something more sane like 1024.
>>
>>Vlad
|
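To put the buffer sizes from this exchange side by side, the arithmetic below counts how many pages each request spans. It is only a sketch under stated assumptions: a 4096-byte page size (typical for the i386 target described in the thread) and the worst case where no two pages are physically contiguous, so each page needs its own SG entry. The actual SG entry limit in SCST or the Qlogic driver is not given in the thread and is not modelled here; `pages_needed` is a hypothetical helper name.

```python
PAGE_SIZE = 4096  # assumption: 4 KB pages, as on a typical i386 kernel


def pages_needed(nbytes, page_size=PAGE_SIZE):
    """Number of pages (worst-case SG entries) needed to hold nbytes."""
    return -(-nbytes // page_size)  # ceiling division


# Sizes taken from the thread: the failing ~3 MB request and the
# max_sectors_kb settings that were tried (1024, 512, 128 KB).
requests = [
    ("3182592-byte request from the target log", 3182592),
    ("max_sectors_kb = 1024", 1024 * 1024),
    ("max_sectors_kb = 512", 512 * 1024),
    ("max_sectors_kb = 128", 128 * 1024),
]

for label, nbytes in requests:
    print(f"{label}: {pages_needed(nbytes)} pages")
```

Under these assumptions the 3182592-byte request spans exactly 777 pages, against 32 pages for a 128 KB request, which gives a sense of why large commands are so much harder to satisfy from a limited SG list.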
From: Vladislav B. <vs...@vl...> - 2006-05-26 16:43:48
|
Resent, because sf.net bounced the original.
====================================================================

Brad Johnson wrote:
> After setting max_sectors_kb = 1024 I got the same results (unable to
> allocate size 1 MB). Also the same results at 512. I had to lower it to
> 128 to get it to work.

Which means that there are some problems in the SCST memory allocator. It must be able to allocate memory up to 1 MB more or less seamlessly. I'll check it.

> Isn't this a parameter that the target and initiator should negotiate? I
> know with iSCSI the target negotiates the maximum write buffer size it
> can handle in a single command.

Maybe, but not in the current implementation, where this limit is set statically (as in most SCSI drivers). AFAIK, among all SCSI transports only iSCSI allows such negotiation.

> ...Brad
>
> On Wed, 2006-05-24 at 15:03 +0400, Vladislav Bolkhovitin wrote:
>
>>Those messages on the initiator are expected. It issues them when it
>>can't perform commands constantly receiving BUSY for them from the
>>target, which can't allocate necessary memory in limited SG entries
>>count for such huge data sizes.
>>
>>Brad, how were you managed the initiator to send such commands? I have
>>seen such big data sizes only when sent them manually via sg for
>>testing. Did you manually edit
>>/sys/block/YOUR_DEVICE/queue/max_sectors_kb? What are the current values
>>there and in /sys/block/YOUR_DEVICE/queue/max_hw_sectors_kb? You should
>>set max_sectors_kb in something more sane like 1024.
>>
>>Vlad
>>
>>Brad Johnson wrote:
>>
>>>The initiator system log is full of these messages:
>>>
>>>sd 1:0:0:0: timing out command, waited 150s
>>>sd 1:0:0:0: SCSI error: return code = 0x28
>>>end_request: I/O error, dev sda, sector 22334567
>>>sd 1:0:0:0: timing out command, waited 150s
>>>sd 1:0:0:0: SCSI error: return code = 0x8
>>>end_request: I/O error, dev sda, sector 22382295
>>>
>>>Here is the sgv file:
>>>
>>>Name               Hit      Total
>>>sgv                0        0
>>> sgv-4K            0        0
>>> sgv-8K            0        0
>>> sgv-16K           0        0
>>> sgv-32K           0        0
>>> sgv-64K           0        0
>>> sgv-128K          0        0
>>> sgv-256K          0        0
>>> sgv-512K          0        0
>>> sgv-1024K         0        0
>>> sgv-2048K         0        0
>>>
>>>sgv-clust          2381747  2383002
>>> sgv-clust-4K      974      1040
>>> sgv-clust-8K      459      512
>>> sgv-clust-16K     510      543
>>> sgv-clust-32K     746      787
>>> sgv-clust-64K     8541     8629
>>> sgv-clust-128K    22       32
>>> sgv-clust-256K    61       76
>>> sgv-clust-512K    35599    35714
>>> sgv-clust-1024K   341439   341721
>>> sgv-clust-2048K   1993396  1993948
>>>
>>>sgv-dma            0        0
>>> sgv-dma-4K        0        0
>>> sgv-dma-8K        0        0
>>> sgv-dma-16K       0        0
>>> sgv-dma-32K       0        0
>>> sgv-dma-64K       0        0
>>> sgv-dma-128K      0        0
>>> sgv-dma-256K      0        0
>>> sgv-dma-512K      0        0
>>> sgv-dma-1024K     0        0
>>> sgv-dma-2048K     0        0
>>>
>>>big 7052121
>>>
>>>...Brad
>>>
>>>On Tue, 2006-05-23 at 11:41 +0400, Vladislav Bolkhovitin wrote:
>>>
>>>>OK, thanks.
>>>>
>>>>Relating the messages. What is your initiator? Does it have something in
>>>>its logs? 3 Mb data size in commands is a bit too much, Qlogic cards
>>>>usually have troubles to handle it, so you see those messages. Could you
>>>>send me the content of /proc/scsi_tgt/sgv file, please?
>>>>
>>>>Vlad
>>>>
>>>>Brad Johnson wrote:
>>>>
>>>>>Vlad,
>>>>>
>>>>>The patched version has not yet crashed after running bonnie for an
>>>>>hour. But now it doesn't seem to be progressing, with the following
>>>>>messages appearing continuously in the system log:
>>>>>
>>>>>May 22 16:25:15 localhost kernel: [4614]: scst_set_busy:Sending BUSY
>>>>>status to initiator 21:01:00:e0:8b:a6:54:d1 (cmds count 1, queue_type 0,
>>>>>sess->init_phase 3), probably the system is overloaded
>>>>>May 22 16:25:15 localhost kernel: [4615]: scst_prepare_space:Unable to
>>>>>allocate or build requested buffer (size 3182592), sending BUSY status
>>>>>
>>>>>The requested buffer size varies slightly, but is always in the 3
>>>>>million range.
>>>>>
>>>>>...Brad
>>>>>
>>>>>On Mon, 2006-05-22 at 16:08 +0400, Vladislav Bolkhovitin wrote:
>>>>>
>>>>>>Thanks. Could you try with the attached patch, please? (On top of the
>>>>>>latest CVS)
>>>>>>
>>>>>>Brad Johnson wrote:
>>>>>>
>>>>>>>I tried it with the latest code from CVS (as of 8:00 AM CDT 05/19/2006).
>>>>>>>It goes a while longer before crashing this time, and fails in a
>>>>>>>different place. This test again used scst_disk handler. I will next try
>>>>>>>using the scst_fileio module and will let you know the results.
>>>>>>>Here is the GDB output for this failure:
>>>>>>>
>>>>>>>Program received signal SIGILL, Illegal instruction.
>>>>>>>__scst_process_active_cmd (cmd=0xea340d14, context=<value optimized out>,
>>>>>>>    pflags=0xf1488fd4, left_locked=1)
>>>>>>>    at /root/mid-level/cvs_version/src/scst_targ.c:2494
>>>>>>>2494            BUG();
>>>>>>>(gdb) bt
>>>>>>>#0  __scst_process_active_cmd (cmd=0xea340d14, context=<value optimized out>,
>>>>>>>    pflags=0xf1488fd4, left_locked=1)
>>>>>>>    at /root/mid-level/cvs_version/src/scst_targ.c:2494
>>>>>>>#1  0xf8d88b8c in scst_do_job_active (active_cmd_list=0xf8dab8f0,
>>>>>>>    pflags=0xf1488fd4, context=268435459)
>>>>>>>    at /root/mid-level/cvs_version/src/scst_targ.c:54
>>>>>>>#2  0xf8d88f2b in scst_cmd_thread (arg=<value optimized out>)
>>>>>>>    at /root/mid-level/cvs_version/src/scst_targ.c:2662
>>>>>>>#3  0xc01024d9 in kernel_thread_helper () at arch/i386/kernel/process.c:298
>>>>>>>
>>>>>>>(gdb) print *cmd
>>>>>>>$2 = {cmd_list_entry = {next = 0xf8dab900, prev = 0xf8dab900},
>>>>>>>  sess = 0xe92800a8, state = 9, sent_to_midlev = 0, ua_ignore = 0,
>>>>>>>  atomic = 0, non_atomic_only = 1, internal = 0, retry = 0, blocking = 0,
>>>>>>>  data_buf_alloced = 0, expected_values_set = 1, processible_env = 1,
>>>>>>>  cmd_flags = 0, tgtt = 0xf8c39f40, dev = 0xf5b1a340, lun = 0,
>>>>>>>  tgt_dev = 0xd1b2309c, scsi_req = 0x0, sn = 20668,
>>>>>>>  search_cmd_list_entry = {next = 0xe92800d0, prev = 0xe92800d0},
>>>>>>>  cdb = "(\000\003\uffff\000?\000\000\b\000\000\000\000\000\000",
>>>>>>>  cdb_len = 10,
>>>>>>>  queue_type = SCST_CMD_QUEUE_UNTAGGED, timeout = 15000, retries = 0,
>>>>>>>  data_direction = DMA_FROM_DEVICE,
>>>>>>>  expected_data_direction = DMA_FROM_DEVICE, expected_transfer_len = 4096,
>>>>>>>  data_len = 4096, scst_cmd_done = 0xf8d83080 <scst_cmd_done_local>,
>>>>>>>  sgv = 0xecc606c0, bufflen = 4096, buffer = 0xecc606d4, use_sg = 1,
>>>>>>>  get_sg_buf_entry_num = 0, status = 0 '\0', masked_status = 0 '\0',
>>>>>>>  msg_status = 0 '\0', host_status = 0, driver_status = 0,
>>>>>>>  sense_buffer = '\0' <repeats 95 times>, tgt_dev_saved = 0x0,
>>>>>>>  tgt_resp_flags = 2, resp_data_len = 4096, mgmt_cmd = 0x0,
>>>>>>>  extra_cmd_list_entry = {next = 0x100100, prev = 0x200200}, cmd_saved = 0x0,
>>>>>>>  tag = 18824, tgt_specific = 0xe32b09bc, tgt_dev_specific = 0x0}
>>>>>>>(gdb)
>>>>>>>
>>>>>>>
>>>>>>>...Brad
>>>>>>>
>>>>>>>
>>>>>>>On Thu, 2006-05-18 at 18:38 -0400, Ming Zhang wrote:
>>>>>>>
>>>>>>>>1) can u try latest code from scst cvs
>>>>>>>>
>>>>>>>>2) can u try to export it via scst_fileio module? see if oops in same
>>>>>>>>place.
>>>>>>>>
>>>>>>>>ming
|
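The workaround discussed in the message above (lowering max_sectors_kb on the initiator until the target can allocate its buffers) can be sketched roughly as follows. This is only an illustration, not part of SCST: the function name cap_max_sectors_kb and the SYSFS_ROOT override are made up here, and the sketch assumes the kernel refuses values above max_hw_sectors_kb, so it clamps to that first.

```shell
#!/bin/sh
# Hypothetical helper (not part of SCST): lower a block device's
# max_sectors_kb to LIMIT_KB, never going above max_hw_sectors_kb.
# SYSFS_ROOT is an assumption of this sketch; it only exists so the
# function can be tried against a fake directory tree.
cap_max_sectors_kb() {
    dev=$1
    limit_kb=$2
    queue="${SYSFS_ROOT:-/sys}/block/$dev/queue"

    cur=$(cat "$queue/max_sectors_kb") || return 1
    hw=$(cat "$queue/max_hw_sectors_kb") || return 1

    # Clamp the requested limit to the hardware limit.
    if [ "$limit_kb" -gt "$hw" ]; then
        limit_kb=$hw
    fi

    # Only ever lower the value; a smaller current setting is left alone.
    if [ "$cur" -gt "$limit_kb" ]; then
        echo "$limit_kb" > "$queue/max_sectors_kb" || return 1
    fi

    # Report the value now in effect.
    cat "$queue/max_sectors_kb"
}

# Example, as root on the initiator ("sda" is the disk named in the
# initiator log quoted above; 128 is the value Brad reported as working):
#   cap_max_sectors_kb sda 128
```

The setting is not persistent, so it would have to be reapplied after a reboot or after the device is rescanned.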
From: Brad J. <bjo...@pr...> - 2006-05-19 16:25:57
|
I tried it again, this time using the fileio handler (I exported an
entire device /dev/sda1).
This time I did not have any problems. From the initiator system I ran
an entire bonnie disk I/O test run without failure.

...Brad


On Thu, 2006-05-18 at 18:38 -0400, Ming Zhang wrote:
> 1) can u try latest code from scst cvs
>
> 2) can u try to export it via scst_fileio module? see if oops in same
> place.
>
> ming
|
From: Ming Z. <min...@gm...> - 2006-05-19 16:29:36
|
thx for the info. i never run diskio so i am not aware of its stability.
maybe Vlad has better idea on this.

Ming


On Fri, 2006-05-19 at 11:25 -0500, Brad Johnson wrote:
> I tried it again, this time using the fileio handler (I exported an
> entire device /dev/sda1).
> This time I did not have any problems. From the initiator system I ran
> an entire bonnie disk I/O test run without failure.
>
> ...Brad
>
>
> On Thu, 2006-05-18 at 18:38 -0400, Ming Zhang wrote:
> > 1) can u try latest code from scst cvs
> >
> > 2) can u try to export it via scst_fileio module? see if oops in same
> > place.
> >
> > ming
|
From: Brad J. <bjo...@pr...> - 2006-05-19 16:36:49
|
What are the advantages/disadvantages of using fileio instead of diskio?
Also, how does the raid device handler work?

...Brad


On Fri, 2006-05-19 at 12:29 -0400, Ming Zhang wrote:
> thx for the info. i never run diskio so i am not aware of its stability.
> maybe Vlad has better idea on this.
>
> Ming
>
>
> On Fri, 2006-05-19 at 11:25 -0500, Brad Johnson wrote:
> > I tried it again, this time using the fileio handler (I exported an
> > entire device /dev/sda1).
> > This time I did not have any problems. From the initiator system I ran
> > an entire bonnie disk I/O test run without failure.
> >
> > ...Brad
> >
> >
> > On Thu, 2006-05-18 at 18:38 -0400, Ming Zhang wrote:
> > > 1) can u try latest code from scst cvs
> > >
> > > 2) can u try to export it via scst_fileio module? see if oops in same
> > > place.
> > >
> > > ming
|
From: Ming Z. <min...@gm...> - 2006-05-19 16:45:30
|
my understanding, not 100% right.

fileio is more flexible, it can export any regular file and block device
as a scsi disk. so it can be very useful for stuff like lvm, evms, ...

fileio can introduce file system cache and lead to some performance gain
and data inconsistency point;

diskio is like a bridge, so less overhead to export a scsi disk compared
with fileio. it should be useful as a FC-SCSI bridge.

for other device handlers, frankly i do not know. since i do not have
such devices at all.

i only know a little about scst_fileio.

Ming


On Fri, 2006-05-19 at 11:36 -0500, Brad Johnson wrote:
> What are the advantages/disadvantages of using fileio instead of diskio?
> Also, how does the raid device handler work?
>
> ...Brad
>
>
> On Fri, 2006-05-19 at 12:29 -0400, Ming Zhang wrote:
> > thx for the info. i never run diskio so i am not aware of its stability.
> > maybe Vlad has better idea on this.
> >
> > Ming
> >
> >
> > On Fri, 2006-05-19 at 11:25 -0500, Brad Johnson wrote:
> > > I tried it again, this time using the fileio handler (I exported an
> > > entire device /dev/sda1).
> > > This time I did not have any problems. From the initiator system I ran
> > > an entire bonnie disk I/O test run without failure.
> > >
> > > ...Brad
> > >
> > >
> > > On Thu, 2006-05-18 at 18:38 -0400, Ming Zhang wrote:
> > > > 1) can u try latest code from scst cvs
> > > >
> > > > 2) can u try to export it via scst_fileio module? see if oops in same
> > > > place.
> > > >
> > > > ming
|
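The "data inconsistency point" mentioned above comes from fileio going through the page cache: data can be acknowledged while it still sits in memory on the target. One rough way to watch that window is to look at the Dirty and Writeback counters in /proc/meminfo on the target. The sketch below is only an illustration; the function name dirty_kb is made up, and the counters cover the whole system, not just the exported device.

```shell
#!/bin/sh
# Hypothetical helper: print how many kB of page-cache data are still
# waiting to be written back, read from a /proc/meminfo-style file.
# Note: this is system-wide, not specific to the fileio-backed device.
dirty_kb() {
    meminfo=${1:-/proc/meminfo}
    awk '$1 == "Dirty:" || $1 == "Writeback:" { sum += $2 }
         END { print sum + 0 }' "$meminfo"
}

# Example on the target while the initiator is writing through fileio:
#   dirty_kb          # data that would be lost if the target crashed now
#   sync; dirty_kb    # expected to drop once the data has reached the disk
```

With diskio there is no such cache on the target, which is the trade-off described in the message above.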
From: Vladislav B. <vs...@vl...> - 2006-05-19 17:49:18
|
This is basically correct. I would only add that with diskio you save some CPU overhead (with fileio all the data gets copied in memory), but you lose caching and, perhaps more importantly, readahead.

Vlad

Ming Zhang wrote:
> my understanding, not 100% right.
>
> fileio is more flexible, it can export any regular file and block device
> as a scsi disk. so it can be very useful for stuff like lvm, evms, ...
>
> fileio can introduce file system cache and lead to some performance gain
> and data inconsistency point;
>
> diskio is like a bridge, so less overhead to export a scsi disk compared
> with fileio. it should be useful as a FC-SCSI bridge.
>
> for other device handlers, frankly i do not know. since i do not have
> such devices at all.
>
> i only know a little about scsi_fileio.
>
> Ming
>
> On Fri, 2006-05-19 at 11:36 -0500, Brad Johnson wrote:
>
>>What are the advantages/disadvantages of using fileio instead of diskio?
>>Also, how does the raid device handler work?
>>
>>...Brad
>>
>>On Fri, 2006-05-19 at 12:29 -0400, Ming Zhang wrote:
>>
>>>thx for the info. i never run diskio so i am not aware of its stability.
>>>maybe Vlad has better idea on this.
>>>
>>>Ming
>>>
>>>On Fri, 2006-05-19 at 11:25 -0500, Brad Johnson wrote:
>>>
>>>>I tried it again, this time using the fileio handler (I exported an
>>>>entire device /dev/sda1).
>>>>This time I did not have any problems. From the initiator system I ran
>>>>an entire bonnie disk I/O test run without failure.
>>>>
>>>>...Brad
>>>>
>>>>On Thu, 2006-05-18 at 18:38 -0400, Ming Zhang wrote:
>>>>
>>>>>1) can u try latest code from scst cvs
>>>>>
>>>>>2) can u try to export it via scst_fileio module? see if oops in same
>>>>>place.
>>>>>
>>>>>ming
|
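The readahead point above can be illustrated with a toy model. This is only a sketch under stated assumptions, not SCST code: `Backend`, `CachedReader` and `sequential_scan` are hypothetical names, and the "disk" is just a counter of how many separate requests reach it. It shows why a sequential scan through a cache with readahead (roughly the fileio case) can issue far fewer backend requests than one with no readahead.

```python
class Backend:
    """Simulated disk that counts how many separate requests reach it."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.requests = 0

    def read(self, start, count):
        # One call models one request sent to the device.
        self.requests += 1
        end = min(start + count, self.num_blocks)
        return list(range(start, end))


class CachedReader:
    """Block reader with a cache and a fixed readahead window (toy model)."""

    def __init__(self, backend, readahead):
        self.backend = backend
        self.readahead = readahead
        self.cache = set()

    def read_block(self, block):
        if block not in self.cache:
            # Cache miss: fetch this block plus `readahead` following blocks
            # in a single backend request.
            self.cache.update(self.backend.read(block, 1 + self.readahead))
        return block


def sequential_scan(readahead, num_blocks=64):
    """Read every block in order and return the number of backend requests."""
    backend = Backend(num_blocks)
    reader = CachedReader(backend, readahead)
    for block in range(num_blocks):
        reader.read_block(block)
    return backend.requests


if __name__ == "__main__":
    print("no readahead:", sequential_scan(0))
    print("readahead of 7 blocks:", sequential_scan(7))
```

The model deliberately ignores the memory-copy cost that fileio pays, so it shows only one side of the trade-off described in the message.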
From: Vladislav B. <vs...@vl...> - 2006-05-19 17:52:17
|
Brad Johnson wrote:
> What are the advantages/disadvantages of using fileio instead of diskio?
> Also, how does the raid device handler work?

The RAID handler is a pass-through driver for SCSI devices of RAID type (0xC).
|
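As a small illustration of the device type mentioned above: in standard SCSI INQUIRY data, the low five bits of byte 0 hold the peripheral device type, where 0x00 is a direct-access device (disk) and 0x0C is a storage array controller (RAID). The sketch below decodes that field; the function names and the short type table are illustrative assumptions, not part of SCST.

```python
# Subset of SCSI peripheral device type codes (low 5 bits of INQUIRY byte 0).
DEVICE_TYPES = {
    0x00: "direct-access (disk)",
    0x01: "sequential-access (tape)",
    0x05: "CD/DVD",
    0x0C: "storage array controller (RAID)",
}


def peripheral_device_type(inquiry_data: bytes) -> int:
    """Return the peripheral device type from standard INQUIRY data."""
    if not inquiry_data:
        raise ValueError("empty INQUIRY data")
    # Bits 7-5 of byte 0 are the peripheral qualifier; bits 4-0 are the type.
    return inquiry_data[0] & 0x1F


def describe(inquiry_data: bytes) -> str:
    """Map the device type code to a human-readable name."""
    code = peripheral_device_type(inquiry_data)
    return DEVICE_TYPES.get(code, f"other (0x{code:02X})")


if __name__ == "__main__":
    print(describe(bytes([0x00])))  # a device reporting itself as a disk
    print(describe(bytes([0x0C])))  # a device reporting itself as RAID
```

A handler that binds by device type would presumably only ever see devices whose INQUIRY data carries the matching code.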
From: Ming Z. <min...@gm...> - 2006-05-19 17:54:16
|
On Fri, 2006-05-19 at 21:51 +0400, Vladislav Bolkhovitin wrote:
> Brad Johnson wrote:
> > What are the advantages/disadvantages of using fileio instead of diskio?
> > Also, how does the raid device handler work?
>
> The RAID handler is a pass-through driver for SCSI devices of RAID type
> (0xC).
>

no offense intended, but have you ever tested with such a device? most HW raid i have seen only exports TYPE_DISK.
|
From: Nathaniel C. <uto...@mi...> - 2006-05-19 16:50:15
|
diskio/raidio are pass-through drivers. fileio goes through the block layer, which has the benefit of caching I/Os.

The benefit of pass-through is that all SCSI commands are passed directly to the device (including things like INQUIRY; with fileio, the inquiry data the initiator sees is totally unrelated to the disk's own inquiry data).

- Nate

On Friday 19 May 2006 12:36, Brad Johnson wrote:
> What are the advantages/disadvantages of using fileio instead of diskio?
> Also, how does the raid device handler work?
>
> ...Brad
|
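The INQUIRY difference described above can be sketched with two toy handlers. This is a hypothetical model, not SCST's implementation: the class names and the vendor and product strings are made up for illustration. Only the INQUIRY opcode (0x12) and the layout of the standard 36-byte response (vendor ID in bytes 8-15, product ID in bytes 16-31) come from the SCSI standard.

```python
INQUIRY = 0x12  # SCSI INQUIRY opcode


def make_inquiry_data(vendor: str, product: str) -> bytes:
    """Build a minimal 36-byte standard INQUIRY response for a disk."""
    data = bytearray(36)
    data[0] = 0x00  # peripheral device type: direct-access
    data[4] = 31    # additional length: bytes remaining after byte 4
    data[8:16] = vendor.ljust(8)[:8].encode("ascii")
    data[16:32] = product.ljust(16)[:16].encode("ascii")
    return bytes(data)


class BackendDisk:
    """Stand-in for the real disk behind the target."""

    def __init__(self, vendor, product):
        self._inquiry = make_inquiry_data(vendor, product)

    def execute(self, cdb: bytes) -> bytes:
        if cdb[0] == INQUIRY:
            return self._inquiry
        raise NotImplementedError("toy model handles INQUIRY only")


class PassThroughHandler:
    """Forwards every command to the backend device unchanged."""

    def __init__(self, disk):
        self.disk = disk

    def handle(self, cdb: bytes) -> bytes:
        return self.disk.execute(cdb)


class EmulatingHandler:
    """Answers INQUIRY itself, so the backend's identity is never exposed."""

    def __init__(self, vendor, product):
        self._inquiry = make_inquiry_data(vendor, product)

    def handle(self, cdb: bytes) -> bytes:
        if cdb[0] == INQUIRY:
            return self._inquiry
        raise NotImplementedError("toy model handles INQUIRY only")


def vendor_of(inquiry_data: bytes) -> str:
    """Extract the vendor identification field from INQUIRY data."""
    return inquiry_data[8:16].decode("ascii").rstrip()


if __name__ == "__main__":
    cdb = bytes([INQUIRY, 0, 0, 0, 36, 0])
    disk = BackendDisk("HITACHI", "EXAMPLE-DISK")
    print(vendor_of(PassThroughHandler(disk).handle(cdb)))
    print(vendor_of(EmulatingHandler("VIRTUAL", "FILE-BACKED").handle(cdb)))
```

With the pass-through handler the initiator sees the backend's own vendor string; with the emulating handler it sees whatever the handler chose to report.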
From: Ming Z. <mi...@el...> - 2006-05-19 16:58:37
|
have you seen any device that reports itself as a raid device? i have very limited experience with this. i assumed most HW raid still reports disk-like devices to the host.

ming

On Fri, 2006-05-19 at 12:49 -0400, Nathaniel Clark wrote:
> diskio/raidio are pass-through drivers. fileio goes through the block layer,
> this has the benefit of caching I/Os.
>
> The benefit of the pass-through is that all SCSI commands are passed directly
> (including things like inquiry, which are totally unrelated to the disk's
> inquiry data for fileio).
>
> - Nate
|
From: Vladislav B. <vs...@vl...> - 2006-05-19 17:46:05
|
Brad,

Thanks for the report. Could you please try several times more to see if
it crashes in the same or different points?

Vlad

Brad Johnson wrote:
> I tried it with the latest code from CVS (as of 8:00 AM CDT 05/19/2006).
> It goes a while longer before crashing this time, and fails in a
> different place. This test again used scst_disk handler. I will next try
> using the scst_fileio module and will let you know the results.
> Here is the GDB output for this failure:
>
> Program received signal SIGILL, Illegal instruction.
> __scst_process_active_cmd (cmd=0xea340d14, context=<value optimized out>,
>     pflags=0xf1488fd4, left_locked=1)
>     at /root/mid-level/cvs_version/src/scst_targ.c:2494
> 2494            BUG();
> (gdb) bt
> #0  __scst_process_active_cmd (cmd=0xea340d14, context=<value optimized out>,
>     pflags=0xf1488fd4, left_locked=1)
>     at /root/mid-level/cvs_version/src/scst_targ.c:2494
> #1  0xf8d88b8c in scst_do_job_active (active_cmd_list=0xf8dab8f0,
>     pflags=0xf1488fd4, context=268435459)
>     at /root/mid-level/cvs_version/src/scst_targ.c:54
> #2  0xf8d88f2b in scst_cmd_thread (arg=<value optimized out>)
>     at /root/mid-level/cvs_version/src/scst_targ.c:2662
> #3  0xc01024d9 in kernel_thread_helper () at arch/i386/kernel/process.c:298
>
> (gdb) print *cmd
> $2 = {cmd_list_entry = {next = 0xf8dab900, prev = 0xf8dab900},
>   sess = 0xe92800a8, state = 9, sent_to_midlev = 0, ua_ignore = 0,
>   atomic = 0, non_atomic_only = 1, internal = 0, retry = 0, blocking = 0,
>   data_buf_alloced = 0, expected_values_set = 1, processible_env = 1,
>   cmd_flags = 0, tgtt = 0xf8c39f40, dev = 0xf5b1a340, lun = 0,
>   tgt_dev = 0xd1b2309c, scsi_req = 0x0, sn = 20668,
>   search_cmd_list_entry = {
>     next = 0xe92800d0, prev = 0xe92800d0},
>   cdb = "(\000\003\uffff\000?\000\000\b\000\000\000\000\000\000",
>   cdb_len = 10,
>   queue_type = SCST_CMD_QUEUE_UNTAGGED, timeout = 15000, retries = 0,
>   data_direction = DMA_FROM_DEVICE,
>   expected_data_direction = DMA_FROM_DEVICE, expected_transfer_len = 4096,
>   data_len = 4096, scst_cmd_done = 0xf8d83080 <scst_cmd_done_local>,
>   sgv = 0xecc606c0, bufflen = 4096, buffer = 0xecc606d4, use_sg = 1,
>   get_sg_buf_entry_num = 0, status = 0 '\0', masked_status = 0 '\0',
>   msg_status = 0 '\0', host_status = 0, driver_status = 0,
>   sense_buffer = '\0' <repeats 95 times>, tgt_dev_saved = 0x0,
>   tgt_resp_flags = 2, resp_data_len = 4096, mgmt_cmd = 0x0,
>   extra_cmd_list_entry = {next = 0x100100, prev = 0x200200}, cmd_saved = 0x0,
>   tag = 18824, tgt_specific = 0xe32b09bc, tgt_dev_specific = 0x0}
> (gdb)
>
> ...Brad
>
> On Thu, 2006-05-18 at 18:38 -0400, Ming Zhang wrote:
>
>>1) can u try latest code from scst cvs
>>
>>2) can u try to export it via scst_fileio module? see if oops in same
>>place.
>>
>>ming
|
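A note for readers of the `print *cmd` dump in the message above: the first CDB byte `(` is ASCII 0x28, the READ(10) opcode, and bytes 7-8 (`\000\b`) encode a transfer length of 8 blocks. With a 512-byte block size (an assumption, not stated in the thread) that equals the reported `data_len = 4096`. The Python sketch below shows that decoding. The helper name `decode_read10` is illustrative only, and the fourth CDB byte is a placeholder because it is garbled (`\uffff`) in the archived dump, so the LBA it yields is not meaningful.

```python
import struct


def decode_read10(cdb: bytes, block_size: int = 512) -> dict:
    """Decode a 10-byte SCSI READ(10) CDB (opcode 0x28).

    block_size is an assumption; the thread does not state the
    logical block size of the exported drive.
    """
    if len(cdb) != 10:
        raise ValueError("a READ(10) CDB is exactly 10 bytes long")
    # Big-endian layout: opcode, flags, 32-bit LBA, group number,
    # 16-bit transfer length (in blocks), control byte.
    opcode, flags, lba, group, nblocks, control = struct.unpack(">BBIBHB", cdb)
    if opcode != 0x28:
        raise ValueError("not a READ(10) CDB: opcode 0x%02x" % opcode)
    return {"lba": lba, "blocks": nblocks, "bytes": nblocks * block_size}


# First 10 bytes of the cdb field from the dump. 0xFF stands in for the
# byte that is garbled in the archive (placeholder, not real data).
cdb = bytes([0x28, 0x00, 0x03, 0xFF, 0x00, 0x3F, 0x00, 0x00, 0x08, 0x00])
info = decode_read10(cdb)
print(info["blocks"], info["bytes"])  # 8 blocks, 4096 bytes
```

So the command that hit `BUG()` appears to be an ordinary 4 KB read, which suggests looking at command state handling rather than at an unusual request.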
From: Brad J. <bjo...@pr...> - 2006-05-20 16:55:37
|
Vlad,

I tried it 3 times and every time it failed in the exact same place.

...Brad

On Fri, 2006-05-19 at 21:45 +0400, Vladislav Bolkhovitin wrote:
> Brad,
>
> Thanks for the report. Could you please try several times more to see if
> it crashes in the same or different points?
>
> Vlad
|
From: Vladislav B. <vs...@vl...> - 2006-05-19 18:06:34
|
Ming Zhang wrote:
> On Fri, 2006-05-19 at 21:51 +0400, Vladislav Bolkhovitin wrote:
>
>>Brad Johnson wrote:
>>
>>>What are the advantages/disadvantages of using fileio instead of diskio?
>>>Also, how does the raid device handler work?
>>
>>The RAID handler is pass-through driver for SCSI device with RAID type
>>(0xC).
>
> no intention for insulting, have u ever tested with such device? most HW
> raid i saw only export TYPE_DISK.

Personally I - have not. Friend of mine said he has.

>>>...Brad
>>>
>>>On Fri, 2006-05-19 at 12:29 -0400, Ming Zhang wrote:
>>>
>>>>thx for the info. i never run diskio so i am not aware of its stability.
>>>>maybe Vlad has better idea on this.
>>>>
>>>>Ming
>>>>
>>>>On Fri, 2006-05-19 at 11:25 -0500, Brad Johnson wrote:
>>>>
>>>>>I tried it again, this time using the fileio handler (I exported an
>>>>>entire device /dev/sda1).
>>>>>This time I did not have any problems. From the initiator system I ran
>>>>>an entire bonnie disk I/O test run without failure.
>>>>>
>>>>>...Brad
|
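For context on the type codes in this exchange: a SCSI device reports its peripheral device type in the low five bits of byte 0 of its INQUIRY response. Type 0x00 is a direct-access (disk) device and 0x0C is a storage array controller (RAID), which correspond to the `TYPE_DISK` and `TYPE_RAID` constants in the Linux SCSI headers. The Python sketch below only illustrates reading that field. The function names are made up for this note, and it does not show how SCST itself matches a device to a handler.

```python
# Peripheral device type codes, as named in the Linux SCSI headers.
TYPE_DISK = 0x00  # direct-access block device
TYPE_RAID = 0x0C  # storage array controller device

TYPE_NAMES = {TYPE_DISK: "disk", TYPE_RAID: "raid"}


def peripheral_device_type(inquiry_byte0: int) -> int:
    """Return the device type from byte 0 of a standard INQUIRY response.

    Bits 7-5 hold the peripheral qualifier; bits 4-0 hold the type.
    """
    return inquiry_byte0 & 0x1F


def describe(inquiry_byte0: int) -> str:
    """Map the device type to a short name (illustrative helper)."""
    dev_type = peripheral_device_type(inquiry_byte0)
    return TYPE_NAMES.get(dev_type, "other (0x%02x)" % dev_type)


print(describe(0x00))  # a device reporting TYPE_DISK
print(describe(0x0C))  # a device reporting TYPE_RAID
```

A hardware RAID box that presents its volumes as type 0x00, as Ming describes, would be picked up as an ordinary disk rather than as a RAID-type device.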
From: Ming Z. <min...@gm...> - 2006-05-19 18:08:10
|
On Fri, 2006-05-19 at 22:05 +0400, Vladislav Bolkhovitin wrote:
> Ming Zhang wrote:
> > On Fri, 2006-05-19 at 21:51 +0400, Vladislav Bolkhovitin wrote:
> >
> >>Brad Johnson wrote:
> >>
> >>>What are the advantages/disadvantages of using fileio instead of diskio?
> >>>Also, how does the raid device handler work?
> >>
> >>The RAID handler is pass-through driver for SCSI device with RAID type
> >>(0xC).
> >
> > no intention for insulting, have u ever tested with such device? most HW
> > raid i saw only export TYPE_DISK.
>
> Personally I - have not. Friend of mine said he has.

ic. interesting. maybe one day i will see one.
|