From: Brad J. <bjo...@pr...> - 2006-05-22 23:16:37
|
Vlad, The patched version has not yet crashed after running bonnie for an hour. But now it doesn't seem to be progressing, with the following messages appearing continuously in the system log: May 22 16:25:15 localhost kernel: [4614]: scst_set_busy:Sending BUSY status to initiator 21:01:00:e0:8b:a6:54:d1 (cmds count 1, queue_type 0, sess->init_phase 3), probably the system is overloaded May 22 16:25:15 localhost kernel: [4615]: scst_prepare_space:Unable to allocate or build requested buffer (size 3182592), sending BUSY status The requested buffer size varies slightly, but is always in the 3 million range. ...Brad On Mon, 2006-05-22 at 16:08 +0400, Vladislav Bolkhovitin wrote: > Thanks. Could you try with the attached patch, please? (On top of the > latest CVS) > > Brad Johnson wrote: > > I tried it with the latest code from CVS (as of 8:00 AM CDT 05/19/2006). > > It goes a while longer before crashing this time, and fails in a > > different place. This test again used scst_disk handler. I will next try > > using the scst_fileio module and will let you know the results. > > Here is the GDB output for this failure: > > > > Program received signal SIGILL, Illegal instruction. > > __scst_process_active_cmd (cmd=0xea340d14, context=<value optimized > > out>, > > pflags=0xf1488fd4, left_locked=1) > > at /root/mid-level/cvs_version/src/scst_targ.c:2494 > > 2494 BUG(); > > (gdb) bt > > #0 __scst_process_active_cmd (cmd=0xea340d14, context=<value optimized > > out>, > > pflags=0xf1488fd4, left_locked=1) > > at /root/mid-level/cvs_version/src/scst_targ.c:2494 > > #1 0xf8d88b8c in scst_do_job_active (active_cmd_list=0xf8dab8f0, > > pflags=0xf1488fd4, context=268435459) > > at /root/mid-level/cvs_version/src/scst_targ.c:54 > > #2 0xf8d88f2b in scst_cmd_thread (arg=<value optimized out>) > > at /root/mid-level/cvs_version/src/scst_targ.c:2662 > > #3 0xc01024d9 in kernel_thread_helper () at > > arch/i386/kernel/process.c:298 > > > > (gdb) print *cmd > > $2 = {cmd_list_entry = {next = 0xf8dab900, prev = 0xf8dab900}, > > sess = 0xe92800a8, state = 9, sent_to_midlev = 0, ua_ignore = 0, > > atomic = 0, non_atomic_only = 1, internal = 0, retry = 0, blocking = > > 0, > > data_buf_alloced = 0, expected_values_set = 1, processible_env = 1, > > cmd_flags = 0, tgtt = 0xf8c39f40, dev = 0xf5b1a340, lun = 0, > > tgt_dev = 0xd1b2309c, scsi_req = 0x0, sn = 20668, > > search_cmd_list_entry = { > > next = 0xe92800d0, prev = 0xe92800d0}, > > cdb = "(\000\003\uffff\000?\000\000\b\000\000\000\000\000\000", > > cdb_len = 10, > > queue_type = SCST_CMD_QUEUE_UNTAGGED, timeout = 15000, retries = 0, > > data_direction = DMA_FROM_DEVICE, > > expected_data_direction = DMA_FROM_DEVICE, expected_transfer_len = > > 4096, > > data_len = 4096, scst_cmd_done = 0xf8d83080 <scst_cmd_done_local>, > > sgv = 0xecc606c0, bufflen = 4096, buffer = 0xecc606d4, use_sg = 1, > > get_sg_buf_entry_num = 0, status = 0 '\0', masked_status = 0 '\0', > > msg_status = 0 '\0', host_status = 0, driver_status = 0, > > sense_buffer = '\0' <repeats 95 times>, tgt_dev_saved = 0x0, > > tgt_resp_flags = 2, resp_data_len = 4096, mgmt_cmd = 0x0, > > extra_cmd_list_entry = {next = 0x100100, prev = 0x200200}, cmd_saved = > > 0x0, > > tag = 18824, tgt_specific = 0xe32b09bc, tgt_dev_specific = 0x0} > > (gdb) > > > > > > ...Brad > > > > > > On Thu, 2006-05-18 at 18:38 -0400, Ming Zhang wrote: > > > >>1) can u try latest code from scst cvs > >> > >>2) can u try to export it via scst_fileio module? see if oops in same > >>place. > >> > >>ming > >> > >> > >> > >> > >>On Thu, 2006-05-18 at 17:20 -0500, Brad Johnson wrote: > >> > >>>The system running scst crashes when doing I/O to target from remote > >>>system. > >>> > >>>Here is my setup: > >>>My target system has 2 Intel Xeon processors (3.2 MHz) and 1 GB RAM. > >>>It is running Linux 2.6.15.7. > >>>It has scst-0.9.4 and qla2x00-target-26-0.9.3.8 installed. > >>>It has a Qlogic 2312 HBA connected to a switch. This is my FC target > >>>host. (My FC Initiator is another x86 system with a Qlogic HBA also > >>>connected to the switch.) > >>>For back-end devices it has an LSI FC949X HBA connected to a Hitachi > >>>Fibre-channel drive. > >>> > >>>Here is my start script: > >>>-------------------------------------------------------- > >>>modprobe -v qla2x00tgt > >>>modprobe -v scst_disk > >>>echo "add 2:0:3:0 0" >/proc/scsi_tgt/groups/Default/devices > >>>echo "1" >/sys/class/scsi_host/host5/target_mode_enabled > >>>-------------------------------------------------------- > >>> > >>>In the script, 2:0:3:0 refers to my Hitachi drive, host5 refers to my > >>>Qlogic target-mode port. Everything starts successfully (including > >>>scsi_tgt module since it is a dependency of scst_disk). > >>> > >>>>From my initiator system I see the one drive I have exposed. I > >>>successfully partition that drive and do mkfs. At this point everything > >>>is still fine. I then mount the file system and copy a big file to it. > >>>The copy seems to work fine but at some point shortly after that my > >>>target system crashes. There is no oops output to the system log. So I > >>>did it again with a remote kgdb attached. Here is the gdb output: > >>> > >>> > >>>Program received signal SIGILL, Illegal instruction. > >>>__free_pages (page=0xc190a22c, order=0) at mm/page_alloc.c:1055 > >>>1055 if (put_page_testzero(page)) { > >>>(gdb) bt > >>>#0 __free_pages (page=0xc190a22c, order=0) at mm/page_alloc.c:1055 > >>>#1 0xf8d61efd in scst_release_space (cmd=0xf40a4e58) > >>> at /root/mid-level/scst-0.9.4/src/scst_lib.c:1430 > >>>#2 0xf8d60b2a in scst_free_cmd (cmd=0xf40a4e58, check_retry=1) > >>> at /root/mid-level/scst-0.9.4/src/scst_lib.c:956 > >>>#3 0xf8d599ee in scst_finish_cmd (cmd=0xf40a4e58) > >>> at /root/mid-level/scst-0.9.4/src/scst_targ.c:2212 > >>>#4 0xf8d5a7df in __scst_process_active_cmd (cmd=0xf40a4e58, > >>> context=<value optimized out>, pflags=0xc046cfb8, > >>> left_locked=<value optimized out>) > >>> at /root/mid-level/scst-0.9.4/src/scst_targ.c:2461 > >>>#5 0xf8d5aa81 in scst_do_job_active (active_cmd_list=0xf8d756d0, > >>> pflags=0xc046cfb8, context=268435457) > >>> at /root/mid-level/scst-0.9.4/src/scst_targ.c:54 > >>>#6 0xf8d5af99 in scst_cmd_tasklet (p=<value optimized out>) > >>> at /root/mid-level/scst-0.9.4/src/scst_targ.c:2672 > >>>#7 0xc012d905 in tasklet_action (a=<value optimized out>) > >>> at kernel/softirq.c:267 > >>>#8 0xc012d552 in __do_softirq () at kernel/softirq.c:95 > >>>#9 0xc010619e in do_softirq () at arch/i386/kernel/irq.c:187 > >>>#10 0xc012d689 in irq_exit () at kernel/softirq.c:169 > >>>#11 0xc010604e in do_IRQ (regs=0xc1cf4f48) at arch/i386/kernel/irq.c:110 > >>>#12 0xc010499e in common_interrupt () at thread_info.h:91 > >>>#13 0xc1cf4000 in ?? () > >>>#14 0x00000000 in ?? () > >>> > >>> > >>>I can reproduce this easily every time. Let me know if you want any > >>>further information about this. > >>> > >>>...Brad Johnson > >>> > >>> > >>> > >>> > >>>------------------------------------------------------- > >>>Using Tomcat but need to do more? Need to support web services, security? > >>>Get stuff done quickly with pre-integrated technology to make your job easier > >>>Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo > >>>http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 > >>>_______________________________________________ > >>>Scst-devel mailing list > >>>Scs...@li... > >>>https://lists.sourceforge.net/lists/listinfo/scst-devel > >> > >> > > > > > > > > ------------------------------------------------------- > > Using Tomcat but need to do more? Need to support web services, security? > > Get stuff done quickly with pre-integrated technology to make your job easier > > Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo > > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 > > _______________________________________________ > > Scst-devel mailing list > > Scs...@li... > > https://lists.sourceforge.net/lists/listinfo/scst-devel > > > |