From: Brad J. <bjo...@pr...> - 2006-05-24 16:26:48
|
After setting max_sectors_kb = 1024 I got the same results (unable to allocate size 1 MB). Also the same results at 512. I had to lower it to 128 to get it to work. Isn't this a parameter that the target and initiator should negotiate? I know with iSCSI the target negotiates the maximum write buffer size it can handle in a single command. ...Brad On Wed, 2006-05-24 at 15:03 +0400, Vladislav Bolkhovitin wrote: > Those messages on the initiator are expected. It issues them when it > can't perform commands constantly receiving BUSY for them from the > target, which can't allocate necessary memory in limited SG entries > count for such huge data sizes. > > Brad, how were you managed the initiator to send such commands? I have > seen such big data sizes only when sent them manually via sg for > testing. Did you manually edit > /sys/block/YOUR_DEVICE/queue/max_sectors_kb? What are the current values > there and in /sys/block/YOUR_DEVICE/queue/max_hw_sectors_kb? You should > set max_sectors_kb in something more sane like 1024. > > Vlad > > Brad Johnson wrote: > > The initiator system log is full of these messages: > > > > sd 1:0:0:0: timing out command, waited 150s > > sd 1:0:0:0: SCSI error: return code = 0x28 > > end_request: I/O error, dev sda, sector 22334567 > > sd 1:0:0:0: timing out command, waited 150s > > sd 1:0:0:0: SCSI error: return code = 0x8 > > end_request: I/O error, dev sda, sector 22382295 > > > > Here is the sgv file: > > > > Name Hit Total > > sgv 0 0 > > sgv-4K 0 0 > > sgv-8K 0 0 > > sgv-16K 0 0 > > sgv-32K 0 0 > > sgv-64K 0 0Name Hit Total > > sgv 0 0 > > sgv-4K 0 0 > > sgv-8K 0 0 > > sgv-16K 0 0 > > sgv-32K 0 0 > > sgv-64K 0 0 > > sgv-128K 0 0 > > sgv-256K 0 0 > > sgv-512K 0 0 > > sgv-1024K 0 0 > > sgv-2048K 0 0 > > > > sgv-clust 2381747 2383002 > > sgv-clust-4K 974 1040 > > sgv-clust-8K 459 512 > > sgv-clust-16K 510 543 > > sgv-clust-32K 746 787 > > sgv-clust-64K 8541 8629 > > sgv-clust-128K 22 32 > > sgv-clust-256K 61 76 > > sgv-clust-512K 35599 35714 > > sgv-clust-1024K 341439 341721 > > sgv-clust-2048K 1993396 1993948 > > > > sgv-dma 0 0 > > sgv-dma-4K 0 0 > > sgv-dma-8K 0 0 > > sgv-dma-16K 0 0 > > sgv-dma-32K 0 0 > > sgv-dma-64K 0 0 > > sgv-dma-128K 0 0 > > sgv-dma-256K 0 0 > > sgv-dma-512K 0 0 > > sgv-dma-1024K 0 0 > > sgv-dma-2048K 0 0 > > > > big 7052121 > > > > sgv-128K 0 0 > > sgv-256K 0 0 > > sgv-512K 0 0 > > sgv-1024K 0 0 > > sgv-2048K 0 0 > > > > sgv-clust 2381747 2383002 > > sgv-clust-4K 974 1040 > > sgv-clust-8K 459 512 > > sgv-clust-16K 510 543 > > sgv-clust-32K 746 787 > > sgv-clust-64K 8541 8629 > > sgv-clust-128K 22 32 > > sgv-clust-256K 61 76 > > sgv-clust-512K 35599 35714 > > sgv-clust-1024K 341439 341721 > > sgv-clust-2048K 1993396 1993948 > > > > sgv-dma 0 0 > > sgv-dma-4K 0 0 > > sgv-dma-8K 0 0 > > sgv-dma-16K 0 0 > > sgv-dma-32K 0 0 > > sgv-dma-64K 0 0 > > sgv-dma-128K 0 0 > > sgv-dma-256K 0 0 > > sgv-dma-512K 0 0 > > sgv-dma-1024K 0 0 > > sgv-dma-2048K 0 0 > > > > big 7052121 > > > > > > ...Brad > > > > > > On Tue, 2006-05-23 at 11:41 +0400, Vladislav Bolkhovitin wrote: > > > >>OK, thanks. > >> > >>Relating the messages. What is your initiator? Does it have something in > >>its logs? 3 Mb data size in commands is a bit too much, Qlogic cards > >>usually have troubles to handle it, so you see those messages. Could you > >>send me the content of /proc/scsi_tgt/sgv file, please? > >> > >>Vlad > >> > >>Brad Johnson wrote: > >> > >>>Vlad, > >>> > >>>The patched version has not yet crashed after running bonnie for an > >>>hour. But now it doesn't seem to be progressing, with the following > >>>messages appearing continuously in the system log: > >>> > >>>May 22 16:25:15 localhost kernel: [4614]: scst_set_busy:Sending BUSY > >>>status to initiator 21:01:00:e0:8b:a6:54:d1 (cmds count 1, queue_type 0, > >>>sess->init_phase 3), probably the system is overloaded > >>>May 22 16:25:15 localhost kernel: [4615]: scst_prepare_space:Unable to > >>>allocate or build requested buffer (size 3182592), sending BUSY status > >>> > >>> > >>>The requested buffer size varies slightly, but is always in the 3 > >>>million range. > >>> > >>>...Brad > >>> > >>> > >>>On Mon, 2006-05-22 at 16:08 +0400, Vladislav Bolkhovitin wrote: > >>> > >>> > >>>>Thanks. Could you try with the attached patch, please? (On top of the > >>>>latest CVS) > >>>> > >>>>Brad Johnson wrote: > >>>> > >>>> > >>>>>I tried it with the latest code from CVS (as of 8:00 AM CDT 05/19/2006). > >>>>>It goes a while longer before crashing this time, and fails in a > >>>>>different place. This test again used scst_disk handler. I will next try > >>>>>using the scst_fileio module and will let you know the results. > >>>>>Here is the GDB output for this failure: > >>>>> > >>>>>Program received signal SIGILL, Illegal instruction. > >>>>>__scst_process_active_cmd (cmd=0xea340d14, context=<value optimized > >>>>>out>, > >>>>> pflags=0xf1488fd4, left_locked=1) > >>>>> at /root/mid-level/cvs_version/src/scst_targ.c:2494 > >>>>>2494 BUG(); > >>>>>(gdb) bt > >>>>>#0 __scst_process_active_cmd (cmd=0xea340d14, context=<value optimized > >>>>>out>, > >>>>> pflags=0xf1488fd4, left_locked=1) > >>>>> at /root/mid-level/cvs_version/src/scst_targ.c:2494 > >>>>>#1 0xf8d88b8c in scst_do_job_active (active_cmd_list=0xf8dab8f0, > >>>>> pflags=0xf1488fd4, context=268435459) > >>>>> at /root/mid-level/cvs_version/src/scst_targ.c:54 > >>>>>#2 0xf8d88f2b in scst_cmd_thread (arg=<value optimized out>) > >>>>> at /root/mid-level/cvs_version/src/scst_targ.c:2662 > >>>>>#3 0xc01024d9 in kernel_thread_helper () at > >>>>>arch/i386/kernel/process.c:298 > >>>>> > >>>>>(gdb) print *cmd > >>>>>$2 = {cmd_list_entry = {next = 0xf8dab900, prev = 0xf8dab900}, > >>>>> sess = 0xe92800a8, state = 9, sent_to_midlev = 0, ua_ignore = 0, > >>>>> atomic = 0, non_atomic_only = 1, internal = 0, retry = 0, blocking = > >>>>>0, > >>>>> data_buf_alloced = 0, expected_values_set = 1, processible_env = 1, > >>>>> cmd_flags = 0, tgtt = 0xf8c39f40, dev = 0xf5b1a340, lun = 0, > >>>>> tgt_dev = 0xd1b2309c, scsi_req = 0x0, sn = 20668, > >>>>>search_cmd_list_entry = { > >>>>> next = 0xe92800d0, prev = 0xe92800d0}, > >>>>> cdb = "(\000\003\uffff\000?\000\000\b\000\000\000\000\000\000", > >>>>>cdb_len = 10, > >>>>> queue_type = SCST_CMD_QUEUE_UNTAGGED, timeout = 15000, retries = 0, > >>>>> data_direction = DMA_FROM_DEVICE, > >>>>> expected_data_direction = DMA_FROM_DEVICE, expected_transfer_len = > >>>>>4096, > >>>>> data_len = 4096, scst_cmd_done = 0xf8d83080 <scst_cmd_done_local>, > >>>>> sgv = 0xecc606c0, bufflen = 4096, buffer = 0xecc606d4, use_sg = 1, > >>>>> get_sg_buf_entry_num = 0, status = 0 '\0', masked_status = 0 '\0', > >>>>> msg_status = 0 '\0', host_status = 0, driver_status = 0, > >>>>> sense_buffer = '\0' <repeats 95 times>, tgt_dev_saved = 0x0, > >>>>> tgt_resp_flags = 2, resp_data_len = 4096, mgmt_cmd = 0x0, > >>>>> extra_cmd_list_entry = {next = 0x100100, prev = 0x200200}, cmd_saved = > >>>>>0x0, > >>>>> tag = 18824, tgt_specific = 0xe32b09bc, tgt_dev_specific = 0x0} > >>>>>(gdb) > >>>>> > >>>>> > >>>>>...Brad > >>>>> > >>>>> > >>>>>On Thu, 2006-05-18 at 18:38 -0400, Ming Zhang wrote: > >>>>> > >>>>> > >>>>> > >>>>>>1) can u try latest code from scst cvs > >>>>>> > >>>>>>2) can u try to export it via scst_fileio module? see if oops in same > >>>>>>place. > >>>>>> > >>>>>>ming > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>>On Thu, 2006-05-18 at 17:20 -0500, Brad Johnson wrote: > >>>>>> > >>>>>> > >>>>>> > >>>>>>>The system running scst crashes when doing I/O to target from remote > >>>>>>>system. > >>>>>>> > >>>>>>>Here is my setup: > >>>>>>>My target system has 2 Intel Xeon processors (3.2 MHz) and 1 GB RAM. > >>>>>>>It is running Linux 2.6.15.7. > >>>>>>>It has scst-0.9.4 and qla2x00-target-26-0.9.3.8 installed. > >>>>>>>It has a Qlogic 2312 HBA connected to a switch. This is my FC target > >>>>>>>host. (My FC Initiator is another x86 system with a Qlogic HBA also > >>>>>>>connected to the switch.) > >>>>>>>For back-end devices it has an LSI FC949X HBA connected to a Hitachi > >>>>>>>Fibre-channel drive. > >>>>>>> > >>>>>>>Here is my start script: > >>>>>>>-------------------------------------------------------- > >>>>>>>modprobe -v qla2x00tgt > >>>>>>>modprobe -v scst_disk > >>>>>>>echo "add 2:0:3:0 0" >/proc/scsi_tgt/groups/Default/devices > >>>>>>>echo "1" >/sys/class/scsi_host/host5/target_mode_enabled > >>>>>>>-------------------------------------------------------- > >>>>>>> > >>>>>>>In the script, 2:0:3:0 refers to my Hitachi drive, host5 refers to my > >>>>>>>Qlogic target-mode port. Everything starts successfully (including > >>>>>>>scsi_tgt module since it is a dependency of scst_disk). > >>>>>>> > >>>>>>>>From my initiator system I see the one drive I have exposed. I > >>>>>>>successfully partition that drive and do mkfs. At this point everything > >>>>>>>is still fine. I then mount the file system and copy a big file to it. > >>>>>>>The copy seems to work fine but at some point shortly after that my > >>>>>>>target system crashes. There is no oops output to the system log. So I > >>>>>>>did it again with a remote kgdb attached. Here is the gdb output: > >>>>>>> > >>>>>>> > >>>>>>>Program received signal SIGILL, Illegal instruction. > >>>>>>>__free_pages (page=0xc190a22c, order=0) at mm/page_alloc.c:1055 > >>>>>>>1055 if (put_page_testzero(page)) { > >>>>>>>(gdb) bt > >>>>>>>#0 __free_pages (page=0xc190a22c, order=0) at mm/page_alloc.c:1055 > >>>>>>>#1 0xf8d61efd in scst_release_space (cmd=0xf40a4e58) > >>>>>>> at /root/mid-level/scst-0.9.4/src/scst_lib.c:1430 > >>>>>>>#2 0xf8d60b2a in scst_free_cmd (cmd=0xf40a4e58, check_retry=1) > >>>>>>> at /root/mid-level/scst-0.9.4/src/scst_lib.c:956 > >>>>>>>#3 0xf8d599ee in scst_finish_cmd (cmd=0xf40a4e58) > >>>>>>> at /root/mid-level/scst-0.9.4/src/scst_targ.c:2212 > >>>>>>>#4 0xf8d5a7df in __scst_process_active_cmd (cmd=0xf40a4e58, > >>>>>>> context=<value optimized out>, pflags=0xc046cfb8, > >>>>>>> left_locked=<value optimized out>) > >>>>>>> at /root/mid-level/scst-0.9.4/src/scst_targ.c:2461 > >>>>>>>#5 0xf8d5aa81 in scst_do_job_active (active_cmd_list=0xf8d756d0, > >>>>>>> pflags=0xc046cfb8, context=268435457) > >>>>>>> at /root/mid-level/scst-0.9.4/src/scst_targ.c:54 > >>>>>>>#6 0xf8d5af99 in scst_cmd_tasklet (p=<value optimized out>) > >>>>>>> at /root/mid-level/scst-0.9.4/src/scst_targ.c:2672 > >>>>>>>#7 0xc012d905 in tasklet_action (a=<value optimized out>) > >>>>>>> at kernel/softirq.c:267 > >>>>>>>#8 0xc012d552 in __do_softirq () at kernel/softirq.c:95 > >>>>>>>#9 0xc010619e in do_softirq () at arch/i386/kernel/irq.c:187 > >>>>>>>#10 0xc012d689 in irq_exit () at kernel/softirq.c:169 > >>>>>>>#11 0xc010604e in do_IRQ (regs=0xc1cf4f48) at arch/i386/kernel/irq.c:110 > >>>>>>>#12 0xc010499e in common_interrupt () at thread_info.h:91 > >>>>>>>#13 0xc1cf4000 in ?? () > >>>>>>>#14 0x00000000 in ?? () > >>>>>>> > >>>>>>> > >>>>>>>I can reproduce this easily every time. Let me know if you want any > >>>>>>>further information about this. > >>>>>>> > >>>>>>>...Brad Johnson > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>>------------------------------------------------------- > >>>>>>>Using Tomcat but need to do more? Need to support web services, security? > >>>>>>>Get stuff done quickly with pre-integrated technology to make your job easier > >>>>>>>Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo > >>>>>>>http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 > >>>>>>>_______________________________________________ > >>>>>>>Scst-devel mailing list > >>>>>>>Scs...@li... > >>>>>>>https://lists.sourceforge.net/lists/listinfo/scst-devel > >>>>>> > >>>>>> > >>>>> > >>>>>------------------------------------------------------- > >>>>>Using Tomcat but need to do more? Need to support web services, security? > >>>>>Get stuff done quickly with pre-integrated technology to make your job easier > >>>>>Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo > >>>>>http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 > >>>>>_______________________________________________ > >>>>>Scst-devel mailing list > >>>>>Scs...@li... > >>>>>https://lists.sourceforge.net/lists/listinfo/scst-devel > >>>>> > >>>> > >>> > >>> > >>>------------------------------------------------------- > >>>Using Tomcat but need to do more? Need to support web services, security? > >>>Get stuff done quickly with pre-integrated technology to make your job easier > >>>Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo > >>>http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 > >>>_______________________________________________ > >>>Scst-devel mailing list > >>>Scs...@li... > >>>https://lists.sourceforge.net/lists/listinfo/scst-devel > >>> > >> > >> > > > > > > > > ------------------------------------------------------- > All the advantages of Linux Managed Hosting--Without the Cost and Risk! > Fully trained technicians. The highest number of Red Hat certifications in > the hosting industry. Fanatical Support. Click to learn more > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=107521&bid=248729&dat=121642 > _______________________________________________ > Scst-devel mailing list > Scs...@li... > https://lists.sourceforge.net/lists/listinfo/scst-devel > |