From: <fib...@gm...> - 2012-02-29 11:34:27
It seems from Bart’s answer that he’s not really testing a blockio device. In my setup, I have one fileio device, /dev/sdc, and one blockio device, /dev/sdd.

[root@sfstest1 ~]# /sfs/sg3_utils-1.26/src/sg_inq /dev/sdc
standard INQUIRY:
  PQual=0 Device_type=0 RMB=0 version=0x05 [SPC-3]
  [AERC=0] [TrmTsk=0] NormACA=0 HiSUP=0 Resp_data_format=2
  SCCS=0 ACC=0 TPGS=0 3PC=1 Protect=0 BQue=0
  EncServ=0 MultiP=1 (VS=0) [MChngr=0] [ACKREQQ=0] Addr16=0
  [RelAdr=0] WBus16=0 Sync=0 Linked=0 [TranDis=0] CmdQue=1
  [SPI: Clocking=0x0 QAS=0 IUS=0]
    length=62 (0x3e) Peripheral device type: disk
 Vendor identification: SCST_FIO
 Product identification: disk0
 Product revision level: 210
 Unit serial number: ab5f2545
[root@sfstest1 ~]# /sfs/sg3_utils-1.26/src/sg_sync -i /dev/sdc
[root@sfstest1 ~]# /sfs/sg3_utils-1.26/src/sg_inq /dev/sdd
standard INQUIRY:
  PQual=0 Device_type=0 RMB=0 version=0x05 [SPC-3]
  [AERC=0] [TrmTsk=0] NormACA=0 HiSUP=0 Resp_data_format=2
  SCCS=0 ACC=0 TPGS=0 3PC=1 Protect=0 BQue=0
  EncServ=0 MultiP=1 (VS=0) [MChngr=0] [ACKREQQ=0] Addr16=0
  [RelAdr=0] WBus16=0 Sync=0 Linked=0 [TranDis=0] CmdQue=1
  [SPI: Clocking=0x0 QAS=0 IUS=0]
    length=62 (0x3e) Peripheral device type: disk
 Vendor identification: SCST_BIO
 Product identification: disk4
 Product revision level: 210
 Unit serial number: e5d54499
[root@sfstest1 ~]# /sfs/sg3_utils-1.26/src/sg_sync -i /dev/sdd
synchronize cache(10): transport: Host_status=0x07 [DID_ERROR]
Driver_status=0x00 [DRIVER_OK, SUGGEST_OK]
Synchronize cache failed
[root@sfstest1 ~]#

Note the error from sg_sync -i on the blockio device, /dev/sdd. It fails because on the target, I got this:

<1>BUG: unable to handle kernel NULL pointer dereference at virtual address 0000001d
<1>printing eip: f8e75398 *pdpt = 0000000077d45001 *pde = 0000000000000000
<0>Oops: 0000 [#1] SMP
<4>Modules linked in: fcoe_vlun fcoe eth_vlun fcoe_offload libfcoe libfc ixgbe scst_vdisk libcrc32c ipv6 nfs lockd nfs_acl sunrpc af_packet floppy video output ide_cd loop synclinkmp synclink hdlc sx generic_serial mxser moxa isicom esp epca scsi_transport_fc vlun scst radeonfb igb fb_ddc i2c_algo_bit i2c_i801 i2c_core dca mdio shpchp pci_hotplug evdev ftdi_sio usbserial ehci_hcd uhci_hcd usbcore thermal processor fan container button battery ac ext3 jbd sd_mod mpt2sas scsi_transport_sas
<4>
<4>Pid: 9406, comm: disk40_0 Not tainted (2.6.24 #47)
<4>EIP: 0060:[<f8e75398>] EFLAGS: 00010282 CPU: 1
<4>EIP is at vdisk_fsync+0x108/0x1d0 [scst_vdisk]
<4>EAX: 00000080 EBX: f5506780 ECX: 00000001 EDX: 00000000
<4>ESI: 00000000 EDI: efcf83a8 EBP: f70c1e5c ESP: f70c1e38
<4> DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
<0>Process disk40_0 (pid: 9406, ti=f70c0000 task=f70c5aa0 task.ti=f70c0000)
<0>Stack: 80135ed4 f6c64880 f5506780 efcdedc4 00000000 00000000 f55af000 00000000
<0> 00000000 f70c1ec4 f8e7223e a1f16000 00000012 00000000 f55af000 00000000
<0> f555c000 f70c1ee4 f8f7ba52 a1f16000 00000012 00000000 efcf83a8 f5506780
<0>Call Trace:
<0> [<80104ffa>] show_trace_log_lvl+0x1a/0x30
<0> [<801050ca>] show_stack_log_lvl+0x9a/0xc0
<0> [<80105278>] show_registers+0xc8/0x1d0
<0> [<801054e0>] die+0x110/0x240
<0> [<8011f58c>] do_page_fault+0x27c/0x750
<0> [<803befba>] error_code+0x72/0x78
<0> [<f8e7223e>] vdisk_do_job+0xc9e/0xd10 [scst_vdisk]
<0> [<f8f58e75>] scst_do_real_exec+0x75/0x320 [scst]
<0> [<f8f595e3>] scst_exec+0xd3/0x2e0 [scst]
<0> [<f8f59910>] scst_send_for_exec+0x120/0x280 [scst]
<0> [<f8f5d4f0>] scst_process_active_cmd+0x550/0x820 [scst]
<0> [<f8f5d82c>] scst_do_job_active+0x6c/0x130 [scst]
<0> [<f8f5d99d>] scst_cmd_thread+0xad/0x210 [scst]
<0> [<801413ac>] kthread+0x5c/0xa0
<0> [<80104e6f>] kernel_thread_helper+0x7/0x18
<0> =======================
<0>Code: b4 d2 e7 f8 89 44 24 04 e8 76 7e 2b 87 89 f0 8b 5d f4 8b 75 f8 8b 7d fc 89 ec 5d c3 8d b4 26 00 00 00 00 8b 55 10 b9 01 00 00 00 <0f> b6 42 1d c0 e8 04 83 e0 01 83 f8 01 19 d2 8b 47 18 81 e2 c0
<0>EIP: [<f8e75398>] vdisk_fsync+0x108/0x1d0 [scst_vdisk] SS:ESP 0068:f70c1e38
[1]kdb>

The SYNCHRONIZE CACHE code does this:

	if (immed) {
		scst_cmd_get(cmd); /* to protect dev */
		cmd->completed = 1;
		cmd->scst_cmd_done(cmd, SCST_CMD_STATE_DEFAULT,
			SCST_CONTEXT_SAME);
		vdisk_fsync(thr, loff, data_len, NULL, dev);

Note that it’s calling vdisk_fsync() with a NULL cmd pointer:

	static int vdisk_fsync(struct scst_vdisk_thr *thr, loff_t loff,
		loff_t len, struct scst_cmd *cmd, struct scst_device *dev);

Then, vdisk_fsync() does this:

	if (virt_dev->blockio) {
		res = vdisk_blockio_flush(thr->bdev,
			(cmd->noio_mem_alloc ? GFP_NOIO : GFP_KERNEL), true);

And there's the issue: in the blockio path, cmd->noio_mem_alloc is read through the NULL cmd pointer, just as my stack trace above shows.

Also, I just looked at the code for sg_verify, and it always sets BYTCHK to 0 (meaning no data is transferred to the target, so there is nothing to compare against), which makes it useless as a test.

-T

On Mon, Feb 27, 2012 at 12:33 PM, Bart Van Assche <bva...@ac...> wrote:
> On Sun, Feb 26, 2012 at 8:03 PM, fib...@gm... <fib...@gm...> wrote:
>> We are seeing some issues with the way blockio operates. I tested on
>> v3979, but there appear to be no changes in scst_vdisk.c since 3979,
>> so the current 4140 appears affected as well.
>>
>> 1. FUA is ignored for WRITEs.
>> 2. If a SYNCHRONIZE CACHE command is received with IMMED=1, the system
>> will crash.
>> [ ... ]
>
>> 4. VERIFY can randomly fail, even when it should succeed.
>
> Regarding [1]: as far as I can see in scst_vdisk.c the FUA implementation is
> fine.
> Regarding [2]: sg_sync -i is working fine here.
> Regarding [4]: in all tests I ran so far sg_verify was working fine.
>
> Bart.
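
P.S. Regarding the SYNCHRONIZE CACHE crash described above: for illustration only, here is a user-space sketch of the kind of guard the blockio path would need before it reads cmd->noio_mem_alloc. The names sketch_cmd, sketch_gfp and sketch_flush_gfp() are made up for this example. This is not the SCST code and not a tested patch, and falling back to the NOIO-style flag when cmd is NULL is only my assumption of the conservative choice.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the kernel's GFP_KERNEL / GFP_NOIO; the values are arbitrary. */
enum sketch_gfp { SKETCH_GFP_KERNEL, SKETCH_GFP_NOIO };

/* Hypothetical, cut-down stand-in for struct scst_cmd. */
struct sketch_cmd {
	bool noio_mem_alloc;
};

/*
 * Pick the allocation flag for the flush without dereferencing a NULL cmd.
 * Assumption of this sketch: with no command to consult (the IMMED=1 case,
 * where the caller passes NULL), use the more restrictive NOIO flag.
 */
static enum sketch_gfp sketch_flush_gfp(const struct sketch_cmd *cmd)
{
	if (cmd == NULL)
		return SKETCH_GFP_NOIO;
	return cmd->noio_mem_alloc ? SKETCH_GFP_NOIO : SKETCH_GFP_KERNEL;
}
```

With a helper like this, the IMMED=1 caller could keep passing NULL and the flush would still get a valid flag. Whether the real fix belongs in the caller or in vdisk_fsync() is a separate question.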