From: Michael B. <mic...@go...> - 2011-10-10 15:40:17
Hi,

We are using Fuse to develop our XtreemFS client and it works very well :-)

I'm also using the interruption support of Fuse to allow interrupting the retry of blocked operations. To do so, I registered my signal handler on SIGUSR1, and it works as expected for all commands except read(): if a read() is interrupted, the signal handler does not get called, although the program performing the read() (for instance a "cat ~/mnt/_Readme.txt") is properly interrupted. As the read() is still running in the background and retrying, the release() on the file is also delayed until the read() finally returns.

I installed Fuse 2.8.5 with debug symbols enabled on my system and tried to narrow down the issue: find_interrupted() in fuse_lowlevel.c fails to find the active read() request. In particular, the check in line 1005 does not succeed:

    if (curr->unique == req->u.i.unique) {

The value of curr->unique is always off by one compared to the value in req->u.i.unique, for instance 293 != 294.

Are you aware of any other projects which use the Fuse interruption support? So far I could not find a popular example where I could verify my findings. Therefore, I also included instructions on how to reproduce the error for further debugging using our XtreemFS client at the end of this mail.

What about interruption of read-ahead in Fuse in general? Are interrupt requests sent to the Fuse library for every pending read()? From my experience so far, I guess that's not the case.

Thank you very much for your help.

Best regards,
Michael Berlin

## Instructions how to reproduce the non-working interruption support for read() requests using the XtreemFS fuse client:

# Compile the client with debug symbols (or get binary packages w/o debug symbols for your distribution from www.xtreemfs.org):
$ svn checkout http://xtreemfs.googlecode.com/svn/trunk/ xtreemfs-read-only
$ cd xtreemfs-read-only
$ make client_debug

# Mount the public demo server at ~/mnt
$ mkdir ~/mnt
$ ./bin/mount.xtreemfs demo.xtreemfs.org/demo ~/mnt -d DEBUG -f --max-read-tries=2

# Block the public OSD in your local firewall. In consequence, no read/write/truncate operations will succeed, while metadata operations (directory listing etc.) towards the MRC service are still possible. (Once you're done, unblock it again: sudo iptables -D OUTPUT -d demo.xtreemfs.org -p tcp --dport 32640 -j DROP)
$ sudo iptables -A OUTPUT -d demo.xtreemfs.org -p tcp --dport 32640 -j DROP

# Try to read a file. As the OSD is blocked, "cat" will hang now:
$ cat ~/mnt/_Readme.txt

# Now you'll see the open() and read() commands in the logging output:
...
[ D | ?:0 | 10/ 7 18:44:03.249 | 0x1d130a0 ] xtreemfs_fuse_open on path /_Readme.txt
...
[ D | ?:0 | 10/ 7 18:44:03.256 | 0x1d146c0 ] xtreemfs_fuse_read /_Readme.txt s:4096 o:0

# Press Ctrl-C to interrupt "cat" and the blocked "read" request.

# Afterwards, the close() on the file handle is executed as flush() and lock() get called:
[ D | ?:0 | 10/ 7 18:44:08.532 | 0x1d13630 ] xtreemfs_fuse_flush /_Readme.txt
[ D | ?:0 | 10/ 7 18:44:08.532 | 0x1d13630 ] xtreemfs_fuse_lock on path /_Readme.txt command: set lock type: unlock start: 0 length: 0 pid: 0

# However, the read() was not successfully interrupted and it takes approximately 120 more seconds (2 tries * 60 seconds connection timeout) until release() is called:
[ D | ?:0 | 10/ 7 18:46:10.931 | 0x23aad90 ] xtreemfs_fuse_read finished /_Readme.txt s:4096 o:0 r:-5
[ D | ?:0 | 10/ 7 18:46:10.931 | 0x23aad90 ] xtreemfs_fuse_release /_Readme.txt

## For comparison, here is the log output of a successfully interrupted write():
[ D | ?:0 | 10/10 17:23:26.625 | 0x1423ba0 ] xtreemfs_fuse_write /test.txt size: 5
...
[ D | ?:0 | 10/10 17:23:41.884 | 0x1423ba0 ] INTERRUPT triggered, setting TLS pointer
...
[ E | ?:0 | 10/10 17:24:26.641 | 0x1423ba0 ] Got no response from server demo.xtreemfs.org-OSD (demo.xtreemfs.org:32640), retrying (infinite attempts left, waiting at least 15 seconds between two attempts).
[ D | ?:0 | 10/10 17:24:26.641 | 0x1423ba0 ] Caught interrupt, aborting sync request.
[ D | ?:0 | 10/10 17:24:26.641 | 0x1422ff0 ] xtreemfs_fuse_flush /test.txt
[ D | ?:0 | 10/10 17:24:26.641 | 0x1422ff0 ] xtreemfs_fuse_lock on path /test.txt command: set lock type: unlock start: 0 length: 0 pid: 0
...
[ D | ?:0 | 10/10 17:24:26.642 | 0x1422380 ] xtreemfs_fuse_release /test.txt

$ echo test >> ~/mnt/test.txt
^C^C^C^C^C^C^C^C^C^C^C^C^C^C^Cbash: echo: write error: Interrupted system call