From: Eirik B. <ebakke@Princeton.EDU> - 2007-12-30 19:02:36
Hi, FUSE developers.

I have been trying to benchmark FUSE, and noticed that the performance penalty imposed by FUSE when used to implement a trivial filter file system (example_fh) on top of a native ext3 disk file system is highest for small, unoptimized operations (no opportunity for caching, readahead, etc.) like "create". As far as I understand, this particular scenario incurs 3 extra system calls per create:

0) The benchmark (or other program) issues a "create" system call, as usual.
1) fuselib issues a "read" system call to get the FUSE "create" request.
2) The filter file system issues a "create" system call to the underlying (ext3) file system (other types of file systems may issue other system calls here, e.g. writing to and reading from a socket).
3) fuselib issues a "writev" system call to return the result of the "create" operation.

As I assumed that these extra user-kernel boundary crossings would be the cause of most of FUSE's additional latency, I thought that it should be possible to improve the performance of small operations like "create" by combining calls (1) and (3) into a single system call. In other words, whenever fuselib writes the result of a request (normally with "writev"), the FUSE kernel driver waits until it can return the next request from that same call. That eliminates a "read" for every "writev".

I successfully implemented this by means of a special new ioctl call on /dev/fuse (patch included), but haven't been able to show any performance improvement when using lmbench to measure empty file creation latency (lmbench-crea_0 in the charts).

Any ideas as to why this could be? Are the extra system calls themselves not a significant cause of latency after all? If so, where does the time actually go?

Note: the attached patch only works in single-threaded FUSE mode (example_fh /mnt/fuse -s). There are some deadlock issues with sleeping while sending a FUSE reply. I tested it with FUSE 2.7.2 on vanilla Linux 2.6.23.12 on a 32-bit machine. (It works on a 64-bit FC5 machine, too.) I have confirmed that fuselib issues only half of the number of system calls when run with the patch enabled.

Thanks for your thoughts,
Eirik Bakke
(Doing an undergrad project with Prof. Kai Li at Princeton University, NJ, USA.)
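
For reference, here is a rough counting sketch of the call sequence described above, written in Python for brevity. The function names and the "reply_and_read" label are made up for illustration and are not taken from the patch or from the FUSE library. It assumes the patched loop does one initial "read" and then one combined call per request. It only counts calls and says nothing about where the time goes.

```python
# Hypothetical counting model of the FUSE daemon's request loop.
# None of these names exist in FUSE or in the attached patch.

def stock_loop(n_requests):
    """Calls issued by the daemon in the unpatched loop."""
    calls = []
    for _ in range(n_requests):
        calls.append("read")      # (1) fetch the FUSE "create" request
        calls.append("create")    # (2) forward to the underlying file system
        calls.append("writev")    # (3) return the result
    return calls


def combined_loop(n_requests):
    """Calls issued when the reply and the next read share one call."""
    calls = []
    if n_requests == 0:
        return calls
    calls.append("read")          # the first request still needs a plain read
    for _ in range(n_requests):
        calls.append("create")          # (2) unchanged
        calls.append("reply_and_read")  # (3)+(1): reply, then wait for the next request
    return calls


def dev_fuse_calls(calls):
    """Count only the calls made on /dev/fuse, ignoring the forwarded create."""
    return sum(1 for name in calls if name != "create")


if __name__ == "__main__":
    n = 1000
    stock = stock_loop(n)
    combined = combined_loop(n)
    print("calls on /dev/fuse, stock:   ", dev_fuse_calls(stock))
    print("calls on /dev/fuse, combined:", dev_fuse_calls(combined))
    # Daemon-side calls per create, including the forwarded create (2).
    print("daemon calls per create, stock:   ", len(stock) / n)
    print("daemon calls per create, combined:", len(combined) / n)
```

In this model the calls on /dev/fuse are roughly halved, which matches what I observed, while the forwarded "create" in (2) and the benchmark's own call in (0) stay the same.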