From: Greg B. <br...@ro...> - 2007-03-16 03:05:38
Barlow, Simon <simon.barlow@...> writes:
>
> We have developed a fuse file system for a SCSI device which works well
> apart from a performance issue which we think is down to fuse.
>
> The performance figures are showing 16MB/s and we would expect at least
> 30MB/s.
>
> The fuse filesystem is using direct_io therefore the maximum block size
> is 128K.
>
> From the clients we wish to read, for example, 1MB of data which we
> know gets segmented into 1MB/128KB read operations.
>
> Are there plans for fuse to allow unlimited read operations, i.e. the 128K
> limit no longer exists, and if so when will this be available?
>
> If not then can you suggest a way of improving the fuse read operations
> such that performance increases?

simon,

over the course of the last few days, i have gathered a bit more
understanding regarding the issue.

in direct_io mode, fuse has a maximum read/write buffer of 128K. the
fuse kernel code takes the buffer from the requesting application and
aligns it to a page boundary (in the case of i386, a page is 4k) before
passing the buffer up to your fuse user-level program.

in the case where the requesting application is issuing a 128K
read/write, if the buffer sent to the kernel is not on a 4k boundary,
then the buffer touches 33 pages rather than 32 and fuse will segment it
into two requests -- the first will be of size
(128K - (buffer_addr mod 4k)) and the second will be
(buffer_addr mod 4k).

if buffer_addr is on a 4k boundary, then only one I/O is issued.
otherwise, a large I/O followed by a relatively short I/O will be
issued. i believe this is the cause of your performance degradation.

there are two easy solutions:

1) if you control the application, make sure it issues read/write
   system calls with a buffer address that is on a 4k boundary.
2) tweak the fuse code

for #1, here's an example:

    #define BUFSIZE (128*1024)

    /* uintptr_t comes from <stdint.h>; it avoids truncating the
       pointer on 64-bit machines */
    buf = (char *)malloc(BUFSIZE + PAGE_SIZE);
    ptr = buf + (PAGE_SIZE - ((uintptr_t)buf % PAGE_SIZE));

    fd = open("/mnt/fa/a", O_RDONLY);
    for (i = 0 ; i < 10 ; ++i) {
        read(fd, ptr, BUFSIZE);
    }

allocate a buffer that is one page larger than you need, then adjust
the base pointer.

for #2, you need to edit the file fuse-2.6.3/kernel/fuse_i.h and change
the line:

    #define FUSE_MAX_PAGES_PER_REQ 32

to:

    #define FUSE_MAX_PAGES_PER_REQ 33

then rebuild the fuse driver. then, when you run your user-level fuse
program, make sure to pass the flags:

    -o max_read=131072 -o max_write=131072

in my case, it looks something like:

    ./fa-client -o direct_io -o max_read=131072 -o max_write=131072 /mnt/fa

i did option #2 because i need a general-purpose solution and now i'm
seeing 128K reads and writes for all applications that issue large
I/Os.

on a side note, 'dd' performs great without #2. this is because,
internally, it implements #1.

- gb
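
the request splitting described in this message can be checked with a small model of the arithmetic. this is only a sketch under stated assumptions, not the fuse kernel code: the function name split_request and its parameters are made up for illustration, and the model simply assumes that one request can carry at most max_pages pages of the caller's buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified model (an assumption, not the kernel's implementation):
 * a request that starts `off` bytes into a page can carry at most
 * max_pages * page_size - off bytes of the caller's buffer.
 *
 * Stores up to max_out request sizes in sizes[] and returns the total
 * number of requests needed to transfer len bytes starting at addr.
 */
size_t split_request(uintptr_t addr, size_t len, size_t page_size,
                     size_t max_pages, size_t *sizes, size_t max_out)
{
    size_t count = 0;

    assert(page_size > 0 && max_pages > 0);

    while (len > 0) {
        size_t off = (size_t)(addr % page_size);    /* offset within the page */
        size_t room = max_pages * page_size - off;  /* bytes one request can hold */
        size_t chunk = len < room ? len : room;

        if (count < max_out)
            sizes[count] = chunk;
        count++;
        addr += chunk;
        len -= chunk;
    }
    return count;
}
```

under this model, a 128K read into a buffer that starts 16 bytes past a 4k boundary needs two requests with max_pages = 32 (131056 bytes, then 16 bytes) and only one with max_pages = 33, which matches the behaviour reported for the two values of FUSE_MAX_PAGES_PER_REQ.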