From: Goswin v. B. <gos...@we...> - 2011-12-18 18:15:33
Xing Jing <xin...@gm...> writes:
> Hi, everyone, glad to meet you here.
>
> I've used FUSE to implement a distributed file system for about three
> years. FUSE is really very good: it is easy to learn and use, and a
> user-mode file system is much easier to develop.
>
> At present, I have a problem with client performance in my distributed
> file system. Although I use multiple threads to read/write or create
> files, only one request from the client can be submitted to the server
> at a time. That means even if we configure a very powerful computer as
> the client and multiple servers to provide service, the whole system
> can still only process requests one by one because of this limitation
> in the client.
>
> I then checked the FUSE source code and found that most of the file
> system calls sent from the FUSE kernel module to user mode are sent in
> blocking mode. Could there be several queues processing the blocking
> requests to provide higher performance?
>
> Does anyone else have the same problem? If you have a solution, would
> you please tell me? Thank you very much.
>
> Jing
> Dec. 17th

FUSE can operate in 3 modes:

1) single threaded
   Every request blocks until it is finished.

2) multi threaded
   Libfuse starts a new thread (or reuses an idle one) for every
   request, and multiple requests can be processed in parallel, one per
   thread. There is a limit on the number of pending requests, and some
   operations block others. The former can be changed, iirc, while the
   latter is required for correct behaviour and nothing can be done
   about it.

3) asynchronous
   Libfuse doesn't have a ready-to-use main loop for this. But you can
   include the fuse FD in your select/poll/epoll loop and call the
   receive and process functions in libfuse yourself. You can read
   requests from the FD and reply to them in any order; the kernel
   doesn't care. The limit on the number of pending requests and the
   locking between some operations remain.
This is useful with non-blocking IO like you have in a networked
filesystem, when your FS is just a repeater and doesn't need the CPU
power of multiple cores. The parallelism comes from interleaving the
requests in a single thread.

You also have the choice of using read/write or splice operations.
Especially with splicing, the kernel side will not be the bottleneck: a
single queue is easily fast enough to splice all the requests you want
in no time. Even with read/write, which means memcpy() calls, I doubt
you can make your filesystem faster than the kernel can memcpy() from a
single queue. In your case I bet splice would be applicable.

MfG
        Goswin