Thread: [Ocamlfuse-devel] Missing features
From: Goswin v. B. <bre...@in...> - 2007-04-05 17:57:05
|
Hi,

I played around a bit with ocaml-fuse and ran into some minor problems:

- Can I just leave out fuse bindings and get proper errors if someone tries to use them? Do I get an ENOSYS then?

- Is it OK to throw the usual Unix.Unix_error exceptions one would expect from the respective system calls for each binding? What if some other exception gets thrown?

- There is no create() binding. Newer kernels use it, so it might be nice to add.

- 'mknod : string -> int -> unit' is missing the 3rd argument for the major/minor of device special files.

- 'fopen : string -> Unix.open_flag list -> int option;'
  Fuse open() can return an arbitrary file handle in the struct fuse_file_info, which is passed to all file operations (read/write/flush/release/...). The file handle is a uint64_t, so it is big enough for 64bit inodes, pointers or block addresses for large disks.

  I actually would like to store an ocaml value in the file handle, which leads me to another problem:

- How do I tell the GC about ocaml values that I pass to the fuse C code?

  For now I have a (Inode.t, FileHandle.t) Hashtbl.t where Inode.t is an int (31 bit only on 32bit cpus), which could be limiting and slower.

- fuse.h contains lots of comments for callbacks. It might be nice to adapt them to the respective ocamlfuse bindings.

Regards
Goswin
|
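The Hashtbl workaround mentioned in the message above can be sketched roughly as follows. This is only an illustration, not ocamlfuse code: the names `handle_table`, `create`, `register`, `lookup` and `release` are hypothetical, and the table is keyed by a freshly allocated integer handle rather than by inode. Because the OCaml values live in an OCaml-side table, they stay reachable for the GC, and only the plain integer would ever be handed to the C side.

```ocaml
(* Hypothetical helper, not part of ocamlfuse: a table that maps small
   integer handles to arbitrary OCaml values. *)
type 'a handle_table = {
  table : (int, 'a) Hashtbl.t;   (* handle -> value *)
  mutable next : int;            (* next unused handle *)
}

let create () = { table = Hashtbl.create 16; next = 0 }

(* Store [v] and return the integer handle that stands for it. *)
let register t v =
  let h = t.next in
  t.next <- h + 1;
  Hashtbl.replace t.table h v;
  h

(* Look a handle up again, e.g. from a read or write callback. *)
let lookup t h = Hashtbl.find_opt t.table h

(* Forget a handle, e.g. from the release callback. *)
let release t h = Hashtbl.remove t.table h
```

A filesystem could call `register` in its open callback, return the handle, and call `lookup` in read/write. Handles are never reused in this sketch, which keeps it simple at the cost of eventually exhausting the integer range.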
From: <ci...@di...> - 2007-04-06 22:00:15
|
Goswin von Brederlow <bre...@in...> writes:

> Hi,
>
> I played around a bit with ocaml-fuse and run into some minor
> problems:
>

I am traveling again, this time for personal reasons that I did not know about beforehand, so I will save your e-mail and a fresh copy of ocamlfuse CVS on a pen drive and answer your questions tomorrow on my laptop while on the train :)

See you in the next few days

Vincenzo
|
From: Vincenzo C. <ci...@di...> - 2007-04-07 20:48:44
|
On Thu, 05/04/2007 at 19.56 +0200, Goswin von Brederlow wrote:

> Hi,
>
> I played around a bit with ocaml-fuse and run into some minor
> problems:

Thank you very much for reporting. Please feel free to send any more comments, bugs or RFEs (and patches of course :) )

> - Can I just leave out fuse binding and get proper errors if someone
>   tries to use them? Do I get an ENOSYS then?

Yes, you get ENOSYS as in C.

> - Is it OK to throw the usual Unix.unix_error exceptions one would
>   expect from the respective system calls for each binding? What if
>   some other exception gets thrown?

You can throw Unix_error, and error translation is done in Fuse_util.c, in a way that sucks, but I didn't find anything better. Oh well, I do have an idea really; I should think about that. Other errors are caught in Fuse_lib.named_op and its clones (notice the bad coding practice there):

    let cb x =
      try Ok (f x) with
        Unix.Unix_error (err,_,_) -> Bad err
      | _ -> Bad Unix.ERANGE (* TODO: find a better way to signal the user and log this *)

As you can see, it is already on my todo list to improve the situation. This is one of the robustness constraints that I imposed on ocamlfuse: no non-unix exception can escape from a fuse operation.

> - no create() binding, newer kernels use that and it might be nice to
>   add

Ok, will do this ASAP, thank you for the input.

> - 'mknod : string -> int -> unit' is missing the 3rd argument for the
>   major/minor of device special files

Was it introduced recently? I have lost contact with fuse in the last months; maybe the whole interface has to be updated to a newer fuse version.

> - 'fopen : string -> Unix.open_flag list -> int option;'
>   Fuse open() can return an arbitrary file handle in the struct
>   fuse_file_info which is passed to all file operations
>   (read/write/flush/release/...). The file handle is an uint64_t so it
>   is big enough for 64bit inodes, pointers or block addresses for
>   large disks.
So you say that I should change fopen to return int64 option? This can be done of course. OTOH, it would be better if fopen could return an arbitrary type, but that would make the entire record type parametric, and I would have to solve the next point:

> I actually would like to store an ocaml value in the file handle
> which leads me to another problem:
>
> - How do I tell the GC about ocaml values that I pass to the fuse C
>   code?

There are ways, but I preferred not to solve this in the beginning and to leave it for a future version. It is another TODO item; at the least I should take a look and see whether it would be easy.

> For now I have a (Inode.t, FileHandle.t) Hashtbl.t where Inode.t is
> an int (31 bit only on 32bit cpus) which could be limiting and
> slower.

Yes, it is slower, but not by that much. Your bottleneck is bound to be I/O if you do real I/O and not memory access; however, your mileage may vary.

> - fuse.h contains lots of comments for callbacks. Might be nice to
>   adapt them to the respective ocamlfuse bindings.

Yes, but I wouldn't like to copy them by hand and then have to track changes in the original. I haven't found a solution yet.

In your other e-mail, you ask what I use for I/O. I use operations in the Largefile package, which are fast and nonblocking. The interface of the read and write functions in ocamlfuse is designed to make it easy to use the Largefile package; see fusexmp.ml for an example and the type of buffer at the beginning of Fuse.mli. Perhaps your concerns were about blocking I/O, but I can assure you it's nonblocking in recent (>= 3.08 at least) versions of ocaml.

Thanks again, and don't hesitate to contact me whenever you want.

Bye

Vincenzo
|
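The exception handling discussed in this message, including the TODO about logging, could look roughly like the sketch below. It is an assumption-laden illustration rather than the code in Fuse_lib: `guard`, `outcome`, `errno` and `Fs_error` are hypothetical names, `Fs_error` stands in for `Unix.Unix_error` so that the sketch does not depend on the unix library, and reporting `EIO` for unexpected exceptions is a choice made here (the original code reports `Unix.ERANGE`).

```ocaml
(* Stand-ins assumed for illustration: a real binding would match on
   Unix.Unix_error and carry Unix.error values instead. *)
type errno = ENOENT | EIO
exception Fs_error of errno

(* Result of running one filesystem callback. *)
type 'a outcome = Done of 'a | Failed of errno

(* Run callback [f] on [x] and make sure no exception escapes to the
   C side.  [name] is only used to label the log message. *)
let guard name f x =
  try Done (f x) with
  | Fs_error e -> Failed e
  | exn ->
    (* Unexpected exception: log it, then report a generic I/O error. *)
    prerr_endline (name ^ ": unexpected exception " ^ Printexc.to_string exn);
    Failed EIO
```

For example, `guard "read" my_read args` would yield `Failed ENOENT` if `my_read` raised `Fs_error ENOENT`, and would log and yield `Failed EIO` for any other exception.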
From: Goswin v. B. <bre...@in...> - 2007-04-08 01:33:07
|
Vincenzo Ciancia <ci...@di...> writes:

> On Thu, 05/04/2007 at 19.56 +0200, Goswin von Brederlow wrote:
>> - no create() binding, newer kernels use that and it might be nice to
>>   add
>
> Ok, will do this ASAP, thank you for input.

This one is new.

>> - 'mknod : string -> int -> unit' is missing the 3rd argument for the
>>   major/minor of device special files
>
> Was it introduced recently? I have lost the contact with fuse in the
> last months, maybe the whole interface has to be updated to a newer fuse
> version.

The ocamlfuse code says it uses fuse 2.2. I have libfuse2 2.6.3 installed:

/usr/include/fuse/fuse.h:
/* IMPORTANT: you should define FUSE_USE_VERSION before including this
   header. To use the newest API define it to 26 (recommended for any
   new application), to use the old API define it to 21 (default) 22
   or 25, to use the even older 1.X API define it to 11. */

But I would guess mknod has always had that third argument. It is optional though; it only has meaning when creating a block or char special device.

>> - 'fopen : string -> Unix.open_flag list -> int option;'
>>   Fuse open() can return an arbitrary file handle in the struct
>>   fuse_file_info which is passed to all file operations
>>   (read/write/flush/release/...). The file handle is an uint64_t so it
>>   is big enough for 64bit inodes, pointers or block addresses for
>>   large disks.
>
> So you say that I should change fopen to return int64 option? This can
> be done of course. Otoh, it would be better if fopen could return an
> arbitrary type but that would make the entire record type parametric -
> and I would have to solve next point:

I think a better solution would be to have 2 fuse modules, like Unix and Unix.Largefile. The simple one would have open, read and write with an int, like now. The more complex one would have open return an arbitrary value, or an ocaml version of fuse_file_info, and pass that to read/write.

> In your other e-mail, you ask what do I use for I/O.
> I use operations in the Largefile package, which are fast and
> nonblocking - the interface of the read and write functions in
> ocamlfuse is designed to make it easy to use the Largefile package,
> see fusexmp.ml for an example and the type of buffer in the beginning
> of Fuse.mli. Perhaps your concerns were about blocking I/O but I can
> assure you it's nonblocking in recent (>= 3.08 at least) versions of
> ocaml.

Ugh? By default it should be blocking unless you open with O_NONBLOCK. And then the read/write returns when it can't read/write enough data directly. But it does not continue reading/writing in the background afaik.

With libaio you fire off all your reads and/or writes and then you can check or wait for completions to come in. You can keep working between firing off the request and the completion, and the kernel will do its thing in the background.

I plan to have data split across multiple physical disks, so when a read/write comes in it is likely to involve multiple FDs that should all read/write in parallel for maximum throughput. From a theoretical point of view it is the difference between 40MB/s and 200MB/s. If I get 100MB/s I'm happy, if I get 20MB/s I'm sad.

> Thanks again, and don't hesitate to contact me whenever you want.

How was your trip?

> Bye
>
> Vincenzo

Regards
Goswin
|
From: Vincenzo C. <ci...@di...> - 2007-04-08 11:09:58
|
On Sun, 08/04/2007 at 03.32 +0200, Goswin von Brederlow wrote:

>> Ok, will do this ASAP, thank you for input.
>
> This one is new.

You're right. Time is scarce and things to do are many. I think that I should update the whole binding to a very recent fuse, and the create method is one thing to implement. If you have time, make a list of all the changes; that will ease my work a lot. Otherwise I will start just by implementing create. It is much more time consuming to identify interface changes than it is to implement them in the bindings.

> But I would guess mknod had always had that third argument. It is
> optional though, it only has meaning when creating a block or char
> special device.

I remember that fuse didn't support special devices in the 2.4 series, but I might be proven wrong.

> I think a better solution would be to have 2 fuse modules. Like Unix
> and Unix.Largefile.

Well, maybe not, since the goal of the ocamlfuse layer is to mimic C as closely as possible, so the right thing to do seems to be to switch to 64 bit for the file handle size.

> Ugh? By default it should be blocking unless you open with
> O_NONBLOCK.

> With libaio you fire off all your reads and/or writes and then you can
> check or wait for completions to come in

Ok, I thought you meant "nonblocking" in the ocaml sense: foreign calls to C are always blocking by default, but if you know they won't affect the ocaml runtime, or they do proper locking, you can tag them as "nonblocking". In that case, your ocaml threads keep running while the system call is in progress. Then you can implement asynchronous operations in ocaml land very easily (and efficiently, since the ocaml runtime handles threads with minimal overhead). Maybe there is something I am forgetting here about libc; please forgive me if I make you repeat yourself.
> I plan to have data split across multiple physical disks so when a
> read/write comes in it is likely to involve multiple FDs that should
> all read/write in parallel for maximum throughput. From a theoretical
> point it is the difference between 40MB/s and 200MB/s. If I get
> 100MB/s I'm happy, if I get 20MB/s I'm sad.

Your operations (in particular, reads and writes) will be done in parallel by default; try that. I don't like how scheduling is currently done (a new ocaml thread is fired for each operation), but it is easy to change, like almost everything in ocaml. By the way, in ocaml 3.08 the stat-like calls were still not tagged as nonblocking, so a stat over a dvd in fusexmp blocked the entire filesystem process for some 10 seconds. I seem to recall that this problem has been solved in recent ocaml versions. It is easy to check; if not, I will add a non-blocking stat to the utility libraries of ocamlfuse.

>> Thanks again, and don't hesitate to contact me whenever you want.
>
> How was your trip?

It was long... I had to travel for 11 hours on Monday and Wednesday, and the same on Saturday. I hope to stay here for some days now :)

Vincenzo
|
From: Goswin v. B. <bre...@in...> - 2007-04-08 17:42:38
|
Vincenzo Ciancia <ci...@di...> writes:

> On Sun, 08/04/2007 at 03.32 +0200, Goswin von Brederlow wrote:
>
>> I think a better solution would be to have 2 fuse modules. Like Unix
>> and Unix.Largefile.
>
> Well, maybe not, since the goal of the ocamlfuse layer is to mimic C as
> close as possible, so that the right thing to do seems to switch to 64
> bit for the file handle size.

That is the only solution I see that doesn't require handing (pointers to) ocaml objects to the libfuse code. If you have, say,

    type open_result = FD of Unix.file_descr | Int64 of Int64.t

then you would have to put an enum { FD, INT64 } and the actual value into fuse_file_info.fh, which only has room for 64 bits. So you would have to store a pointer to the open_result in there and prevent the GC from moving it around (malloc and copy it?).

Slightly more complex would be having

    type 'a operations = {
      ...
      fopen : string -> Unix.open_flag list -> 'a option;
      read : string -> buffer -> int64 -> 'a -> int;
      write : string -> buffer -> int64 -> 'a -> int;
      release : string -> Unix.open_flag list -> 'a -> unit;
      flush : string -> 'a -> unit;
      fsync : string -> bool -> 'a -> unit;
      ...
    }

in which case you also have to tell the GC about the 'a object you store in the fuse_file_info so it can follow any pointers it contains.

>> Ugh? By default it should be blocking unless you open with
>> O_NONBLOCK.
>
>> With libaio you fire off all your reads and/or writes and then you can
>> check or wait for completions to come in
>
> Ok, I thought you meant "nonblocking" in the ocaml sense: foreign calls
> to C are always blocking by default, but if you know they won't affect
> the ocaml runtime, or do proper locking, you can tag them as
> "nonblocking". In that case, your ocaml threads keep running while the
> system call is in progress. Then you can implement asynchronous
> operations in ocaml land very easily (and efficiently, since the ocaml
> runtime handles threads with minimal overhead).
> Maybe there is something I am forgetting here about libc, please
> forgive me if I will make you repeat yourself.

Nice to know. I was trying to avoid threads, maybe even disable the fuse threads too, for simplicity's sake.

>> I plan to have data split across multiple physical disks so when a
>> read/write comes in it is likely to involve multiple FDs that should
>> all read/write in parallel for maximum throughput. From a theoretical
>> point it is the difference between 40MB/s and 200MB/s. If I get
>> 100MB/s I'm happy, if I get 20MB/s I'm sad.
>
> Your operations (in particular, reads and writes) will be done in
> parallel by default, try that. I don't like how current scheduling is
> done (a new ocaml thread is fired for each operation) but it is easy to
> change, like almost everything in ocaml. By the way, in ocaml 3.08, the
> stat-alike calls where still not tagged as nonblocking, resulting in a
> stat over a dvd in fusexmp to block the entire filesystem process for
> some 10 seconds. I seem to recall this problem has been solved in recent
> ocaml versions, however it is easy to check, if not, I will add a
> non-blocking stat to the utility libraries of ocamlfuse.

Multiple reads on the fuse FS will be split into threads. I'm talking about a single read, say 64K, in one call. With 2 disks I have to read 32K from each disk, with 4k blocks alternating between disks. With libaio I hope that I can prepare the 16 read requests, fire them off to the kernel, and have the kernel actually read only two 32k chunks from the disks. If I use 2 threads (one for each disk) and have each do 8 4k reads on its disk, then the kernel would most likely read 8 times 4k from each disk (unless I'm really lucky and read-ahead works).

Anyway, that is the theory.

Regards
Goswin
|
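The arithmetic in the message above (one 64K read over 2 disks with alternating 4k blocks gives 16 requests, which ideally collapse into two 32k chunks) can be checked with a small sketch. It only models the block layout; it does no I/O and does not use libaio. The names `req`, `split` and `coalesce` are hypothetical, and the round-robin layout with block-aligned offsets is an assumption taken from the description.

```ocaml
(* One request against one disk: which disk, byte offset on that disk,
   and length in bytes. *)
type req = { disk : int; off : int; len : int }

(* Split a block-aligned read of [len] bytes at logical offset [off]
   into one request per block, with blocks alternating round-robin
   over [ndisks] disks. *)
let split ~ndisks ~block ~off ~len =
  let first = off / block and count = len / block in
  List.init count (fun i ->
    let b = first + i in
    { disk = b mod ndisks; off = (b / ndisks) * block; len = block })

(* Merge requests that are adjacent on the same disk, which is what
   one hopes the kernel does with the queued requests. *)
let coalesce reqs =
  let sorted = List.sort compare reqs in
  List.fold_left
    (fun acc r ->
       match acc with
       | p :: rest when p.disk = r.disk && p.off + p.len = r.off ->
         { p with len = p.len + r.len } :: rest
       | _ -> r :: acc)
    [] sorted
  |> List.rev
```

With `split ~ndisks:2 ~block:4096 ~off:0 ~len:65536` this produces 16 requests of 4096 bytes, and `coalesce` reduces them to one 32768-byte request per disk.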
From: Vincenzo C. <ci...@di...> - 2007-04-09 17:48:08
|
On Sun, 08/04/2007 at 19.41 +0200, Goswin von Brederlow wrote:

> Slightly more complex would be having
>
> type 'a operations = {
>   ...
>   fopen : string -> Unix.open_flag list -> 'a option;
>   read : string -> buffer -> int64 -> 'a -> int;
>   write : string -> buffer -> int64 -> 'a -> int;
>   release : string -> Unix.open_flag list -> 'a -> unit;
>   flush : string -> 'a -> unit;
>   fsync : string -> bool -> 'a -> unit;
>   ...
> }
>

I know, but I will implement the int64 handle ASAP, and the polymorphic type as soon as I have real time to spare (read: as soon as someone else does that :) ).

On the other hand, I have implemented the third argument of mknod (not yet in CVS), but it seems to give random values, i.e.: what is the damn size of dev_t? Once I was aware of all these issues, but now I am always in a hurry and need a bit of help here. I am not sure I understand why mknod is useful in fuse; is it just for FIFOs? Can you provide a test case (i.e. a shell command) that works with fusexmp.c?

Bye and thanks

Vincenzo
|
From: Goswin v. B. <bre...@in...> - 2007-04-10 04:19:14
|
Vincenzo Ciancia <ci...@di...> writes:

> On Sun, 08/04/2007 at 19.41 +0200, Goswin von Brederlow wrote:
> On the other hand, I have implemented the third argument of MKNOD (not
> yet in CVS) but it seems to give random values, i.e.: what is the damn
> size of dev_t? Once, I was aware of all these issues, but now I am
> always in a hurry and need a bit of help here. I am not sure I got why
> mknod is useful in fuse, is it just for FIFOs? Can you provide a test
> case (i.e. a shell command) that works with fusexmp.c?
>
> Bye and thanks
>
> Vincenzo

The size of dev_t will be sizeof(dev_t) :) It can be 16bit or 64bit, I think. I guess you should use Int64.t, (Int32.t, Int32.t) or { major : Int32.t; minor : Int32.t }.

In the absence of a create function, or on older kernels, you need mknod to create files, but then the 3rd argument will be unset. The same goes for a fifo or socket, as I understand it. I'm assuming fuse mknod follows the manpage for mknod(2):

    int mknod(const char *pathname, mode_t mode, dev_t dev);

    DESCRIPTION
    The system call mknod() creates a filesystem node (file, device
    special file or named pipe) named pathname, with attributes
    specified by mode and dev.
    ...
    If the file type is S_IFCHR or S_IFBLK then dev specifies the major
    and minor numbers of the newly created device special file;
    otherwise it is ignored.

As a testcase, a simple "touch foo" for a file and "mknod foo b 8 1" for an actual block device should do.

But I found another problem. The Unix module doesn't have a mknod function. Something to add to unix_stubs.c, I guess.

Regards
Goswin
|
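The two representations suggested above for the device argument, a single Int64.t or a { major; minor } record, can be related by a pair of conversion functions. The sketch below is purely illustrative: `dev`, `pack` and `unpack` are hypothetical names, and the layout (major in the upper 32 bits, minor in the lower 32 bits) is an assumption chosen for simplicity. It is not the bit layout the kernel or libc uses for dev_t, so a real binding would still have to convert on the C side.

```ocaml
(* Device number as a record of two 32-bit halves. *)
type dev = { major : int32; minor : int32 }

(* Pack into one int64: major in the upper 32 bits, minor in the lower
   32 bits.  Illustrative layout only, not the real dev_t encoding. *)
let pack d =
  Int64.logor
    (Int64.shift_left (Int64.of_int32 d.major) 32)
    (* Mask the minor so that sign extension cannot spill into the
       major half. *)
    (Int64.logand (Int64.of_int32 d.minor) 0xFFFF_FFFFL)

(* Inverse of [pack]. *)
let unpack n =
  { major = Int64.to_int32 (Int64.shift_right_logical n 32);
    minor = Int64.to_int32 n }
```

For the "mknod foo b 8 1" test case mentioned above, the record would be `{ major = 8l; minor = 1l }`, and `unpack (pack d)` returns the same record.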
From: Vincenzo C. <ci...@di...> - 2007-04-10 13:49:05
|
> The size of dev_t will be sizeof(dev_t) :)

> As testcase a simple "touch foo" for file and "mknod foo b 8 1" for an
> actual block device should do.

Ok, I am back inside C, unix, ocaml and fuse. Now I have things a little clearer. I already added an mknod(2) binding to unix_stubs.c, though I still have to sort out my int size problems a little. I have, however, understood what was going wrong in my tests with fusexmp.c: I was using mknod(1) and expecting to see its first _numerical_ argument as mode and its second _numerical_ argument as rdev. Of course it's not that way, me dumb. And I was looking at the manpage, too :)

Ok, you can start holding your breath. Regarding create and destroy, these will be my next step if they are in 2.4; otherwise I will start the update to 2.6, which has to be done sooner or later.

Bye

Vincenzo
|
From: Goswin v. B. <bre...@in...> - 2007-04-10 11:30:33
|
Hi,

I found another callback that is missing:

    /**
     * Clean up filesystem
     *
     * Called on filesystem exit.
     *
     * Introduced in version 2.3
     */
    void (*destroy) (void *);

where the void * is the private_data field returned by init.

Regards
Goswin
|