From: Miklos S. <mi...@sz...> - 2006-06-15 13:22:57
> I've encountered a problem when trying to do some concurrency testing of
> a FUSE device I've been working on. The test in question creates two
> threads and performs a set of ls calls to the device for several
> iterations checking the results each time.
>
> Most of the time the test seems to work fine, but from time to time the
> test just hangs, and worse than that ends up hanging the entire box,
> forcing a reboot.

What kind of a hang is it? Unrelated applications don't respond
either? Does the machine respond to SysRq (e.g. Alt-SysRq-t)
commands?

What is the fuse version? What is the kernel version?

> Initially I was concerned that the problem was within
> my own code so I inserted lots of trace messages to see which of my
> methods the hang was occurring in. I was surprised to discover that the
> hang occurs completely outside my code, which implies that it is
> occurring somewhere in the FUSE code.
>
> What I'm seeing with the trace is that the hang occurs directly after a
> call to getattr has finished. Running the device in debug mode shows
> that the getattr has completed successfully, but it then doesn't go on
> to call either opendir or readdir, and at this point the hang could
> occur before either of those calls.
>
> If I run the device in single threaded mode I can't reproduce the
> problem at all which suggests that this is a multithreading issue.
>
> My only concern is over the version of compiler we are using for our
> development code. The linux kernel (SUSE SLES 9 SP2) is compiled using
> gcc 3.3.3 as is the FUSE library. Our developed code is compiled using
> gcc 3.4.5. Could this cause a problem?

I don't think so. The only important thing is that the kernel and the
fuse module are compiled with exactly the same version of gcc.

Thanks,
Miklos
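
For readers trying to reproduce the report, the test described in the quoted text (two threads listing the same directory for several iterations and checking the results each time) could be sketched roughly as below. This is not the original poster's test, which was not posted. The function names, the iteration count, and the timeout are all assumptions, and `os.listdir` stands in for the `ls` calls, so it exercises directory listing but not everything `ls` does.

```python
import os
import threading


def list_repeatedly(path, expected, iterations, failures):
    """List `path` repeatedly; record any listing that differs from `expected`."""
    for i in range(iterations):
        names = sorted(os.listdir(path))
        if names != expected:
            failures.append((threading.current_thread().name, i, names))


def run_concurrent_ls(path, num_threads=2, iterations=100, timeout=60.0):
    """Return (failures, hung_thread_names) for concurrent listings of `path`."""
    # Baseline listing taken once, before any worker thread starts.
    expected = sorted(os.listdir(path))
    failures = []
    threads = [
        threading.Thread(
            target=list_repeatedly,
            args=(path, expected, iterations, failures),
            name=f"ls-{n}",
            daemon=True,
        )
        for n in range(num_threads)
    ]
    for t in threads:
        t.start()
    # Join with a timeout so a stalled listing is reported instead of
    # being waited on forever.
    for t in threads:
        t.join(timeout)
    hung = [t.name for t in threads if t.is_alive()]
    return failures, hung
```

Calling `run_concurrent_ls("/mnt/fuse-test")` (a hypothetical mount point) returns the mismatched listings and the names of any threads still blocked after the timeout. A hang that takes down the whole machine, as in the report, would of course not be caught by the timeout.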