From: Csaba H. <csa...@cr...> - 2007-05-27 23:17:38
Hi,

The following was seen with a 2.6.21 kernel. I mounted the null fs (I mean,
example/null.c) on /mnt/nullmp, after hacking it as follows:

================================================================
diff -r 3c7762a3379a example/null.c
--- a/example/null.c Thu May 10 22:56:46 2007 +0000
+++ b/example/null.c Mon May 28 00:21:12 2007 +0200
@@ -58,11 +58,14 @@ static int null_read(const char *path, c
     (void) buf;
     (void) offset;
     (void) fi;
+    size_t rsiz;
 
     if(strcmp(path, "/") != 0)
         return -ENOENT;
 
-    return size;
+    rsiz = (size + 1) >> 1;
+    memset(buf, 'x', rsiz);
+    return rsiz;
 }
 
 static int null_write(const char *path, const char *buf, size_t size,
@@ -75,7 +78,7 @@ static int null_write(const char *path,
     if(strcmp(path, "/") != 0)
         return -ENOENT;
 
-    return size;
+    return (size + 1) >> 1;
 }
 
 static struct fuse_operations null_oper = {
================================================================

And then:

----------------------------------------------------
$ cat foo.c
#include <unistd.h>

main(int argc, char **argv)
{
        char buf[4096];

        if (isatty(1)) {
                read(0, buf, 4096);
                read(0, buf, 4096);
                read(0, buf, 4096);
        } else
                write(1, buf, 2);
}
$ gcc foo.c -o foo
$ strace ./foo < /mnt/nullmp
[...]
ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
read(0, "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"..., 4096) = 4096
read(0, "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"..., 4096) = 4096
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
exit_group(4096) = ?
Process 22117 detached
$ strace ./foo > /mnt/nullmp
[...]
ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbfd0d098) = -1 ENOTTY (Inappropriate ioctl for device)
write(1, "\0\0", 2) = -1 EIO (Input/output error)
exit_group(-1) = ?
Process 28034 detached
----------------------------------------------------

In the first case (reading), the fs debug output was:

----------------------------------------------------
OPEN[0] flags: 0x8000 /
unique: 4, opcode: FLUSH (25), nodeid: 1, insize: 64
FLUSH[0]
   unique: 4, error: -38 (Function not implemented), outsize: 16
unique: 5, opcode: READ (15), nodeid: 1, insize: 64
READ[0] 16384 bytes from 0
   READ[0] 8192 bytes
   unique: 5, error: 0 (Success), outsize: 8208
unique: 6, opcode: READ (15), nodeid: 1, insize: 64
READ[0] 32768 bytes from 16384
   READ[0] 16384 bytes
   unique: 6, error: 0 (Success), outsize: 16400
unique: 7, opcode: RELEASE (18), nodeid: 1, insize: 64
RELEASE[0] flags: 0x8000
   unique: 7, error: 0 (Success), outsize: 16
----------------------------------------------------

When writing, the fs debug output was:

----------------------------------------------------
unique: 8, opcode: SETATTR (4), nodeid: 1, insize: 128
   unique: 8, error: 0 (Success), outsize: 112
unique: 9, opcode: GETATTR (3), nodeid: 1, insize: 40
   unique: 9, error: 0 (Success), outsize: 112
unique: 10, opcode: OPEN (14), nodeid: 1, insize: 48
   unique: 10, error: 0 (Success), outsize: 32
OPEN[0] flags: 0x8001 /
unique: 11, opcode: WRITE (16), nodeid: 1, insize: 66
WRITE[0] 2 bytes to 0
   WRITE[0] 1 bytes
   unique: 11, error: 0 (Success), outsize: 24
unique: 12, opcode: RELEASE (18), nodeid: 1, insize: 64
RELEASE[0] flags: 0x8001
   unique: 12, error: 0 (Success), outsize: 16
----------------------------------------------------

So, when the fs daemon sends back a short read, the kernel silently accepts
it and fills the missing part with zeroes (at least if the end of the
to-be-read range doesn't exceed the file size); when the daemon sends back a
short write, the kernel paranoidly returns an error.

Is this the intended behaviour? It seems counter-intuitive to me, at least.
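
For reference, the reply sizes in the debug output follow directly from the patched handlers, which return (size + 1) >> 1, that is, half the requested size rounded up. A minimal sketch of just that arithmetic (the helper name half_rounded_up is made up for illustration and does not appear in null.c):

```c
#include <stddef.h>

/* Same expression as in the patched null_read()/null_write():
 * half of the requested size, rounded up.
 * The function name is illustrative only; it is not part of null.c. */
size_t half_rounded_up(size_t size)
{
    return (size + 1) >> 1;
}
```

With the request sizes from the logs this gives 8192 for 16384, 16384 for 32768, and 1 for 2, which matches the indented READ[0] and WRITE[0] reply lines.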
I'd naively expect the kernel to either (a) stop the I/O when less data is
transferred than was requested, or (b) retry until the requested size is
reached cumulatively, or until the daemon responds with an error or a 0-sized
I/O. Moreover, I'd expect the chosen strategy to be the same for reading and
writing. (Btw, I do see the (a)-type behaviour with direct_io.)

Csaba
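
To make strategy (b) concrete, here is a rough userspace sketch of the retry semantics described above. It only illustrates the loop; it is not a claim about what the kernel does or should do internally. The helper names read_fully and write_fully are hypothetical, and only plain POSIX read()/write() are assumed.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper for strategy (b): keep calling read() until
 * `count` bytes have accumulated, or read() fails, or it returns 0.
 * Returns the number of bytes read, or -1 on error (errno set by
 * read(); any partial progress is discarded in that case). */
ssize_t read_fully(int fd, void *buf, size_t count)
{
    size_t done = 0;

    while (done < count) {
        ssize_t n = read(fd, (char *)buf + done, count - done);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted, nothing transferred: retry */
            return -1;          /* real error: give up */
        }
        if (n == 0)
            break;              /* 0-sized I/O (end of file): stop */
        done += (size_t)n;
    }
    return (ssize_t)done;
}

/* The same strategy for writing, so both directions behave alike. */
ssize_t write_fully(int fd, const void *buf, size_t count)
{
    size_t done = 0;

    while (done < count) {
        ssize_t n = write(fd, (const char *)buf + done, count - done);

        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            break;              /* no progress possible: stop */
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

Under (b), a caller asking for 4096 bytes would get 4096 back unless the other side reports an error or a 0-sized transfer. Under (a), the first short result would simply be handed back to the caller as is.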