From: SourceForge.net <no...@so...> - 2009-11-09 21:09:38
|
Bugs item #2888099, was opened at 2009-10-28 18:06 Message generated for change (Comment added) made by ferrieux You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110894&aid=2888099&group_id=10894 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: 25. Channel System Group: development: 8.6b1.1 Status: Open Resolution: None Priority: 5 Private: No Submitted By: Konstantin Khomoutov (flatworm) Assigned to: Andreas Kupries (andreas_kupries) Summary: [close] discards buffered data when no space left on device Initial Comment: [close] performed on an opened file silently discards any buffered data written into it when there's no space left on the underlying device. [flush] works in this case, but the documentation explicitly states that "all buffered output is flushed to the channel's output device, any buffered input is discarded". Reproducible on Linux (Debian Lenny running 2.6.26 kernel) both with the current CVS HEAD and 8.5.3. Below are the steps to reproduce the error on Linux and the printouts of sample Tcl sessions: 1. Create a file "/var/tmp/blockdev" 1 megabyte in size: $ dd if=/dev/zero of=/var/tmp/blockdev bs=1024 count=1024 1024+0 records in 1024+0 records out 1048576 bytes (1.0 MB) copied, 0.00824364 s, 127 MB/s 2. Create ext3 filesystem on it: $ /sbin/mkfs.ext3 /var/tmp/blockdev mke2fs 1.41.3 (12-Oct-2008) /var/tmp/blockdev is not a block special device. Proceed anyway? (y,n) y Filesystem label= OS type: Linux Block size=1024 (log=0) Fragment size=1024 (log=0) 128 inodes, 1024 blocks 51 blocks (4.98%) reserved for the super user First data block=1 Maximum filesystem blocks=1048576 1 block group 8192 blocks per group, 8192 fragments per group 128 inodes per group [...] 3. Set "reserved blocks count" to 0 on the filesystem: $ /sbin/tune2fs -r 0 /var/tmp/blockdev tune2fs 1.41.3 (12-Oct-2008) Setting reserved blocks count to 0 4. Learn the block size and the amount of free blocks on the resulting FS: $ /sbin/dumpe2fs -h /var/tmp/blockdev |egrep -i 'free blocks|block size' dumpe2fs 1.41.3 (12-Oct-2008) Free blocks: 986 Block size: 1024 5. Go root, create a mount point "/tmp/blockdev" and mount our file onto it: $ su # mkdir /tmp/blockdev # mount -o loop /var/tmp/blockdev /tmp/blockdev 6. Create a file which will occupy all the available space on the FS: # dd if=/dev/zero of=/tmp/blockdev/blob bs=1024 count=986 dd: writing `/tmp/blockdev/blob': No space left on device 982+0 records in 981+0 records out 1004544 bytes (1.0 MB) copied, 0.0069901 s, 144 MB/s 7. Verify no free space is really left: # /sbin/dumpe2fs -h /var/tmp/blockdev | grep -i 'free blocks' dumpe2fs 1.41.3 (12-Oct-2008) Free blocks: 0 8. Run Tcl, attempt to write 5 bytes to a fully buffered file channel, close the stream, see the data discarded: # ~kostix/tcl86/bin/tclsh8.6 % set fd [open /tmp/blockdev/foo w] file5 % chan config $fd -buffering full % puts $fd data % unset -nocomplain errorCode errorInfo % close $fd % set errorCode can't read "errorCode": no such variable % set errorInfo can't read "errorCode": no such variable while executing "set errorCode" 9. Same as before, but do [flush] and see it complain as expected: # ~kostix/tcl86/bin/tclsh8.6 % set fd [open /tmp/blockdev/foo w] file5 % chan config $fd -buffering full % puts $fd data % unset -nocomplain errorCode errorInfo % flush $fd error flushing "file5": no space left on device % set errorInfo error flushing "file5": no space left on device while executing "flush $fd" % set errorCode POSIX ENOSPC {no space left on device} % ---------------------------------------------------------------------- >Comment By: Alexandre Ferrieux (ferrieux) Date: 2009-11-09 22:09 Message: I don't have a unix at hand right now to dig further, but let me just clarify: You're probably right on my misinterpretation...however after single-stepping through the erroneous case I saw that the WriteBytes() in the snippet caused FlushChannel() to be called. So what happens after WriteBytes() in this snippet actually hides the ENOSPC which is already there... ---------------------------------------------------------------------- Comment By: Andreas Kupries (andreas_kupries) Date: 2009-11-09 19:38 Message: You are misunderstanding the snippet. It looks for a bypass message, yes, and moves such a message up for use by the caller, yes. When there is no message in the bypass then the regular reporting of whatever is in errno should happen. The bypass stuff will happen only if it is used by a channel driver (see tclIORefChan/tclIORefTrans.c), and otherwise the regular error reporting is in effect. Given what you are saying with regard to [fconfigure -translation binary] causing the reporting to happen as usual I believe that we should now look for the differences in code execution between the 'configure binary' and the regular case. Also note that the snippet you refer too is run before 'FlushChannel'. Or are you saying that the bypass are had a message, and is now overriding whatever came from 'FlushChannel' ? Or is this now an artefact of trying to simulate the 'no space left' with a virtual filesystem and/or channel ? ---------------------------------------------------------------------- Comment By: Alexandre Ferrieux (ferrieux) Date: 2009-11-09 18:28 Message: I think it is in the code beginning at line 3042 of tclIO.c: /* * When the channel has an escape sequence driven encoding such as * iso2022, the terminated escape sequence must write to the buffer. */ [...] WriteChars(chanPtr, "", 0); [...] if (statePtr->chanMsg != NULL) { if (interp != NULL) { Tcl_SetChannelErrorInterp(interp,statePtr->chanMsg); } As proof, just [fconfigure -translation binary]: the error "no space left on device" is reported again ! The crux of the problem is that the errorCode 28 (ENOSPC) is properly reported into the interpreter by FlushChannel, but in the snippet above, the detection is based on a bypass message. At this point there is no chanMsg, and all trace of an error is forgotten. I'd appreciate Andreas' presence to confirm the diagnosis and prescribe the medicine. ---------------------------------------------------------------------- Comment By: Donal K. Fellows (dkf) Date: 2009-11-09 17:44 Message: I can't see where the error is getting lost. Channel close-down is really rather complex... :-( ---------------------------------------------------------------------- Comment By: Jeff Rogers (dvrsn) Date: 2009-10-28 19:13 Message: ignore my comment, the script is getting killed by posix signal SIGXFSZ ---------------------------------------------------------------------- Comment By: Konstantin Khomoutov (flatworm) Date: 2009-10-28 19:06 Message: Reproduced on Windows XP for CVS HEAD: C:\>dd if=/dev/zero of=G:\blob bs=1024 count=1024 rawwrite dd for windows version 0.5. Written by John Newbigin <jn...@it...> This program is covered by the GPL. See copying.txt for details Error writing file: 112 Недостаточно места на диске 1001+0 records in 1000+0 records out C:\>c:\opt\86d\bin\tclsh86g.exe % set fd [open g:/foo w] filea02c18 % chan config $fd -buffering full % puts $fd data % close $fd % set fd [open g:/foo w] filea07bc0 % chan config $fd -buffering full % puts $fd data % flush $fd error flushing "filea07bc0": no space left on device % set errorCode POSIX ENOSPC {no space left on device} % close $fd % exit ---------------------------------------------------------------------- Comment By: Jeff Rogers (dvrsn) Date: 2009-10-28 19:00 Message: Tcl script to write too big a file raises an uncatchable error. This may be related. set fd [open /tmp/somerandomfile w] for {set c 64} {$c > 0} {incr c -1} { while {![catch {seek $fd [expr 1<<$c] current}]} { puts [tell $fd] } } seek $fd -20 current puts stderr "write" puts $fd "goodbye, world" puts stderr "close" catch {close $fd} puts stderr "done" ---------------------------------------------------------------------- Comment By: Konstantin Khomoutov (flatworm) Date: 2009-10-28 18:35 Message: Relevant trailer from the strace output of the test case with the silent [close]: open("/tmp/blockdev/foo", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 5 fcntl64(5, F_SETFD, FD_CLOEXEC) = 0 ioctl(5, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbfa60b18) = -1 ENOTTY (Inappropriate ioctl for device) write(1, "file5\r\n"..., 7) = 7 write(1, "% "..., 2) = 2 read(0, "chan config $fd -buffering full\n"..., 4096) = 32 write(1, "% "..., 2) = 2 read(0, "puts $fd test\n"..., 4096) = 14 write(1, "% "..., 2) = 2 read(0, "close $fd\n"..., 4096) = 10 write(5, "test\n"..., 5) = -1 ENOSPC (No space left on device) close(5) = 0 write(1, "% "..., 2) = 2 read(0, ""..., 4096) = 0 fcntl64(2, F_GETFL) = 0x2 (flags O_RDWR) fcntl64(2, F_SETFL, O_RDWR) = 0 fcntl64(1, F_GETFL) = 0x2 (flags O_RDWR) fcntl64(1, F_SETFL, O_RDWR) = 0 fcntl64(0, F_GETFL) = 0x2 (flags O_RDWR) fcntl64(0, F_SETFL, O_RDWR) = 0 write(4, "q"..., 1) = 1 close(4) = 0 futex(0x9eafbe8, FUTEX_WAKE_PRIVATE, 1) = 1 futex(0x9eafc28, FUTEX_WAKE_PRIVATE, 1) = 1 futex(0x9eafc2c, FUTEX_WAIT_PRIVATE, 3, NULL) = -1 EAGAIN (Resource temporarily unavailable) futex(0xb7a2dbd8, FUTEX_WAIT, 23837, NULL) = 0 futex(0x9eafbe8, FUTEX_WAKE_PRIVATE, 1) = 0 exit_group(0) = ? ---------------------------------------------------------------------- Comment By: Konstantin Khomoutov (flatworm) Date: 2009-10-28 18:30 Message: The following C program fails at the call to fclose() to demonstrate libc+kernel handle the situation correctly: #include <stdio.h> #include <string.h> int main (int argc, char **argv) { FILE *fd; char *test = "test\n"; int len, written, res; if (argc != 2) { fprintf(stderr, "usage: %s FILENAME\n", argv[0]); return 1; } fd = fopen(argv[1], "a"); if (fd == NULL) { perror("fopen failed"); return 2; } len = strlen(test); written = fwrite(test, 1, len, fd); if (written < len) { perror("fwrite failed"); return 3; } res = fclose(fd); if (res != 0) { perror("fclose failed"); return 4; } return 0; } $ gcc -o mkfile -Wall -W -Wextra mkfile.c then in the setting described above: # /tmp/mkfile /tmp/blockdev/foo fclose failed: No space left on device ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110894&aid=2888099&group_id=10894 |