Re: [TCLCORE] Inconsistent buffered I/O

I'm not quoting Tom's parent post here, because it's long and a
bit rambling.  But let me try to put things in perspective.
I'm not claiming that twenty years of Tcl'ers are fools for not
finding this: I've been actively developing Tcl for by far the
better part of those twenty years, and on the TCT for nearly
half of them. And every few months, I raise an eyebrow and
say to myself, "Hmm, peculiar that nobody ever discovered *that*
bug before." I don't see that as playing Chicken Little.

I don't think that you, Tom, have comprehended the proposed
solution entirely.  It's more or less:

  - Prior to a [read] (equivalently, a [gets]), check if there
    are buffered data to write. If there are, write them before
    reading the file. (Obviously, any data in read buffers may
    be retained.)

  - Prior to a [seek], check if there are buffered data to
    write. If there are, write them before seeking.

    There's an obvious performance tweak to bypass this flushing
    behaviour if the [seek] is not actually moving the cursor,
    and further performance might be obtainable if the library
    can determine that the new cursor is inside the current
    buffer. For now, I propose to ignore those optimizations,
    which represent an unusual case that requires a fair bit
    of bookkeeping to maintain consistency.

    As far as input buffers go, I'd propose discarding them on
    [seek] as well, but it wouldn't be a tremendous amount of
    bookkeeping to detect the case where the new cursor lies
    within the buffer and adjust accordingly. That optimization
    might be worth doing, because there's a plausible case where
    a program will read data from a file, use the data just read
    to infer the file's encoding, seek back to the beginning,
    configure a new encoding and read again.

  - Prior to a [puts], spoil any buffered input data, which will
    be incoherent after the write.  If the current cursor lies
    within an input buffer, it might be possible to retain the
    buffer by overwriting the data therein with the newly-written
    data. Again, I propose simply to ignore the case.

  - Prior to a [truncate], flush write buffers and discard read
    buffers.
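
For illustration only, the four rules above can be modelled in a few
lines of Python. This is a toy sketch, not Tcl's C implementation: the
class ToyChannel, its fields and the read_chunk read-ahead size are all
hypothetical names, and the "file" is just an in-memory byte array. The
model also assumes that discarding read-ahead means moving the cursor
back to where the reader logically was. None of the optimizations that
the rules set aside are attempted.

```python
class ToyChannel:
    """Toy model of one read/write channel; not Tcl's real implementation."""

    def __init__(self, data=b"", read_chunk=8):
        self.file = bytearray(data)   # the file as the operating system sees it
        self.pos = 0                  # operating-system cursor
        self.wbuf = bytearray()       # buffered output, destined for self.pos
        self.rbuf = b""               # input fetched ahead but not yet consumed
        self.read_chunk = read_chunk  # hypothetical read-ahead size

    def _write_out(self):
        """Hand buffered output to the 'file'; no OS-level flush is implied."""
        if not self.wbuf:
            return
        if self.pos > len(self.file):  # cursor past end of file: pad the gap
            self.file.extend(bytes(self.pos - len(self.file)))
        end = self.pos + len(self.wbuf)
        self.file[self.pos:end] = self.wbuf
        self.pos = end
        self.wbuf.clear()

    def _drop_input(self):
        """Discard read-ahead and move the cursor back to the reader's position."""
        self.pos -= len(self.rbuf)
        self.rbuf = b""

    def read(self, n):
        self._write_out()              # rule 1: write pending output first
        while len(self.rbuf) < n and self.pos < len(self.file):
            chunk = bytes(self.file[self.pos:self.pos + self.read_chunk])
            self.pos += len(chunk)
            self.rbuf += chunk
        out, self.rbuf = self.rbuf[:n], self.rbuf[n:]
        return out

    def seek(self, offset):
        self._write_out()              # rule 2: write pending output first
        self.rbuf = b""                # read-ahead is simply dropped
        self.pos = offset

    def write(self, data):
        self._drop_input()             # rule 3: spoil buffered input
        self.wbuf += data

    def truncate(self, length):
        self._write_out()              # rule 4: write output, discard input
        self._drop_input()
        del self.file[length:]


ch = ToyChannel(b"hello world")
first = ch.read(5)    # consumes "hello", with " wo" left in the read buffer
ch.write(b"_")        # read-ahead is dropped, so "_" lands right after "hello"
rest = ch.read(5)     # pending "_" is written out before the read proceeds
```

In this model the two buffers are never non-empty at the same time,
which is why no extra "last operation" flag is needed to decide whether
it is safe to read or to write.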

There are no "additional bits of state" involved, because the
"safe to read" and "safe to write" states can be inferred merely
from the existence of buffered data.

You'll also notice that nowhere is this proposing to use the
'flush' facility of the operating system (unless the program
performs an explicit [flush]), so in no case should the operating
system's buffer cache be spoilt unless the program asks for it
explicitly.

You will also notice that this discussion is treating read and
write buffers separately. The -buffering option, by the way,
affects only write buffers. It controls whether a kernel-level
write is requested on every [puts] (-buffering none), on every
newline character (-buffering line), or only when a buffer fills
(-buffering full).
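
As a rough illustration of those three modes, here is a small Python
sketch of the decision made after each [puts]. The function name
must_write_now and the capacity default are hypothetical, and the
assumption that line mode also writes when the buffer fills belongs to
this sketch rather than to the description above.

```python
def must_write_now(mode, pending, capacity=4096):
    """Return True if buffered output should be handed to the OS now.

    mode     -- "none", "line" or "full", mirroring the -buffering values
    pending  -- bytes currently sitting in the write buffer
    capacity -- hypothetical buffer size in bytes
    """
    if mode == "none":
        return len(pending) > 0
    if mode == "line":
        # Assumption: a full buffer forces a write even without a newline.
        return b"\n" in pending or len(pending) >= capacity
    if mode == "full":
        return len(pending) >= capacity
    raise ValueError(f"unknown -buffering mode: {mode}")


# The same unterminated output under each of the three modes.
decisions = {m: must_write_now(m, b"partial line")
             for m in ("none", "line", "full")}
```

Only the "none" entry comes out true here; "line" waits for the
newline and "full" waits for the buffer to fill.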

I do concede that the current behaviour is documented. We've
made the mistake of documenting bugs before.  Removing the
need for a workaround is not "coddling novice programmers."
No language makes it the least bit difficult to write incorrect
programs. But we do try as far as we can to make it easier
to write correct ones.

In short:
   - It's a bug.
   - Alex asked, out of an abundance of caution, whether any
     programs depend on the current behaviour.
   - We'll fix it.
   - The fix will have negligible performance impact.
   - It doesn't deserve nearly the amount of public debate that
     there's been.

-- 
73 de ke9tv/2, Kevin
