From: Poor Y. <org...@po...> - 2023-01-03 01:40:28
On 2022-12-31 13:29, Schelte Bron wrote:
> On 30/12/2022 23:13, Poor Yorick wrote:
>> Someone who wanted to pinpoint an encoding error encountered using
>> [gets] could then switch to [read] for that purpose, picking up where
>> [gets] logically left off.
>
> If I understand correctly, the code would need to look something like
> this:
>
> set fd [open strictencoding.txt]
> fconfigure $fd -encoding utf-8 -strictencoding 1
> try {
>     set linenum 1
>     while {[gets $fd line] >= 0} {
>         puts $line
>         incr linenum
>     }
> } trap {POSIX EILSEQ} {err info} {
>     catch {read $fd} err info
>     set charnum [expr {[string length [dict get $info -result]] + 1}]
>     puts stderr "$err at line $linenum, character $charnum"
> }
> close $fd
>
> Running this with Ashok's example data (a\nb\xc0\nc\n) in
> strictencoding.txt should report the error is at line 2, character 2.
> It doesn't. It says line 2, character 1. That's because [dict get $info
> -result] returns "". Not "b" as I expected.
>
>
> Schelte.
>

Rolf reported this issue regarding [gets] hanging indefinitely:

https://core.tcl-lang.org/tcl/info/154ed7ce56

On the "trunk-encodingdefaultstrict" branch I've fixed that issue:

https://core.tcl-lang.org/tcl/info/003c9e1f2e53312b

As of that commit, the example you posted works as you described:

    set chan [open test22.data wb]
    puts -nonewline $chan a\nb\xc0\nc\n
    close $chan
    set fd [open test22.data]
    fconfigure $fd -encoding utf-8 -encodingstrict 1
    try {
        set linenum 1
        while {[gets $fd line] >= 0} {
            puts $line
            incr linenum
        }
    } trap {POSIX EILSEQ} {err info} {
        catch {read $fd} err info
        set charnum [expr {[string length [dict get $info -result]] + 1}]
        puts stderr "$err at line $linenum, character $charnum"
    }
    close $fd

The output is:

    a
    error reading "file*": illegal byte sequence at line 2, character 2

Because "-nocomplain" has also been eliminated on that branch, the
option "-strictencoding" has been changed to "-encodingstrict".

On that branch I am currently working on fixing every encoding/decoding
behaviour that is less than ideal. Any other illustrations of undesired
or desired behaviour on that branch are very welcome.

--
Yorick