I finally traced what I believe to be a Jacl bug while working on my 'hyde'
package the last few days. I had a situation where my code worked fine on
one machine but not on another: same Linux OS release, same JDK!
My hyde package reads Java *.class files after an on-the-fly compilation,
and writes out the byte codes as a hex string, converted by Tcl's 'binary'
command for subsequent loading in the JVM with java::defineclass.
It seems that Jacl/Java file I/O ignores
fconfigure $fd -encoding binary -translation binary
Instead, it appears to use the value of env(file.encoding) to decode the
raw bytes on an input stream. (The file.encoding property is presumably set
by the JVM somewhere.)
Here's my test program, fc.tcl:
----------------------------------------------------------------------------
set in [lindex $argv 0]
set out [lindex $argv 1]
set fd [open $in]
fconfigure $fd -encoding binary -translation binary -eofchar {}
set bytes [read $fd]
close $fd
set fd [open $out w]
fconfigure $fd -encoding binary -translation binary -eofchar {}
puts -nonewline $fd $bytes
close $fd
----------------------------------------------------------------------------
Jacl was built from yesterday's CVS checkout.
My test data is generated from /dev/random:
$ dd if=/dev/random of=rand1 bs=1 count=1000
Now, try a binary copy of rand1:
$ jaclsh fc.tcl rand1 rand2
$ ls -al rand1 rand2
-rw-rw-r-- 1 tpoindex tpoindex 1000 Jul 24 22:09 rand1
-rw-rw-r-- 1 tpoindex tpoindex 1000 Jul 24 22:14 rand2
But 'sum' shows that the bytes have been changed:
$ sum rand1 rand2
14321 1 rand1
29958 1 rand2
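Since 'sum' only says that the files differ, a deterministic input may help
narrow down which byte values get altered. Below is a minimal sketch, not
something I have run against Jacl: it builds a file holding each byte value
exactly once, copies it, and lists the differences with 'cmp -l'. The names
all256, all256.copy and the COPY variable are made up for illustration; COPY
defaults to plain 'cp' so the script runs anywhere, and would be set to
"jaclsh fc.tcl" to exercise the suspect code path.

```shell
# Build a 256-byte file containing each byte value 0x00..0xFF exactly once.
: > all256
i=0
while [ "$i" -lt 256 ]; do
    # The inner printf yields the 3-digit octal code; the outer one emits that byte.
    printf "\\$(printf '%03o' "$i")" >> all256
    i=$((i + 1))
done

# Copy the file. COPY is a hypothetical knob: it defaults to cp, and
# COPY="jaclsh fc.tcl" would run the copy through the test program instead.
${COPY:-cp} all256 all256.copy

# cmp -l prints one line per differing byte: the 1-based offset in decimal,
# then the two byte values in octal. No output means the copy is identical.
cmp -l all256 all256.copy || true
```

With this input, the offset minus one is the value of the original byte, so
any lines printed by 'cmp -l' directly name the byte values that were changed.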
The culprit is /etc/sysconfig/i18n, which is where RedHat 7.3 stores locale
info:
LANG="en_US.iso885915"
SUPPORTED="en_US.iso885915:en_US:en"
SYSFONT="lat0-sun16"
SYSFONTACM="iso15"
This file is sourced by bash on login and those variables are exported.
After setting LANG=en_US, or just unsetting LANG, jaclsh reports
env(file.encoding) as 'ISO-8859-1'.
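For anyone wanting to try the same thing without editing
/etc/sysconfig/i18n, the locale can be changed for a single command only.
This is a small sketch of the two variants; the 'sh -c' child just echoes
what it inherits and stands in for jaclsh. Note that 'env -u' is not in
POSIX, though GNU and BSD env both have it, and that LC_ALL or LC_CTYPE, if
set, take precedence over LANG.

```shell
# Override LANG for one child process; the login shell's value is untouched.
LANG=en_US sh -c 'echo "child LANG=$LANG"'

# Remove LANG from the child's environment entirely.
env -u LANG sh -c 'echo "child LANG=${LANG:-unset}"'

# The same prefixes would apply to the test program, for example:
#   LANG=en_US jaclsh fc.tcl rand1 rand3
```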
Running fc.tcl again:
$ jaclsh fc.tcl rand1 rand3
$ ls -al rand1 rand3
-rw-rw-r-- 1 tpoindex tpoindex 1000 Jul 24 22:09 rand1
-rw-rw-r-- 1 tpoindex tpoindex 1000 Jul 24 22:38 rand3
$ sum rand1 rand3
14321 1 rand1
14321 1 rand3
Ah ha! That did the trick!
I'm not familiar enough with src/jacl/tcl/lang/Channel.java, FileChannel.java,
Encoding.java, Fconfigure.java, TclIO.java etc. to spot the problem.
I also tried using the 'encoding' command to 'undo' the coding on read, but
didn't have any luck; it's quite possible I am using 'encoding' incorrectly.
Clues?
--
Tom Poindexter
tpo...@ny...
http://www.nyx.net/~tpoindex/