I think I found a bug in Tcl that prevents a tclkit from
finding and using the proper system encoding in a
non-English locale.
Of course, it is not a bug for Tcl itself, because this
function (TclpSetInitialEncodings) is really called just once
on startup (and does its job).
But tclkit has to call it twice: on startup, and again after
mounting the VFS which contains the encoding files.
A patch to unix/tclUnixInit.c:
-----------------PATCH---------------------
--- tclUnixInit.c Thu May 12 20:18:51 2005
+++ tclUnixInit.c.new Thu May 12 20:14:03 2005
@@ -485,7 +485,8 @@
void
TclpSetInitialEncodings()
{
- if (libraryPathEncodingFixed == 0) {
+
+/* if (libraryPathEncodingFixed == 0) { */
CONST char *encoding = NULL;
int i, setSysEncCode = TCL_ERROR;
Tcl_Obj *pathPtr;
@@ -647,6 +648,7 @@
* dependent behavior.
*/
+ if (libraryPathEncodingFixed == 0) {
setlocale(LC_NUMERIC, "C");
/*
-----------------PATCH---------------------
This problem really needs to be fixed: it makes tclkit
practically unusable for non-English users. And I don't
see how Tcl would suffer from it.
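To make the failure mode concrete, here is a minimal, self-contained C sketch of a once-only guard like the one the patch moves. It is not Tcl source: every name in it (`try_set_encoding`, `set_initial_encodings_guarded`, `set_initial_encodings_patched`, `run_startup`, and the three state variables) is a hypothetical stand-in, and the real TclpSetInitialEncodings does considerably more. The sketch only assumes what the report states: the first call happens before the encoding files are reachable, the second call happens after the VFS is mounted.

```c
#include <string.h>

/* Hypothetical process state; none of these names exist in Tcl. */
static int encodings_available = 0;  /* becomes 1 once the VFS is mounted */
static int already_initialized = 0;  /* plays the role of the once-only guard */
static char system_encoding[32] = "identity";

/* Switch to the named encoding; fails while no encoding files are reachable. */
static int try_set_encoding(const char *name)
{
    if (!encodings_available) {
        return -1;
    }
    strncpy(system_encoding, name, sizeof(system_encoding) - 1);
    system_encoding[sizeof(system_encoding) - 1] = '\0';
    return 0;
}

/* Guarded variant: detection itself sits inside the once-only block. */
void set_initial_encodings_guarded(const char *wanted)
{
    if (already_initialized == 0) {
        (void) try_set_encoding(wanted);
        already_initialized = 1;
    }
}

/* Patched variant: detection runs on every call, only one-time work is guarded. */
void set_initial_encodings_patched(const char *wanted)
{
    (void) try_set_encoding(wanted);
    if (already_initialized == 0) {
        already_initialized = 1;  /* one-time setup would go here */
    }
}

/* Simulate: call at startup, mount the VFS, call again. Returns the result. */
const char *run_startup(void (*init)(const char *), const char *wanted)
{
    encodings_available = 0;
    already_initialized = 0;
    strcpy(system_encoding, "identity");

    init(wanted);             /* startup: no encoding files yet */
    encodings_available = 1;  /* VFS with the encoding files is now mounted */
    init(wanted);             /* second call made by the tclkit startup code */
    return system_encoding;
}
```

Calling `run_startup(set_initial_encodings_guarded, "koi8-r")` leaves the simulated encoding at "identity", because the second call is skipped entirely; with `set_initial_encodings_patched` the second call succeeds and the result is "koi8-r".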
Logged In: YES
user_id=80530
Just want to confirm that this
report is against release 8.4.9
of Tcl?
And it does not pertain to
the heavily modified initialization
routines in Tcl 8.5a2 ?
Is this report in any way a new bug?
Why is it arising only now?
Logged In: YES
user_id=1272045
Yes, it's really against 8.4.9.
This bug was mentioned before (look in the tclkit developer's forum
archives), but no solution was found at the time.
I haven't looked at the 8.5a2 sources yet.
Logged In: YES
user_id=1983
Don, unfortunately I'm not really the person to ask for comments on this.
Perhaps Vince or Jeff can comment. I have never grasped the intricacies of
Tcl's init sequence.
With that out of the way... if avoiding double init is merely an optimization,
then it seems to me that the above change would be ok. It essentially lets
some of the code run on every call to TclpSetInitialEncodings. Might also be a
hint about splitting this in two separate functions. This code only affects the
Unix builds, I don't know how Windows differs.
What Tclkit needs has always been the same: set up a Tcl interp without
encodings at hand to let it run a bit of plain ASCII Tcl, then tell the system
about encoding files which now are available.
Sorry I can't be of much help.
-jcw
Logged In: YES
user_id=80530
Sorry, jcw, I didn't ask the Q clearly enough.
I'm not looking for an approving opinion
on the proposed patch; I'm looking for the
bug report, and some background and
history.
Yaroslav says the description of the problem
is in the archives of a "tclkit developer's forum" (?)
Can anyone give me a more direct pointer than that?
Rather than just accept a contributed patch,
I'd like to understand the problem being solved.
Also with a clearer understanding of the
problem faced, I'll have a better idea whether
it's already fixed in Tcl 8.5 development.
Logged In: YES
user_id=1272045
Sorry, I misled you (and forgot to describe the problem).
This is part of the original letter (not from the tclkit forum,
sorry again):
-------------------------
Sergey Vlasov vsu at altlinux.ru
Sat Apr 16 18:32:59 CEST 2005
Does anyone know how to make tclkit find and use the proper
system encoding in a non-English locale?
I have tried tclkit-linux-x86-8.4.9 with LANG=ru_RU.KOI8-R,
and also the 8.4.9 Win32 version under Windows 98, and
both have the same problem: even if I rebuild tclkit with all
encoding files (as described at
http://www.equi4.com/tkunicode.html), I still get:
$ tclkit
% encoding system
iso8859-1
(the Win32 version gives cp1252, which is also bad -
the real system encoding in that case is cp1251).
When a Tk application is launched in such an environment (I tried
Notebook - http://notebook.wjduquette.com/), it has major problems:
all keyboard input is assumed to be in the broken system
encoding, therefore I get iso8859-1 accented letters instead
of Cyrillic characters. Obviously, this makes starkits
unusable.
I tried the newer tclkit version (8.5.a2) on Linux
(http://www.equi4.com/pub/tk/8.5a2/tclkit-linux-x86.gz), and
with LANG=ru_RU.KOI8-R it does not start at all:
$ tclkit-linux-x86-8.5a2
system encoding "
zsh: abort tclkit-linux-x86-8.5a2
If I execute "encoding system koi8-r" (after either adding the
appropriate encoding files to tclkit, or copying them as described in
http://wiki.tcl.tk/10382), then it seems to work (I tried to insert
this statement into notebook2.1.1.vfs/lib/app-notebook/notebook.tcl
after copying of the encoding files, and such hacked Notebook can
handle Cyrillic characters properly). But obviously hardcoding the
encoding name is not acceptable - the encoding should be
determined automatically, like the "real" Tcl does it.
Looking at tclUnixInit.c:TclpSetInitialEncodings(), I see that Tcl
tries several methods to detect the system encoding and uses the
first encoding for which Tcl_SetSystemEncoding() succeeds.
However, at this point the encoding files stored inside tclkit are
not yet available (because vfs is not initialized), therefore all
calls to Tcl_SetSystemEncoding() fail, and system encoding is
left set to "identity".
-------------------------------------------------
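The "first encoding that succeeds" behaviour described in the letter can be sketched in a few lines of C. This is an illustration only, not the code in tclUnixInit.c: `pick_system_encoding`, `encoding_loadable_fn`, `nothing_loadable`, and `only_koi8r_loadable` are hypothetical names, and the predicate merely stands in for whether Tcl_SetSystemEncoding() would succeed for a given name.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical predicate: nonzero if the named encoding can be loaded. */
typedef int (*encoding_loadable_fn)(const char *name);

/*
 * Return the first candidate that the predicate accepts, or the fallback
 * name when every candidate is rejected.
 */
const char *pick_system_encoding(const char *const *candidates, size_t count,
                                 encoding_loadable_fn loadable,
                                 const char *fallback)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (candidates[i] != NULL && loadable(candidates[i])) {
            return candidates[i];
        }
    }
    return fallback;
}

/* Before the VFS is mounted: no encoding file can be read at all. */
int nothing_loadable(const char *name)
{
    (void) name;
    return 0;
}

/* After the VFS is mounted: pretend koi8-r.enc is the only file present. */
int only_koi8r_loadable(const char *name)
{
    return strcmp(name, "koi8-r") == 0;
}
```

With the candidate list {"koi8-r", "iso8859-1"} and the fallback "identity", the `nothing_loadable` predicate yields "identity", which matches the state the letter describes at startup, while `only_koi8r_loadable` yields "koi8-r".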
My additions to it:
On Windows, tclkit fails because of its own problems in working
with VFS. This is from my bug report to them:
----------------------------
On Windows (and in general), there is a bug in boot.tcl in tclkit:
vfs::filesystem unmount $noe
Example in russian locale on Win98 (cp1251):
noe: D:/TCLKIT/русский/TCLKIT-WIN32-NEWEST.EXE
::vfs::filesystem info:
d:/tclkit/русский/tclkit-win32-newest.exe
Case differs, so it says: "no such mount" and crashes
(on Win98 only).
Replaced with:
vfs::filesystem unmount [::vfs::filesystem info]
With that change, tclkit-win32-newest.exe with encodings added as
described in tkunicode.html on the site works ok.
----------------------------
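The "no such mount" failure above comes from comparing two spellings of the same path that differ only in letter case. As an illustration of that mismatch (and not the fix used in boot.tcl, which passes the result of [::vfs::filesystem info] straight back to unmount), here is a small C sketch of a comparison that ignores case for ASCII letters only. The function name `paths_equal_ascii_nocase` is hypothetical, and leaving non-ASCII bytes untouched is a simplifying assumption; real case folding for Cyrillic names would need encoding-aware handling.

```c
#include <ctype.h>

/*
 * Compare two path strings, ignoring case for ASCII letters only.
 * Bytes >= 0x80 (for example the encoded Cyrillic directory name) must
 * match exactly. Returns 1 when the paths are considered equal, else 0.
 */
int paths_equal_ascii_nocase(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        unsigned char ca = (unsigned char) *a;
        unsigned char cb = (unsigned char) *b;

        if (ca < 0x80 && cb < 0x80) {
            if (tolower(ca) != tolower(cb)) {
                return 0;
            }
        } else if (ca != cb) {
            return 0;
        }
        a++;
        b++;
    }
    /* Equal only if both strings ended at the same position. */
    return *a == '\0' && *b == '\0';
}
```

Under this comparison the two spellings from the report, D:/TCLKIT/русский/TCLKIT-WIN32-NEWEST.EXE and d:/tclkit/русский/tclkit-win32-newest.exe, count as the same mount point, whereas a plain byte-for-byte comparison treats them as different.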
So, there is no bug in Tcl itself on Windows. But on Unix,
when tclkit tries to set the system encoding a second time (with
the encodings loaded), this function just doesn't do it, as I
described before. So the system encoding just gets "fixed" by the
tclkit startup code from "identity" to iso8859-1, which is not correct.
Logged In: YES
user_id=80530
thanks for the details.
part of that message reports
that a Tclkit built on
Tcl8.5a2 doesn't work at all.
Can we construct a Tclkit
from the current CVS sources
of both Tcl and Tclkit, and try
that again? An official Tcl 8.5a3
release should be out in a few weeks,
and it would be best to find out now
if that combination will be broken.
Logged In: YES
user_id=80530
I don't follow this part of the report:
"I tried the newer tclkit version (8.5.a2) on Linux
... it does not start at all ...
If I execute "encoding system koi8-r" ...
then it seems to work."
If the program will not start, then what does
it mean to "execute 'encoding system koi8-r'" ?
Logged In: YES
user_id=80530
In Tcl 8.5a3, the TclpSetInitialEncodings
routine no longer suffers from the limitation
reported. If Tclkit sources get updated to
use the new command
[::tcl::unsupported::EncodingDirs]
or the corresponding private C routine
TclSetEncodingSearchPath
before the second call to TclpSetInitialEncodings...
...AND if the encoding files are actually
in the directory to be found...
...then this issue should be solved
with 8.5a3 as a base. A successful
test with the next Tcl release would be
a very good thing, and would be good
motivation to push these "unsupported"
and "private" interfaces public.
I still need to look into what can be
done for Tcl 8.4.10, if anything.
Logged In: YES
user_id=1983
When I follow the tclkit build instructions at http://www.equi4.com/218
(replacing genkit by genkit85 everywhere, and tars by tars85), I get the
following error while in the first compile (tclsh genkit85 B tcl):
gcc -pipe -c -O2 -Wall -Wno-implicit-int -fPIC -I. -I../../../src/tcl/unix
-I../../../src/tcl/unix/../generic -DTCL_TOMMATH
-I../../../src/tcl/unix/../libtommath -DPACKAGE_NAME=\"tcl\"
-DPACKAGE_TARNAME=\"tcl\" -DPACKAGE_VERSION=\"8.5\"
-DPACKAGE_STRING=\"tcl\ 8.5\" -DPACKAGE_BUGREPORT=\"\" -DSTDC_HEADERS=1
-DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1
-DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1
-DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1
-DHAVE_LIMITS_H=1 -DHAVE_SYS_PARAM_H=1
-DTCL_CFGVAL_ENCODING=\"iso8859-1\" -DSTATIC_BUILD=1
-DPEEK_XCLOSEIM=1 -DTCL_SHLIB_EXT=\".so\"
-DTCL_CFG_OPTIMIZED=1 -DTCL_CFG_DEBUG=1
-D_LARGEFILE64_SOURCE=1 -DTCL_WIDE_INT_TYPE=long\ long
-DHAVE_STRUCT_STAT64=1 -DHAVE_OPEN64=1 -DHAVE_LSEEK64=1
-DHAVE_TYPE_OFF64_T=1 -DHAVE_GETCWD=1 -DHAVE_OPENDIR=1
-DHAVE_STRTOL=1 -DHAVE_STRTOLL=1 -DHAVE_STRTOULL=1
-DHAVE_TMPNAM=1 -DHAVE_WAITPID=1 -DUSE_TERMIOS=1
-DHAVE_SYS_TIME_H=1 -DTIME_WITH_SYS_TIME=1
-DHAVE_STRUCT_TM_TM_ZONE=1 -DHAVE_TM_ZONE=1
-DHAVE_GMTIME_R=1 -DHAVE_LOCALTIME_R=1 -DHAVE_MKTIME=1
-DHAVE_TM_GMTOFF=1 -DHAVE_TIMEZONE_VAR=1
-DHAVE_STRUCT_STAT_ST_BLKSIZE=1 -DHAVE_ST_BLKSIZE=1
-DHAVE_SIGNED_CHAR=1 -DHAVE_LANGINFO=1
-DHAVE_SYS_IOCTL_H=1 -DTCL_UNLOAD_DLLS=1
../../../src/tcl/unix/../generic/tclObj.c
In file included from ../../../src/tcl/generic/tclObj.c:19:
../../../src/tcl/generic/tommath.h:31:27: tommath_class.h: No such file or directory
make: *** [tclObj.o] Error 1
I did a "cvs update" in all src/* directories after the "tclsh genkit A" step, which
fetched everything.
Has something changed? Do I need to get something else? I can attach the
build transcript if needed.
Logged In: YES
user_id=80530
Yes, the "tcl" cvs module has a
new "libtommath" submodule and
CVS limitations make the transition
a bit non-trivial.
Simplest thing to do is just get a
completely fresh tcl checkout:
cvs -d .... checkout tcl
and notice the new subdirectory
tcl/libtommath .
If that's unattractive (maybe your
existing checkout has mods you
don't want to lose?), then we can
do something else a bit more
involved.
Logged In: YES
user_id=80530
Here's a patch against
the core-8-4-branch of
development. Ought to apply
to Tcl 8.4.9 as well.
Please test to see whether
it addresses the reported
problem.
Logged In: YES
user_id=1983
Ok, thx. A trial build for Linux is at http://www.equi4.com/tclkit85try.gz - it has
the following dynlib dependencies:
$ ldd tclkit-dellie
linux-gate.so.1 => (0xffffe000)
libdl.so.2 => /lib/libdl.so.2 (0xb7fde000)
libstdc++.so.5 => /usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.5/libstdc++.so.5 (0xb7f24000)
libm.so.6 => /lib/libm.so.6 (0xb7f02000)
libgcc_s.so.1 => /usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.5/libgcc_s.so.1 (0xb7ef9000)
libc.so.6 => /lib/libc.so.6 (0xb7de4000)
/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0xb7fea000)
$
The CVS checkout was anonymous, not an SF dev checkout so it may lag a
couple of hours. Oh, and the embedded Tcl runtime scripts are not updated
from 8.5a2 (genkit just slaps a fixed starkit onto the end of the exe).
I can't do any testing right now, but please let me know if I need to tweak
things and rebuild.
Logged In: YES
user_id=80530
any luck testing this? 8.5a3 release
is coming quickly.
Logged In: YES
user_id=80530
I received no reports about testing
the patch, so it will not be part
of Tcl 8.4.10 release.
Logged In: YES
user_id=80530
working on the 8.4.11 release now...
Can't anyone comment on whether
the attached patch does any good
for solving the reported problem?
I'm not willing to change the
"stable" branch of Tcl on this point
without a minimum endorsement
of effectiveness.
Logged In: YES
user_id=80530
dgp patch is there; just needs testing.
[21:31] dgp that's an 8.4 branch matter though.
[21:31] stevel that would be a good one to assign to Andreas
Kupries
Logged In: YES
user_id=1272045
This is my answer to the letter from Andreas
Kupries (it looks like he didn't receive it).
> Attached to the bug is a patch claiming to fix the
> problem. I have now created two sets of tclkits, one
> with and one without this patch. The kits are built
> for Linux/Intel.
Hello, Andreas.
I've tried your kits. They don't fix the problem.
Transcript:
% encoding system
iso8859-1
So, I want to repeat my patch (it is in original report) to
unix/tclUnixInit.c:
-----------------PATCH---------------------
--- tclUnixInit.c Thu May 12 20:18:51 2005
+++ tclUnixInit.c.new Thu May 12 20:14:03 2005
@@ -485,7 +485,8 @@
void
TclpSetInitialEncodings()
{
- if (libraryPathEncodingFixed == 0) {
+
+/* if (libraryPathEncodingFixed == 0) { */
CONST char *encoding = NULL;
int i, setSysEncCode = TCL_ERROR;
Tcl_Obj *pathPtr;
@@ -647,6 +648,7 @@
* dependent behavior.
*/
+ if (libraryPathEncodingFixed == 0) {
setlocale(LC_NUMERIC, "C");
/*
-----------------PATCH---------------------
Tclkit I've built with it really fixes the problem.
Transcript:
% encoding system
koi8-r
Why don't you accept it? If you look into the Windows
implementation of TclpSetInitialEncodings, you'll see
almost the same code I'm offering. What's wrong with it?
Logged In: YES
user_id=75003
Oh, I have received it, it being the mail. However the
accusatory undertone I found regarding my choice of patch
has me thoroughly demotivated to continue work on this,
derailing my initial plan of going through all the patches
here, from last to first, to see which of them fix the
problem and which don't. I also decided not to answer the
mail until I had cooled down and was able to be polite despite
this. However now that this private communication is made
public I feel compelled to answer, even if with anger in my
heart.
Logged In: YES
user_id=80530
Yaroslav, let's figure out a time
we can work together on this.
With testing feedback from you,
I'm sure we can resolve this.
What platform are you testing on?
Logged In: YES
user_id=1272045
I've tested on:
$ uname -a
Linux NOSOR 2.6.3-7mdk #1 Wed Mar 17 15:56:42 CET 2004 i686
unknown unknown GNU/Linux
And:
$ uname -a
Linux debian 2.4.27-2-686 #1 Mon May 16 17:03:22 JST 2005
i686 GNU/Linux
Logged In: YES
user_id=80530
ok, I went through the exercise of
building a Tclkit and it seems
the root of the problem is that the
koi8-r.enc file is not included
in a standard Tclkit. How are you
dealing with that?