Non-ASCII characters in a non-Western character environment
Status: Alpha
Brought to you by:
btheado
I use tkoutline versions 0.92 and 0.93 on a Windows XP
machine with 1253 as the default charset. I have a problem
with non-ASCII chars (e.g. chars with 1252 diacritics):
although I can see them on the display, they turn into
question marks (?) as soon as I save, close and re-open the
file.
What can I do about this problem?
Thanks in advance,
rosana hugenau
Logged In: YES
user_id=6544
Sorry for such a delayed response (8 months!). No one has
used SourceForge to file a bug report for tkoutline before
and I only just now noticed this report.
I don't understand charsets, encoding, etc. Could you
explain how I can insert characters with 1252 diacritics so
I can play with it?
Maybe the files that tkoutline writes are ascii-only and the
extra information is being lost when the files are written.
I see information here (http://wiki.tcl.tk/encoding) and
here (http://www.tcl.tk/man/tcl8.4/TclCmd/fconfigure.htm#M8)
that might be related, but I don't really understand it.
Logged In: YES
user_id=6544
Ok. I figured out that on Windows XP I can use the utility
called charmap to insert non-ascii characters into
tkoutline. Even better, I can run a command from within
tkoutline's console (hit <F2> key) like the following to
insert a non-ASCII character:
text insert insert \u045B
This inserts a character that is described as a
'Cyrillic small letter Tshe'. After this command, I can see
a glyph that looks like a lower case 'h' with a horizontal
bar through the upper portion of the stem. When I save the
file and re-open it, I see a question mark in place of the
Tshe glyph.
Here is some other information I found:
http://www.equi4.com/tkunicode.html
http://www.equi4.com/pipermail/starkit/2004-May/001995.html
Apparently, the way I package up tkoutline involves leaving
out some of Tcl's character encoding files. I tried
following the instructions in the second link, but I still
got question marks for the Tshe glyph. I don't know if I
just followed the instructions incorrectly, or if the glyph
I picked doesn't have an encoding even in the full Tcl
distribution.
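One way to check whether a packaged interpreter is missing
encodings is to list what Tcl has available, from the same
console. This is only a sketch: cp1253 is included because it
is the default charset named in the original report, and the
variable names are illustrative.

```tcl
# List every encoding this Tcl interpreter knows about.
set available [lsort [encoding names]]
puts $available

# Report whether a few encodings of interest are present.
foreach name {cp1252 cp1253 utf-8} {
    if {[lsearch -exact $available $name] >= 0} {
        puts "$name: available"
    } else {
        puts "$name: missing"
    }
}
```

If cp1253 shows up as missing, the stripped-down encoding
files are a likely suspect for the original report.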
So could you send the unicode number for the characters you
are trying, so I can test the instructions from the 2nd link
above, again?
Logged In: YES
user_id=6544
Here's something I tried that works for preserving the
non-ascii character I was playing with before.
Tcl has a command called "encoding system" which returns the
current default encoding. On my machine it returns
"cp1252". The same command can be used to change the
default encoding. Apparently not all encodings can
represent all Unicode characters. The Unicode character
\u045B from my earlier example is apparently one of those
characters that can't be represented by cp1252. So in order
to work around this, I opened tkoutline's console (hit <F2>)
and typed the following command:
encoding system utf-8
The "utf-8" encoding can represent all Unicode characters
(if I understand correctly). For sure, I know it can
represent \u045B, because when I insert that character, save
the file and re-open it, I get the correct symbol rather
than a question mark.
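The difference between the two encodings can be checked
without touching any files. The sketch below converts the
character to bytes and back with "encoding convertto" and
"encoding convertfrom". The result array is just an
illustrative name. It assumes a Tcl 8.x interpreter, where an
unrepresentable character is silently replaced; newer Tcl
versions may raise an error instead, which the catch handles.

```tcl
# U+045B, Cyrillic small letter tshe, as a Tcl string.
set ch "\u045B"

# Convert to bytes in each encoding and back, then compare.
foreach enc {cp1252 utf-8} {
    if {[catch {encoding convertto $enc $ch} bytes]} {
        # Encoding unavailable, or a strict interpreter refused.
        set result($enc) error
    } elseif {[encoding convertfrom $enc $bytes] eq $ch} {
        set result($enc) ok
    } else {
        # The character was replaced by something else on the way.
        set result($enc) lost
    }
    puts "$enc: $result($enc)"
}
```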
There may be undesirable side-effects in fixing the problem
this way. This stuff is all new to me, so I don't know.
If this does fix your problem, then you can just put the
line "encoding system utf-8" in your startup script and be
done with it.
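A narrower alternative, if changing the system encoding turns
out to have side effects, is to set the encoding only on the
file channel with "fconfigure -encoding". The sketch below
assumes nothing about how tkoutline actually opens its files:
saveText and loadText are hypothetical helper names, and
encoding-test.txt is a throwaway file.

```tcl
# Hypothetical helpers; tkoutline's real save/load code is not shown here.
proc saveText {path text} {
    set chan [open $path w]
    fconfigure $chan -encoding utf-8   ;# write this file as UTF-8
    puts -nonewline $chan $text
    close $chan
}

proc loadText {path} {
    set chan [open $path r]
    fconfigure $chan -encoding utf-8   ;# read it back as UTF-8
    set text [read $chan]
    close $chan
    return $text
}

# Round trip the test character through a temporary file.
set path [file join [pwd] encoding-test.txt]
saveText $path "tshe: \u045B"
set reloaded [loadText $path]
if {$reloaded eq "tshe: \u045B"} {
    puts "round trip ok"
} else {
    puts "round trip failed"
}
file delete $path
```

Files saved this way are UTF-8 regardless of what "encoding
system" returns, so they would need to be read back with the
same setting.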