
#1 non-ascii characters on a non-western character environment

open
nobody
None
5
2004-01-27
2004-01-27
No

I use tkoutline versions 0.92 and 0.93 on a Windows XP
machine with 1253 as the default charset. I have a problem
with non-ASCII chars (e.g. chars with 1252 diacritics):
although I can see them on the display, they turn into
question marks (?) as soon as I save, close and re-open the
file.

What can I do about this problem?

Thanks in advance,

rosana hugenau

Discussion

  • Brian Theado - 2004-09-04

    Sorry for such a delayed response (8 months!). No one has
    used SourceForge to file a bug report for tkoutline before,
    and I only just now noticed this report.

    I don't understand charsets, encoding, etc. Could you
    explain how I can insert characters with 1252 diacritics so
    I can play with it?

    Maybe the files that tkoutline writes are ascii-only and the
    extra information is being lost when the files are written.

    I see information here (http://wiki.tcl.tk/encoding) and
    here (http://www.tcl.tk/man/tcl8.4/TclCmd/fconfigure.htm#M8)
    that might be related, but I don't really understand it.
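
    For anyone following along, here is a minimal sketch of how the
    encoding in play could be inspected from the console. It assumes
    Tcl 8.x, where a newly opened file channel starts out with the
    system encoding. The scratch file name is made up for the
    demonstration and has nothing to do with tkoutline's own files.

    ```tcl
    # Hypothetical scratch file name, used only for this demonstration.
    set path [file join [pwd] encoding-check.tmp]
    set chan [open $path w]

    # With no -encoding set explicitly, the channel reports the encoding
    # it will use when text is written through it.
    set default [fconfigure $chan -encoding]
    puts "system encoding:  [encoding system]"
    puts "channel encoding: $default"

    close $chan
    file delete $path
    ```

    If both lines print something like cp1252 or cp1253, then any
    character outside that code page cannot survive being written to
    the file.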

     
  • Brian Theado - 2004-09-04

    OK. I figured out that on Windows XP I can use the utility
    called charmap to insert non-ASCII characters into
    tkoutline. Even better, I can run a command from within
    tkoutline's console (hit the <F2> key) like the following to
    insert a non-ASCII character:

    text insert insert \u045B

    This inserts a character that is described as a
    'Cyrillic small letter Tshe'. After this command, I can see
    a glyph that looks like a lower case 'h' with a horizontal
    bar through the upper portion of the stem. When I save the
    file and re-open it, I see a question mark in place of the
    Tshe glyph.

    Here is some other information I found:
    http://www.equi4.com/tkunicode.html
    http://www.equi4.com/pipermail/starkit/2004-May/001995.html

    Apparently, the way I package up tkoutline involves leaving
    out some of Tcl's character encoding files. I tried
    following the instructions in the second link, but I still
    got question marks for the Tshe glyph. I don't know if I
    just followed the instructions incorrectly, or if the glyph
    I picked doesn't have an encoding even in the full Tcl
    distribution.

    So could you send the Unicode numbers for the characters you
    are trying, so I can test the instructions from the second
    link above again?
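
    The question mark can also be reproduced without saving a file at
    all. This is a rough sketch, assuming Tcl 8.4 through 8.6 with the
    cp1252 encoding file available (Tcl 9 raises an error by default
    instead of substituting a character). The variable names are
    arbitrary.

    ```tcl
    set tshe \u045B   ;# Cyrillic small letter tshe

    # cp1252 has no byte for U+045B, so the converter substitutes its
    # fallback character.
    set lossy [encoding convertto cp1252 $tshe]
    puts "cp1252 result: $lossy"

    # UTF-8 can encode the character, and decoding the bytes gives the
    # original character back.
    set bytes [encoding convertto utf-8 $tshe]
    set restored [encoding convertfrom utf-8 $bytes]
    puts "utf-8 round trip intact: [expr {$restored eq $tshe}]"
    ```

    The same check with a different encoding name and code point should
    show whether a particular character can be stored in a particular
    code page.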

     
  • Brian Theado - 2004-09-04

    Here's something I tried that works for preserving the
    non-ascii character I was playing with before.

    Tcl has a command called "encoding system" which returns the
    current default encoding. On my machine it returns
    "cp1252". The same command can be used to change the
    default encoding. Apparently not all encodings can
    represent all Unicode characters. The Unicode character
    \u045B from my earlier example is apparently one of those
    characters that can't be represented by cp1252. So in order
    to work around this, I opened tkoutline's console (hit <F2>)
    and typed the following command:

    encoding system utf-8

    The "utf-8" encoding can represent all Unicode characters
    (if I understand correctly). For sure, I know it can
    represent \u045B, because when I insert that character, save
    the file and re-open it, I get the correct symbol rather
    than a question mark.

    There may be undesirable side-effects in fixing the problem
    this way. This stuff is all new to me, so I don't know.

    If this does fix your problem, then you can just put the
    line "encoding system utf-8" in your startup script and be
    done with it.
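
    A narrower alternative, in case changing the system encoding turns
    out to have side effects, would be to set the encoding only on the
    channel used for the outline file. The two procs below are
    hypothetical helpers written for illustration, not tkoutline's
    actual save and load code, and the file name is made up.

    ```tcl
    # Hypothetical helper: write text to a file as UTF-8.
    proc saveUtf8 {path text} {
        set chan [open $path w]
        fconfigure $chan -encoding utf-8   ;# applies to this channel only
        puts -nonewline $chan $text
        close $chan
    }

    # Hypothetical helper: read a UTF-8 file back into a Tcl string.
    proc loadUtf8 {path} {
        set chan [open $path r]
        fconfigure $chan -encoding utf-8
        set text [read $chan]
        close $chan
        return $text
    }

    # Round trip through a scratch file (made-up name).
    set path [file join [pwd] utf8-roundtrip.tmp]
    saveUtf8 $path "tshe: \u045B"
    set reloaded [loadUtf8 $path]
    file delete $path
    ```

    Files written this way stay readable regardless of the machine's
    default code page, but files saved earlier in cp1252 or cp1253
    would need to be read with their original encoding.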

     
