#1302 tcl8.2.2 can not handle iso2022-jp strings

Milestone: obsolete: 8.2.2
Status: closed-fixed
Priority: 7
Updated: 2002-03-04
Created: 2000-10-26
Creator: Anonymous
Private: No

OriginalBugID: 3670 Bug
Version: 8.2.2
SubmitDate: '1999-11-23'
LastModified: '1999-12-09'
Severity: CRIT
Status: UnAssn
Submitter: techsupp
ChangedBy: hobbs
OS: BSD
FixedDate: '2000-10-25'
ClosedDate: '2000-10-25'

Name:
Taguchi,Takeshi

ReproducibleScript:
encoding system iso2022-jp
fconfigure stdin -encoding iso2022-jp
fconfigure stdout -encoding iso2022-jp
# Ok, I think we can input iso2022-jp string ...
set a {Some_ISO2022-JP_String}
EscapeToUtfProc: Invalid sub table
Abort trap (core dumped)

ObservedBehavior:
(gdb) where
#0 0x2812b4c4 in kill () from /usr/lib/libc.so.3
#1 0x2815f93f in abort () from /usr/lib/libc.so.3
#2 0x280aec5e in Tcl_PanicVA () from /usr/local/lib/libtcl8.2.so
#3 0x280aec84 in Tcl_Panic () from /usr/local/lib/libtcl8.2.so
#4 0x280936e5 in Tcl_FindExecutable () from /usr/local/lib/libtcl8.2.so
#5 0x2809327d in Tcl_FindExecutable () from /usr/local/lib/libtcl8.2.so
#6 0x28091d7b in Tcl_ExternalToUtf () from /usr/local/lib/libtcl8.2.so
#7 0x280a2233 in Tcl_GetsObj () from /usr/local/lib/libtcl8.2.so
#8 0x280a1da4 in Tcl_GetsObj () from /usr/local/lib/libtcl8.2.so
#9 0x280aa6c1 in Tcl_Main () from /usr/local/lib/libtcl8.2.so
#10 0x8048529 in main (argc=1, argv=0xbfbfd9f0) at ./../unix/tclAppInit.c:83
#11 0x80484a5 in _start ()

DesiredBehavior:
In interactive mode, I want to be able to input multibyte strings in the system encoding.
I believe Tcl 8.2 should be able to do this.

Discussion

  • Don Porter

    Don Porter - 2001-04-05
    • labels: 104244 --> 10. Objects
     
  • Donal K. Fellows

    • assigned_to: nobody --> hobbs
     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2001-10-12

    Logged In: YES
    user_id=72656

    We would need to know exactly what the string was that
    caused the problem to be able to reproduce it.

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2001-10-12
    • status: open --> pending
     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2001-10-14
     
    Attachments
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2001-10-14
    • status: pending --> open
     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2001-10-14

    Logged In: YES
    user_id=72656

    This was confirmed by taguchi at tohoku.iij.ad.jp to still
    crash on his configuration, but I cannot repeat it. The
    attached script is supposed to crash.

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2001-10-18

    Logged In: YES
    user_id=72656

    This is now confirmed with the latest script from Taguchi:

    ----8<----8<----8<----8<----
    #!/usr/local/bin/tclsh8.4
    encoding system iso2022-jp
    set a "\u4e4e\u4e5e\u4e5f"; # String with 3 Kanji chars
    puts $a
    ----8<----8<----8<----8<----

    That needs to be run as a script to trigger the bug.
    Occurs in 8.3.4cvs and 8.4a4cvs.

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2001-10-18
    • priority: 5 --> 7
     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2001-10-18

    Logged In: YES
    user_id=72656

    OK, if I translate this right, this can be massaged like so:

    set a "\u4e4e\u4e5e\u4e5f"
    set b [encoding convertto iso2022-jp $a]

    This makes b == "\x1b(B\x1b$@8C8pLi"

    Looking at iso2022-jp.enc, that means b starts with the
    escape sequence that selects iso8859-1, immediately followed
    by the one that selects jis0208. This is confirmed when I get
    the same chars in tkcon on Windows by just getting the value of:

    encoding convertfrom jis0208 8C8pLi

    I'm now trying to figure out why the escape driven encoding
    doesn't work right...
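
To make the byte layout above easier to see, here is a minimal Tcl sketch that splits such a string at its escape sequences. The `escapes` table and the `splitEscapes` proc are hypothetical helpers written for this illustration. The table holds only the two sequences mentioned above, not the full contents of iso2022-jp.enc, and the code assumes Tcl 8.4 or later for `eq`/`ne`.

```tcl
# Hypothetical escape table, modelled on the two sequences discussed above.
# The real iso2022-jp.enc lists more entries; this is only an illustration.
set escapes [list "\x1b(B" iso8859-1 "\x1b\$@" jis0208]

# Split a byte string into a list of {sub-encoding payload} pairs.
proc splitEscapes {bytes escapes} {
    set segments {}
    set current iso8859-1   ;# assumed initial sub-encoding
    set payload ""
    set i 0
    set n [string length $bytes]
    while {$i < $n} {
        set matched 0
        foreach {seq name} $escapes {
            set len [string length $seq]
            if {[string range $bytes $i [expr {$i + $len - 1}]] eq $seq} {
                # An escape sequence closes the current segment, if any.
                if {$payload ne ""} {
                    lappend segments [list $current $payload]
                }
                set current $name
                set payload ""
                incr i $len
                set matched 1
                break
            }
        }
        if {!$matched} {
            append payload [string index $bytes $i]
            incr i
        }
    }
    if {$payload ne ""} {
        lappend segments [list $current $payload]
    }
    return $segments
}

puts [splitEscapes "\x1b(B\x1b\$@8C8pLi" $escapes]
```

For the value of b above this prints `{jis0208 8C8pLi}`: the iso8859-1 segment is empty, so only the jis0208 payload remains.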

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2001-10-18

    Logged In: YES
    user_id=72656

    The problem seems to be in a recursive need to access file
    encodings.

    Once the system encoding is switched to iso2022-jp, converting
    a string goes through tclEncoding.c:GetTableEncoding, which
    loads encodings when necessary. When the first escape encoding,
    jis0201, needs to be loaded, we have a problem: the system
    finds the jis0201.enc file, but then wants to convert that
    file name, since it thinks everything (including system file
    names) is iso2022-jp encoded.

    We need to fix this, but it also looks like the general
    "encoding system" idea may not be what is necessary.
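
The cycle suspected above can be pictured with a toy model. This is only a sketch in plain Tcl: `convertWith`, `openEncodingFile`, and the `loaded`/`loading` arrays are hypothetical names invented here, and nothing in it reflects the real code in tclEncoding.c. It assumes Tcl 8.4 or later.

```tcl
# Toy model of the suspected load cycle. Every name is hypothetical.
set systemEncoding iso2022-jp
array set loaded {iso2022-jp 1}   ;# the escape table itself is loaded
set trace {}

# Opening an .enc file first converts its name with the system encoding.
proc openEncodingFile {name} {
    global systemEncoding trace
    lappend trace "open $name.enc"
    convertWith $systemEncoding "$name.enc"
}

# Converting with the escape encoding needs its sub-table, loaded on demand.
proc convertWith {encoding text} {
    global loaded loading trace
    if {$encoding eq "iso2022-jp"} {
        set sub jis0201
        if {![info exists loaded($sub)]} {
            if {[info exists loading($sub)]} {
                # We are already in the middle of loading this sub-table.
                lappend trace "cycle on $sub"
                error "sub table $sub requested while still loading"
            }
            set loading($sub) 1
            openEncodingFile $sub
            unset loading($sub)
            set loaded($sub) 1
        }
    }
    return $text
}

set code [catch {convertWith iso2022-jp "hello"} msg]
puts "status=$code message=$msg"
puts [join $trace "\n"]
```

In this model the first conversion tries to load jis0201, opening jis0201.enc triggers a second conversion of the file name, and that second conversion asks for the same sub-table before it is available. The script reports status 1 and the trace lines `open jis0201.enc` and `cycle on jis0201`.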

     
  • Taguchi, Takeshi.

    Logged In: YES
    user_id=357728

    I've uploaded a patch as #474358.
    It seems to work,
    but I do not understand tclEncoding.c,
    so I'm afraid this patch may contain bugs.

    Thanks.
    ---
    Taguchi,T.

     
  • Andreas Kupries

    Andreas Kupries - 2002-01-21

    Logged In: YES
    user_id=75003

    I applied this patch to the current state of 8.3.4 and
    8.4cvs head. In both cases there are three encoding-related
    tests which will fail when the testsuite is run. See below.

    I am not well versed enough in this area to know if the
    change should make the tests fail (and thus the tests have
    to be updated) or if the failure points to a bug in the
    patch itself.

    Because of this I believe that the patch as it is now is
    not applicable. If the tests have to be changed the patch
    should contain the updates to the testsuite.

    ==== encoding-11.5 LoadEncodingFile: escape file FAILED
    ==== Contents of test case:

    encoding convertto iso2022 \u4e4e

    ---- Result was:
    ESC$@8CESC(B
    ---- Result should have been:
    ESC(BESC$@8C
    ==== encoding-11.5 FAILED

    ==== encoding-13.1 LoadEscapeTable FAILED
    ==== Contents of test case:

    set x [encoding convertto iso2022 ab\u4e4e\u68d9g]

    ---- Result was:
    abESC$@8CESC$(DD%ESC(Bg
    ---- Result should have been:
    ESC(BabESC$@8CESC$(DD%ESC(Bg
    ==== encoding-13.1 FAILED

    ==== io-1.8 Tcl_WriteChars: WriteChars FAILED
    ==== Contents of test case:

    # This test written for SF bug #506297.
    #
    # Executing this test without the fix for the referenced bug
    # applied to tcl will cause tcl, more specifically WriteChars, to
    # go into an infinite loop.

    set f [open test2 w]
    fconfigure $f -encoding iso2022-jp
    puts -nonewline $f [format %s%c [string repeat " " 4] 12399]
    close $f
    contents test2

    ---- Result was:
    ESC$@$OESC(B
    ---- Result should have been:
    ESC(B ESC$@$O
    ==== io-1.8 FAILED
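
The failing results above are shown with the escape byte written as ESC. A small helper like the following can render converted strings the same way when comparing output by hand. `showEscapes` is a hypothetical name for this sketch, not a command from the test suite, and the sample input is typed in by hand from the expected result of encoding-13.1 rather than produced by `encoding convertto`.

```tcl
# Replace each escape byte (0x1b) with the literal text "ESC" so that
# ISO-2022 escape sequences are readable in plain terminal output.
proc showEscapes {bytes} {
    return [string map [list \x1b ESC] $bytes]
}

# Hand-built sample matching the expected result of encoding-13.1.
puts [showEscapes "\x1b(Bab\x1b\$@8C\x1b\$(DD%\x1b(Bg"]
```

This prints `ESC(BabESC$@8CESC$(DD%ESC(Bg`, which makes it easy to see where each ESC(B sequence sits relative to the payload.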

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2002-03-04
    • status: open --> closed-fixed
     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2002-03-04

    Logged In: YES
    user_id=72656

    This wasn't really fixed by the noted patches. It was a
    couple of things. The escapes needed to be fixed, but also
    the finalization of encodings wasn't correctly handling
    refcounts of encodings. This is fixed for 8.4a4 and 8.3.4+.
    See also bug 524674 and patch 474358.

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2002-03-04

    Logged In: YES
    user_id=72656

    See http://sourceforge.net/tracker/?func=detail&aid=474358&group_id=10894&atid=310894
    for resolution.

     
