#187 Backslash substitution is broken with non-ASCII chars

obsolete: 8.3.2
closed-fixed
2
2001-06-28
2000-10-26
Anonymous
No

OriginalBugID: 6162 Bug
Version: 8.3.2
SubmitDate: '2000-08-23'
LastModified: '2000-09-05'
Severity: MED
Status: Closed
Submitter: techsupp
ChangedBy: hobbs
RelatedBugIDs: 6166
OS: Windows NT
FixedDate: '2000-09-05'
ClosedDate: '2000-10-25'

Name:

Benjamin Riefenstahl

CVS:

tclUtf.c,v 1.11 2000/01/11 22:09:00 hobbs Exp

Comments:

The patch is against the tcl-8.3.3 branch.

ReproducibleScript:

puts \ä

ObservedBehavior:

The script above displays garbage.

DesiredBehavior:

The script above should produce one line with an "ä".
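
A byte-level sketch of one plausible cause, assuming the script is held as UTF-8 (Tcl's internal string encoding), so that "ä" is the two bytes C3 A4. The helper names `encode_utf8` and `old_default_branch` are hypothetical and are not part of Tcl. The second one only imitates the unpatched fall-through path shown in the patch below, where the single byte after the backslash is handed to a code-point-to-UTF-8 conversion. Reading the byte as unsigned is also an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a code-point-to-UTF-8 encoder (not Tcl's
 * Tcl_UniCharToUtf). Handles only code points below 0x800. */
static int encode_utf8(unsigned int ch, char *dst)
{
    assert(dst != NULL);
    if (ch < 0x80) {
        dst[0] = (char) ch;
        return 1;
    }
    assert(ch < 0x800);
    dst[0] = (char) (0xC0 | (ch >> 6));
    dst[1] = (char) (0x80 | (ch & 0x3F));
    return 2;
}

/* Sketch of the unpatched fall-through: the one byte after the backslash
 * is treated as a complete character and re-encoded as UTF-8.
 * Returns the number of bytes written to dst. */
static int old_default_branch(const char *src, char *dst, int *readPtr)
{
    unsigned int result = (unsigned char) src[1]; /* assumption: unsigned read */

    *readPtr = 2;   /* backslash plus one byte consumed */
    return encode_utf8(result, dst);
}
```

Under these assumptions, the input bytes 5C C3 A4 yield C3 83 (the encoding of U+00C3), and the trailing A4 is left behind as a stray continuation byte, which is consistent with the garbage reported.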

Patch:

Index: tclUtf.c
===================================================================
RCS file: /cvsroot/tcl/generic/tclUtf.c,v
retrieving revision 1.11
diff -c -r1.11 tclUtf.c
*** tclUtf.c 2000/01/11 22:09:00 1.11
--- tclUtf.c 2000/08/23 18:54:26
***************
*** 781,787 ****
* backslash sequence. */
{
register CONST char *p = src+1;
! int result, count, n;
char buf[TCL_UTF_MAX];

if (dst == NULL) {
--- 781,787 ----
* backslash sequence. */
{
register CONST char *p = src+1;
! int result, count, n, utfconvert;
char buf[TCL_UTF_MAX];

if (dst == NULL) {
***************
*** 789,794 ****
--- 789,795 ----
}

count = 2;
+ utfconvert = 0;
switch (*p) {
/*
* Note: in the conversions below, use absolute values (e.g.,
***************
*** 823,828 ****
--- 824,830 ----
if (isxdigit(UCHAR(p[1]))) { /* INTL: digit */
char *end;

+ utfconvert = 1;
result = (unsigned char) strtoul(p+1, &end, 16);
count = end - src;
} else {
***************
*** 831,836 ****
--- 833,839 ----
}
break;
case 'u':
+ utfconvert = 1;
result = 0;
for (count = 0; count < 4; count++) {
p++;
***************
*** 868,873 ****
--- 871,877 ----
* Check for an octal number \oo?o?
*/
if (isdigit(UCHAR(*p)) && (UCHAR(*p) < '8')) { /* INTL: digit */
+ utfconvert = 1;
result = (unsigned char)(*p - '0');
p++;
if (!isdigit(UCHAR(*p)) || (UCHAR(*p) >= '8')) { /* INTL: digit */
***************
*** 891,897 ****
if (readPtr != NULL) {
*readPtr = count;
}
! return Tcl_UniCharToUtf(result, dst);
}

/*
--- 895,906 ----
if (readPtr != NULL) {
*readPtr = count;
}
! if (!utfconvert) {
! *dst = (char) result;
! return 1;
! } else {
! return Tcl_UniCharToUtf(result, dst);
! }
}

/*

PatchFiles:

tclUtf.c
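
The idea behind the patch can be sketched in isolation: a flag records whether the value came from a numeric escape, and only then is it converted to UTF-8. Any other byte following the backslash is copied through unchanged. This is a simplified illustration, not the Tcl source. `backslash_sketch` and `encode_utf8` are hypothetical names, and only `\x` and the fall-through case are modelled (`\u` and octal escapes are omitted).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a code-point-to-UTF-8 encoder (not Tcl's
 * Tcl_UniCharToUtf). Handles only code points below 0x800. */
static int encode_utf8(unsigned int ch, char *dst)
{
    assert(dst != NULL);
    if (ch < 0x80) {
        dst[0] = (char) ch;
        return 1;
    }
    assert(ch < 0x800);
    dst[0] = (char) (0xC0 | (ch >> 6));
    dst[1] = (char) (0x80 | (ch & 0x3F));
    return 2;
}

/* Simplified model of the patched logic. src points at the backslash.
 * Returns the number of bytes written to dst; *readPtr receives the
 * number of source bytes consumed. */
static int backslash_sketch(const char *src, char *dst, int *readPtr)
{
    const char *p = src + 1;
    unsigned int result;
    int count = 2;
    int utfconvert = 0;     /* set only for numeric escapes */

    if (*p == 'x' && isxdigit((unsigned char) p[1])) {
        char *end;

        utfconvert = 1;
        result = (unsigned char) strtoul(p + 1, &end, 16);
        count = (int) (end - src);
    } else {
        /* Not an escape: keep the raw byte as it is. */
        result = (unsigned char) *p;
    }
    if (readPtr != NULL) {
        *readPtr = count;
    }
    if (!utfconvert) {
        *dst = (char) result;   /* copy the byte verbatim */
        return 1;
    }
    return encode_utf8(result, dst);
}
```

With this sketch, a backslash followed by the bytes C3 A4 emits C3 and consumes two bytes, so the caller goes on to copy A4 and the character survives intact, while `\xe4` is still converted to the two-byte sequence C3 A4.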

Discussion

  • Brent B. Welch

    Brent B. Welch - 2000-10-26
    • priority: 5 --> 2
    • status: open --> closed-fixed
     
  • Benjamin Riefenstahl

    The problem still exists in 8.4 in CVS, and it does not seem to be fixed in the 8.3.2 branch either.

    I attach a patch against the main branch in CVS here:

    Index: tclUtf.c

    RCS file: /cvsroot/tcl/tcl/generic/tclUtf.c,v
    retrieving revision 1.14
    diff -c -r1.14 tclUtf.c
    *** tclUtf.c 2000/06/05 23:36:21 1.14
    --- tclUtf.c 2001/01/29 16:46:49
    ***************
    *** 781,787 ****
    * backslash sequence. */
    {
    register CONST char *p = src+1;
    ! int result, count, n;
    char buf[TCL_UTF_MAX];

    if (dst == NULL) {
    --- 781,787 ----
    * backslash sequence. */
    {
    register CONST char *p = src+1;
    ! int result, count, n, utfconvert;
    char buf[TCL_UTF_MAX];

    if (dst == NULL) {
    ***************
    *** 789,794 ****
    --- 789,795 ----
    }

    count = 2;
    + utfconvert = 0; /*false*/
    switch (*p) {
    /*
    * Note: in the conversions below, use absolute values (e.g.,
    ***************
    *** 820,825 ****
    --- 821,827 ----
    result = 0xb;
    break;
    case 'x':
    + utfconvert = 1; /*true*/
    if (isxdigit(UCHAR(p[1]))) { /* INTL: digit */
    char *end;

    ***************
    *** 831,836 ****
    --- 833,839 ----
    }
    break;
    case 'u':
    + utfconvert = 1; /*true*/
    result = 0;
    for (count = 0; count < 4; count++) {
    p++;
    ***************
    *** 868,873 ****
    --- 871,877 ----
    * Check for an octal number \oo?o?
    */
    if (isdigit(UCHAR(*p)) && (UCHAR(*p) < '8')) { /* INTL: digit */
    + utfconvert = 1; /*true*/
    result = (unsigned char)(*p - '0');
    p++;
    if (!isdigit(UCHAR(*p)) || (UCHAR(*p) >= '8')) { /* INTL: digit */
    ***************
    *** 891,897 ****
    if (readPtr != NULL) {
    *readPtr = count;
    }
    ! return Tcl_UniCharToUtf(result, dst);
    }

    /*
    --- 895,909 ----
    if (readPtr != NULL) {
    *readPtr = count;
    }
    ! if (!utfconvert)
    ! {
    ! *dst = (char)result;
    ! return 1;
    ! }
    ! else
    ! {
    ! return Tcl_UniCharToUtf(result, dst);
    ! }
    }

    /*

     
  • Don Porter

    Don Porter - 2001-04-13
    • labels: 104246 --> 44. UTF-8 Strings
     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2001-06-20
    • assigned_to: nobody --> hobbs
    • status: closed-fixed --> open-fixed
     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2001-06-20

    Not sure why this got closed, but it is still a valid bug
    in 8.3.3/8.4a2.

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2001-06-28
    • status: open-fixed --> closed-fixed
     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2001-06-28

    I solved this with a slightly cleaner patch that is
    attached for 8.4a3cvs.

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2001-06-28