
#2068 Tcl_AppendToObj docs confusing

obsolete: 8.4.0
open-remind
4
2003-02-05
2002-10-17
Anonymous
No

TCL 8.4.0
Windows XP

info exists array(name) fails when name is long.

Example (show both working and failing functions)

% set studyuid $a(0020 000d)
1.2.840.113619.2.43.16112.2141964.87.55.870876696.1
% set seriesuid a(0020 000e)
1.2.840.113619.2.43.16112.2141964.41.48.870879458.1
5
% set set study($studyuid) $seriesuid
1.2.840.113619.2.43.16112.2141964.41.48.870879458.1
5
% puts $study($studyuid)
1.2.840.113619.2.43.16112.2141964.41.48.870879458.1
5
% set study(test) 2
2
% puts [info exists study(test)]
1
% set b test
test
% puts [info exists study($b)]
1
% puts [info exists study($studyuid)]
0

This should be 1

Discussion

1 2 > >> (Page 1 of 2)
  • Don Porter

    Don Porter - 2002-10-18

    Logged In: YES
    user_id=80530

    Can anyone else make sense of this?

    Can the submitter try again? A cut and
    paste of an actual interactive session,
    or a demo script would be an improvement.

     
  • Donal K. Fellows

    • labels: 105658 --> 104254
    • status: open --> closed-works-for-me
     
  • Donal K. Fellows

    Logged In: YES
    user_id=79902

    Beats me what's going on, though those long strings remind
    me of LDAP, so if there's a problem with an extension
    mutating objects when it shouldn't, that could be what's
    going on.

    In any case, it works for me going on the basis of what I
    can type in.

     
  • Nobody/Anonymous

    Logged In: NO

    Yeah, worked for me on what I could type in, too, but the bug
    persists. I did some more digging. Here's the scenario.
    The offending array arguments have embedded nulls at the
    end. The objects are created in C:
    dicomDumpObject( ...tcl args ...) {
        char buffer[8192];
        Tcl_Obj *theSize, *element, *theTagd, *theTagl,
            *elementsize, *theList, *theString;
        sprintf(buffer,"");
        element = Tcl_NewStringObj(buffer, strlen(buffer));
        theString = Tcl_NewStringObj("ELEMENTLIST", -1);
        theList = Tcl_NewListObj(1, &theString);
        stringLength = elementItem->element.length;
        strncpy((char *)scratch, elementItem->element.d.string,
                stringLength);
        scratch[stringLength] = '\0';
        Tcl_AppendToObj(element,(char *)scratch,stringLength);
        Tcl_ListObjAppendElement(NULL,theList,element);
        Tcl_SetObjResult(interp,theList);
        return(TCL_OK);
    }

    There are other items in the list, such that the entire list is
    putarray format, so to read the objects into TCL, I use:
    array set a [dicomDumpObject $o]
    This populates the array correctly, but the subscripts that
    contain appended nulls fail when using [info exists a($v)].

    Yes, the string array names are ASN values, probably much
    like LDAP.

    This procedure uses TCL from end to end, after the values
    are copied in using Tcl_AppendToObj, which according to the
    man page, handles almost anything.

    -Mahlon

     
  • Don Porter

    Don Porter - 2002-10-21
    • status: closed-works-for-me --> pending-works-for-me
     
  • Don Porter

    Don Porter - 2002-10-21

    Logged In: YES
    user_id=80530

    I think we have to assume this is a bug in
    your program, unless you provide a complete
    bit of code that supplies legal arguments
    to Tcl_AppendToObj(), but then produces
    results that are contrary to the documentation.

    Your followup is an improvement over the
    original report, but still does not provide
    enough information for anyone else to reproduce
    your problem.

    (What is "elementItem"? What values do
    *scratch and stringLength have when passed
    into Tcl_AppendToObj()? etc.)

     
  • Nobody/Anonymous

    Logged In: NO

    Fair enough. Here's a sample that fails.

    C Source code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <ctype.h>
    #ifdef _MSC_VER
    #include <io.h>
    #include <fcntl.h>
    #include <winsock.h>
    #endif
    #include "tcl.h"

    int test_tcl(ClientData clientData, Tcl_Interp *interp, int argc,
                 char **argv) {
        int i;
        char buffer[512];
        Tcl_Obj *element, *theList, *theString;

        theString = Tcl_NewStringObj("ELEMENTLIST", -1);
        theList = Tcl_NewListObj(1, &theString);

        sprintf(buffer,"");
        element = Tcl_NewStringObj(buffer, strlen(buffer));

        strcpy(buffer,"21 test");
        i = strlen(buffer);
        buffer[i] = (char) NULL;
        i++;
        Tcl_AppendToObj(element,buffer,i);

        Tcl_ListObjAppendElement(NULL,theList,element);
        Tcl_SetObjResult(interp,theList);
        return(TCL_OK);
    }

    #ifdef WIN32
    __declspec(dllexport)
    #endif
    int Testtcl_Init(Tcl_Interp *interp)
    {
        Tcl_CreateCommand(interp, "testobject", test_tcl,
                          (ClientData) NULL, (Tcl_CmdDeleteProc *)NULL);
        return(TCL_OK);
    }

    Compile the source into a shared library (I've done this on
    both PC and SGI, and both fail).

    Then start tclsh and execute this script:
    % load testtcl.dll
    % array set a [testobject]
    % parray a
    a(ELEMENTLIST) = 21 test
    % set b $a(ELEMENTLIST)
    21 test
    % string length $b
    8
    % set r($b) 2
    2
    % puts $r($b)
    2
    % info exists r($b)
    0

    -Mahlon

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2002-10-21

    Logged In: YES
    user_id=72656

    Woah, bogosity filter hitting hard:

    strcpy(buffer,"21 test");
    i = strlen(buffer);
    buffer[i] = (char) NULL;
    i++;
    Tcl_AppendToObj(element,buffer,i);

    What's with i++ here? That's telling AppendToObj to take
    more bytes than are valid out of buffer ...

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2002-10-21
    • status: pending-works-for-me --> open-works-for-me
     
  • Mahlon Stacy

    Mahlon Stacy - 2002-10-21

    Logged In: YES
    user_id=595029

    No, I don't think so.

    strlen(buffer) is 7; i = 7
    range of buffer[] is 0 - 6
    buffer[7] = NULL sets the 8th char to NULL
    i++ increments the length to 8;
    in this example, buffer[7] was already null because we used
    strcpy. But in my working program, the values are not null
    terminated, they are described by length. Using the construct
    above just guarantees a null at the end of the value.
    Also, buffer is declared as an array... there's no overrun.

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2002-10-21

    Logged In: YES
    user_id=72656

    That's not the correct thing to do. Tcl_Obj's are supposed to
    be utf-8 correct as strings, with the minor exception that
    NULLs are represented as two bytes (\xC0\x80 IIRC) to allow
    them to be passed around safely. The violation of this *may*
    cause problems, which was the red flag that waved at me.
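
The two-byte NUL convention mentioned above can be illustrated with a small standalone sketch. It assumes each input byte is to be treated as a code point in the range U+0000..U+00FF; `encode_bytes_nul_safe` is a hypothetical helper written for this illustration, not a Tcl API, and it is not taken from the Tcl sources.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a Tcl API): treat each input byte as a code
 * point U+0000..U+00FF and write it as UTF-8, except that NUL is written
 * as the two-byte sequence 0xC0 0x80 so the result never contains a
 * zero byte before its terminator.
 *
 * `out` must have room for 2 * len + 1 bytes.
 * Returns the number of bytes written, not counting the terminator.
 */
size_t encode_bytes_nul_safe(const unsigned char *in, size_t len,
                             unsigned char *out)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = in[i];
        if (c == 0x00) {
            /* Embedded NUL: overlong two-byte form. */
            out[n++] = 0xC0;
            out[n++] = 0x80;
        } else if (c < 0x80) {
            /* Plain ASCII passes through unchanged. */
            out[n++] = c;
        } else {
            /* U+0080..U+00FF: standard two-byte UTF-8. */
            out[n++] = (unsigned char)(0xC0 | (c >> 6));
            out[n++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    out[n] = '\0';
    return n;
}
```

Passing the 8 bytes of "21 test" plus its trailing NUL through this helper yields 9 bytes, and `strlen` on the output sees all 9, because the only zero byte left is the real terminator.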

     
  • Mahlon Stacy

    Mahlon Stacy - 2002-10-21

    Logged In: YES
    user_id=595029

    Makes sense. But shouldn't either
    a) Tcl_AppendToObj catch and fix an embedded NULL,
    and/or
    b) whatever the subscript, if you can print the value of an
    object, shouldn't [info exists] on that same object always be
    true?

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2002-10-21

    Logged In: YES
    user_id=72656

    Do you really intend to have the NULL there, or do you just
    want to ensure that it's null terminated? If the latter, don't do
    anything extra - Tcl handles that. If the former, then you
    should either be using the Tcl_ByteArrayObj stuff, or you
    should use Tcl_ExternalToUtf and friends. That said:

    a) No, there are lots of other APIs to handle that, as noted
    above.

    b) This may indicate exactly where the problem is. While
    you can print it just fine (and it may be holding the NULL in
    there, you just can't see it), the info exists may not include
    the null when it passes the value through a strlen or such
    (that's why NULL gets special encoding), which is the source
    of the problem you are seeing.
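
The general hazard described in (b) can be shown without Tcl at all. The sketch below is only an illustration under that assumption; it is not Tcl's actual array lookup code, and both function names are hypothetical. One comparison honours the counted length, the other goes through C-string semantics, and they disagree as soon as a raw NUL is inside the counted range.

```c
#include <stddef.h>
#include <string.h>

/* Compare two counted byte strings exactly, embedded NULs included. */
int counted_equal(const char *a, size_t alen, const char *b, size_t blen)
{
    return alen == blen && memcmp(a, b, alen) == 0;
}

/* Compare as C strings: everything from the first NUL on is ignored. */
int cstring_equal(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```

With the key from the report ("21 test" counted as 8 bytes, trailing NUL included) and a plain 7-byte "21 test", `counted_equal` reports a mismatch while `cstring_equal` reports a match. A value stored through one kind of comparison and looked up through the other can therefore appear to exist and not exist at the same time.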

     
  • Mahlon Stacy

    Mahlon Stacy - 2002-10-21

    Logged In: YES
    user_id=595029

    OK, thanks Jeff. This complicates the coding for me, but I
    understand the issues. I do need to keep the NULL, when it's
    present, but I'll have to manage the use of the value as a
    subscript in another way.

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2002-10-21
    • status: open-works-for-me --> closed-invalid
     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2002-10-21

    Logged In: YES
    user_id=72656

    I don't think it does complicate things at all, you are just
    looking at the wrong kind of object. A "String" is a utf-8
    string. You want "ByteArray"s, so string map {String
    ByteArray} in your code - the APIs are all there.

     
  • Mahlon Stacy

    Mahlon Stacy - 2002-10-21

    Logged In: YES
    user_id=595029

    OK, I converted the offending TclStringObj to
    TclByteArrayObj, without other changes, and it seems to
    work OK. (we'll do more testing).

    FWIW, I interpreted the man page for Tcl_AppendToObj to
    suggest that the function would properly encode any string
    passed into it. Guess not.

    Thanks for clearing this up.

     
  • Donal K. Fellows

    Logged In: YES
    user_id=79902

    Hmm. I agree that the Tcl_AppendToObj documentation could
    be much clearer, and might possibly even be wrong; can we
    *really* take embedded NULs in the "bytes" argument
    correctly, or do we always need them two-byte encoded if
    they are not the end-of-string marker?

     
  • Donal K. Fellows

    • labels: 104254 --> 10. Objects
    • summary: info exists fails on long array names --> Tcl_AppendToObj docs confusing
    • status: closed-invalid --> open-invalid
     
  • Donal K. Fellows

    Logged In: YES
    user_id=79902

    Review of the code indicates that the manpage is wrong;
    'bytes' may not contain NUL bytes (well, it won't cause a
    memory fault, but strange effects - as seen here - might
    happen) except as an end-of-string marker when 'length' is
    -1 (or not long enough to overlap the NUL). Indeed, it is
    not even documented that 'bytes' is UTF8! Will fix...
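
The rule stated here can be written down as a small check that a caller might run before handing a (bytes, length) pair to a string-object API. This is a sketch only: `has_embedded_nul` is a hypothetical guard, not part of Tcl, and it checks the NUL rule alone, not UTF-8 validity in general.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical guard (not part of Tcl): returns 1 if the first `length`
 * bytes of `bytes` contain a NUL, meaning the counted range overlaps a
 * raw zero byte. A negative length means "use the NUL terminator", which
 * by construction cannot include an embedded NUL, so it returns 0.
 */
int has_embedded_nul(const char *bytes, int length)
{
    if (length < 0) {
        return 0;
    }
    return memchr(bytes, '\0', (size_t)length) != NULL;
}
```

For the reproducer in this ticket, the pair ("21 test", 8) is flagged because the eighth byte is the terminator itself, while ("21 test", 7) and ("21 test", -1) both pass.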

     
  • Donal K. Fellows

    • status: open-invalid --> open-accepted
     
  • Donal K. Fellows

    Logged In: YES
    user_id=79902

    Hmm. The documentation was completely out of synch with the
    implementation; what was written and what was true were
    quite different things, and had been since Tcl8.1...

     
  • Donal K. Fellows

    • status: open-accepted --> closed-fixed
     
  • Don Porter

    Don Porter - 2002-10-22
    • status: closed-fixed --> open-remind
     
  • Don Porter

    Don Porter - 2002-10-22

    Logged In: YES
    user_id=80530

    Note that this is a general problem:
    uncertainty about what encoding is
    required for the string pointed to by
    objPtr->bytes.

    It arose in Tcl Bug 584603 as well.
    If we are requiring UTF-8, then there are
    probably additional places in the docs
    to note this. Also, do we know of any
    extensions/users of Tcl_Obj's that have
    used the documented freedom to have
    non-encoded embedded NULLs in the
    counted strings pointed to by objPtr->bytes
    that we are now belatedly declaring illegal?

    Note in particular that Tcl's own command
    [encoding convertfrom identity] is now illegal
    by this documentation change, since it can
    return a Tcl_Obj with a non-UTF8 string rep.

     