From: <no...@so...> - 2002-10-22 09:44:33
Bugs item #624919, was opened at 2002-10-17 23:07
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110894&aid=624919&group_id=10894

Category: 10. Objects
Group: 8.4.0
Status: Open
>Resolution: Accepted
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Donal K. Fellows (dkf)
Summary: Tcl_AppendToObj docs confusing

Initial Comment:
TCL 8.4.0
Windows XP

info exists array(name) fails when name is long.

Example (show both working and failing functions)

% set studyuid $a(0020 000d)
1.2.840.113619.2.43.16112.2141964.87.55.870876696.1
% set seriesuid a(0020 000e)
1.2.840.113619.2.43.16112.2141964.41.48.870879458.1 5
% set set study($studyuid) $seriesuid
1.2.840.113619.2.43.16112.2141964.41.48.870879458.1 5
% puts $study($studyuid)
1.2.840.113619.2.43.16112.2141964.41.48.870879458.1 5
% set study(test) 2
2
% puts [info exists study(test)]
1
% set b test
test
% puts [info exists study($b)]
1
% puts [info exists study($studyuid)]
0

This should be 1

----------------------------------------------------------------------

>Comment By: Donal K. Fellows (dkf)
Date: 2002-10-22 10:44

Message:
Logged In: YES
user_id=79902

Review of the code indicates that the manpage is wrong; 'bytes' may
not contain NUL bytes (well, it won't cause a memory fault, but
strange effects - as seen here - might happen) except as an
end-of-string marker when 'length' is -1 (or not long enough to
overlap the NUL). Indeed, it is not even documented that 'bytes' is
UTF8! Will fix...

----------------------------------------------------------------------

Comment By: Donal K. Fellows (dkf)
Date: 2002-10-22 09:30

Message:
Logged In: YES
user_id=79902

Hmm. I agree that the Tcl_AppendToObj documentation could be much
clearer, and might possibly even be wrong; can we *really* take
embedded NULs in the "bytes" argument correctly, or do we always need
them two-byte encoded if they are not the end-of-string marker?

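For illustration only, here is a minimal C sketch of the two-byte NUL encoding raised in the comment above. It copies a counted byte buffer and writes each 0x00 byte as 0xC0 0x80 (the form Jeffrey Hobbs recalls, "IIRC", further down the thread), so the result can be handled as an ordinary NUL-terminated string. The function name encode_embedded_nuls is hypothetical and not part of the Tcl API, and the sketch assumes every non-NUL input byte is already valid UTF-8. Real code would more likely use Tcl's byte-array objects or Tcl_ExternalToUtf, as suggested elsewhere in the thread.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a Tcl function).
 *
 * Copies `len` raw bytes from `src` into `dst`, writing every 0x00 byte
 * as the two-byte sequence 0xC0 0x80. The output is NUL-terminated.
 * Returns the number of bytes written, not counting the terminator.
 *
 * Assumptions: `dst` has room for at least 2*len + 1 bytes, and all
 * non-NUL bytes in `src` are already valid UTF-8 (for example ASCII).
 */
size_t encode_embedded_nuls(const char *src, size_t len, char *dst)
{
    size_t out = 0;
    size_t i;

    assert(src != NULL && dst != NULL);

    for (i = 0; i < len; i++) {
        if (src[i] == '\0') {
            /* Embedded NUL: emit the two-byte form instead. */
            dst[out++] = (char) 0xC0;
            dst[out++] = (char) 0x80;
        } else {
            dst[out++] = src[i];
        }
    }
    dst[out] = '\0';   /* the only real 0x00 byte is the terminator */
    return out;
}
```

Called on the 8 bytes of "21 test" plus its trailing NUL, this yields 9 bytes with no embedded 0x00, so a strlen-based code path and a length-based code path would agree on where the value ends.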
----------------------------------------------------------------------

Comment By: Mahlon Stacy (mahlonstacy)
Date: 2002-10-21 19:49

Message:
Logged In: YES
user_id=595029

OK, I converted the offending TclStringObj to TclByteArrayObj,
without other changes, and it seems to work OK. (we'll do more
testing). FWIW, I interpreted the man page for Tcl_AppendToObj to
suggest that the function would properly encode any string passed
into it. Guess not. Thanks for clearing this up.

----------------------------------------------------------------------

Comment By: Jeffrey Hobbs (hobbs)
Date: 2002-10-21 18:52

Message:
Logged In: YES
user_id=72656

I don't think it does complicate things at all, you are just looking
at the wrong kind of object. A "String" is a utf-8 string. You want
"ByteArray"s, so string map {String ByteArray} in your code - the
APIs are all there.

----------------------------------------------------------------------

Comment By: Mahlon Stacy (mahlonstacy)
Date: 2002-10-21 18:48

Message:
Logged In: YES
user_id=595029

OK, thanks Jeff. This complicates the coding for me, but I understand
the issues. I do need to keep the NULL, when it's present, but I'll
have to manage the use of the value as a subscript in another way.

----------------------------------------------------------------------

Comment By: Jeffrey Hobbs (hobbs)
Date: 2002-10-21 18:40

Message:
Logged In: YES
user_id=72656

Do you really intend to have the NULL there, or do you just want to
ensure that it's null terminated? If the latter, don't do anything
extra - Tcl handles that. If the former, then you should either be
using the Tcl_ByteArrayObj stuff, or you should use
Tcl_ExternalToUtf and friends.

That said:

a) No, there are lots of other APIs to handle that, as noted above.

b) This may indicate exactly where the problem is.
While you can print it just fine (and it may be holding the NULL in
there, you just can't see it), the info exists may not include the
null when it passes the value through a strlen or such (that's why
NULL gets special encoding), which is the source of the problem you
are seeing.

----------------------------------------------------------------------

Comment By: Mahlon Stacy (mahlonstacy)
Date: 2002-10-21 18:34

Message:
Logged In: YES
user_id=595029

Makes sense. But shouldn't either

a) Tcl_AppendToObj catch and fix an embedded NULL, and/or

b) whatever the subscript, if you can print the value of an object,
shouldn't [info exists] on that same object always be true?

----------------------------------------------------------------------

Comment By: Jeffrey Hobbs (hobbs)
Date: 2002-10-21 18:30

Message:
Logged In: YES
user_id=72656

That's not the correct thing to do. Tcl_Obj's are supposed to be
utf-8 correct as strings, with the minor exception that NULLs are
represented as two bytes (\xC0\x80 IIRC) to allow them to be passed
around safely. The violation of this *may* cause problems, which was
the red flag that waved at me.

----------------------------------------------------------------------

Comment By: Mahlon Stacy (mahlonstacy)
Date: 2002-10-21 18:25

Message:
Logged In: YES
user_id=595029

No, I don't think so.

strlen(buffer) is 7; i = 7
range of buffer[] is 0 - 6
buffer[7] = NULL sets the 8th char to NULL
i++ increments the length to 8;

In this example, buffer[7] was already null because we used strcpy.
But in my working program, the values are not null terminated, they
are described by length. Using the construct above just guarantees a
null at the end of the value. Also, buffer is declared as an array...
there's no overrun.

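The length arithmetic in this exchange can be checked with a small stand-alone C sketch. It does not call Tcl and says nothing definitive about Tcl's internals; the helper names counted_length_with_terminator and keys_equal_counted are hypothetical. It only illustrates the hypothesis voiced above: a key stored with an explicit count of 8 bytes (text plus NUL) is not the same key as the 7 bytes that a strlen-based code path would see.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper mirroring the construct discussed in the thread:
 * copy `text` into `buffer`, make sure a NUL follows it, and return a
 * count that INCLUDES that NUL (strlen + 1).
 * Assumption: `buffer` is large enough for `text` and its terminator.
 */
size_t counted_length_with_terminator(char *buffer, const char *text)
{
    size_t i;

    assert(buffer != NULL && text != NULL);

    strcpy(buffer, text);
    i = strlen(buffer);   /* 7 for "21 test"                        */
    buffer[i] = '\0';     /* index 7, already NUL after strcpy      */
    i++;                  /* the count now covers the NUL as well   */
    return i;
}

/* Compare two keys by explicit byte count, as a length-aware lookup would. */
int keys_equal_counted(const char *a, size_t alen,
                       const char *b, size_t blen)
{
    return alen == blen && memcmp(a, b, alen) == 0;
}
```

With "21 test" as input, the counted length is 8 while strlen reports 7, so keys_equal_counted treats the two views of the same buffer as different keys. That is consistent with [string length] returning 8 in the session below while [info exists] fails, though whether this is the actual mechanism inside Tcl is an assumption here.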
----------------------------------------------------------------------

Comment By: Jeffrey Hobbs (hobbs)
Date: 2002-10-21 18:14

Message:
Logged In: YES
user_id=72656

Woah, bogosity filter hitting hard:

    strcpy(buffer,"21 test");
    i = strlen(buffer);
    buffer[i] = (char) NULL;
    i++;
    Tcl_AppendToObj(element,buffer,i);

What's with i++ here? That's telling AppendToObj to take more bytes
than are valid out of buffer ...

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2002-10-21 17:55

Message:
Logged In: NO

Fair enough. Here's a sample that fails.

C Source code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#ifdef _MSC_VER
#include <io.h>
#include <fcntl.h>
#include <winsock.h>
#endif
#include "tcl.h"

int test_tcl(ClientData clientData, Tcl_Interp *interp,
             int argc, char **argv)
{
    int i;
    char buffer[512];
    Tcl_Obj *element, *theList, *theString;

    theString = Tcl_NewStringObj("ELEMENTLIST", -1);
    theList = Tcl_NewListObj(1, &theString);

    sprintf(buffer,"");
    element = Tcl_NewStringObj(buffer, strlen(buffer));

    strcpy(buffer,"21 test");
    i = strlen(buffer);
    buffer[i] = (char) NULL;
    i++;
    Tcl_AppendToObj(element,buffer,i);
    Tcl_ListObjAppendElement(NULL,theList,element);
    Tcl_SetObjResult(interp,theList);
    return(TCL_OK);
}

#ifdef WIN32
__declspec(dllexport)
#endif
int Testtcl_Init(Tcl_Interp *interp)
{
    Tcl_CreateCommand (interp, "testobject", test_tcl,
        (ClientData) NULL, (Tcl_CmdDeleteProc *)NULL);
    return(TCL_OK);
}

Compile the source into a shared library (I've done this on both PC
and SGI, and both fail).
Then start tclsh and execute this script:

% load testtcl.dll
% array set a [testobject]
% parray a
a(ELEMENTLIST) = 21 test
% set b $a(ELEMENTLIST)
21 test
% string length $b
8
% set r($b) 2
2
% puts $r($b)
2
% info exists r($b)
0

-Mahlon

----------------------------------------------------------------------

Comment By: Don Porter (dgp)
Date: 2002-10-21 17:08

Message:
Logged In: YES
user_id=80530

I think we have to assume this is a bug in your program, unless you
provide a complete bit of code that supplies legal arguments to
Tcl_AppendToObj(), but then produces results that are contrary to
the documentation.

Your followup is an improvement over the original report, but still
does not provide enough information for anyone else to reproduce
your problem. (What is "elementItem"? What values do *scratch and
stringLength have when passed into Tcl_AppendToObj()? etc...)

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2002-10-21 16:49

Message:
Logged In: NO

Yeah, worked for me on what I could type in, too, but the bug
persists. I did some more digging. Here's the scenario. The
offending array arguments have embedded nulls at the end.

The objects are created in C:

dicomDumpObject( ...tcl args ...)
{
    char buffer[8192];
    Tcl_Obj *theSize, *element, *theTagd, *theTagl, *elementsize,
            *theList, *theString;

    sprintf(buffer,"");
    element = Tcl_NewStringObj(buffer, strlen(buffer));
    theString = Tcl_NewStringObj("ELEMENTLIST", -1);
    theList = Tcl_NewListObj(1, &theString);

    stringLength = elementItem->element.length;
    strncpy((char *)scratch, elementItem->element.d.string, stringLength);
    scratch[stringLength] = '\0';
    Tcl_AppendToObj(element,(char *)scratch,stringLength);
    Tcl_ListObjAppendElement(NULL,theList,element);
    Tcl_SetObjResult(interp,theList);
    return(TCL_OK);
}

There are other items in the list, such that the entire list is
putarray format, so to read the objects into TCL, I use:

array set a [dicomDumpObject $o]

This populates the array correctly, but the subscripts that contain
appended nulls fail when using [info exists a($v)].

Yes, the string array names are ASN values, probably much like LDAP.
This procedure uses TCL from end to end, after the values are copied
in using Tcl_AppendToObj, which according to the man page, handles
almost anything.

-Mahlon

----------------------------------------------------------------------

Comment By: Donal K. Fellows (dkf)
Date: 2002-10-18 10:42

Message:
Logged In: YES
user_id=79902

Beats me what's going on, though those long strings remind me of
LDAP, so if there's a problem with an extension mutating objects
when it shouldn't, that could be what's going on. In any case, it
works for me going on the basis of what I can type in.

----------------------------------------------------------------------

Comment By: Don Porter (dgp)
Date: 2002-10-18 04:55

Message:
Logged In: YES
user_id=80530

Can anyone else make sense of this? Can the submitter try again? A
cut and paste of an actual interactive session, or a demo script
would be an improvement.

----------------------------------------------------------------------