From: Frédéric B. <fre...@fr...> - 2012-08-25 12:03:31
On 25/08/2012 12:48, Andreas Leitgeb wrote:
> On Fri, Aug 24, 2012 at 04:14:38PM -0700, Jeff Rogers wrote:
>> This is exactly the pattern I wanted to simplify. This example becomes
>> binary dscan data "I" num
>> binary dscan data "a${num}" text
>
> A) either it is inefficient by having to copy/shift around the whole
> remaining string after each extraction,
> B) or it requires some major rewrite of Tcl's internal string
> representation to include an offset, which just isn't
> going to happen in foreseeable future. (just as it
> unfortunately won't happen with lists, either)
>
> A is plausible but slow, B is unlikely to happen.
> (I'm assuming you do not suggest any EIAS-violations.)

FWIW Colibri strings use rope representations (i.e. balanced binary trees
of flat strings) with subrope capabilities, similar to lists (actually it's
the other way round: the list implementation was adapted from the rope's).
And as you said, and as has been mentioned earlier, this won't happen
(until Tcl9?) given the API incompatibilities.

However, there is a third way that I believe should work without much
hassle: a "string tail" object, using Tcl_Obj's ptrAndLongRep:

- objPtr->internalRep.ptrAndLongRep.ptr would point to the original
  string object
- objPtr->internalRep.ptrAndLongRep.value would give the starting index
- Tcl_ObjType's updateStringProc would simply extract the substring from
  the starting index.

That way, temporary substrings would simply point to the original object as
long as the string rep is not requested ("string tail"-aware code would use
the index and original object instead of the string rep), and unshared
objects would behave like iterators ("string tail"-aware code would update
the index field of unshared objects).

This is a poor man's substring with a variable start index and a fixed end
index. But with a "string head" counterpart (i.e. with an ending index
instead of a starting index), this could work with arbitrary substrings at
the expense of two objects instead of one.
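
To make the "string tail" idea concrete, here is a minimal sketch in plain C. It does not use the Tcl headers: TailObj, TailNew, TailGetString, TailAdvance and TailFree are hypothetical names invented for this illustration, the orig and start fields merely stand in for internalRep.ptrAndLongRep.ptr and internalRep.ptrAndLongRep.value, and the original "object" is reduced to a C string that the caller must keep alive. Treat it as an illustration under those assumptions, not as the actual Tcl implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a Tcl_Obj carrying a "string tail" internal rep. */
typedef struct TailObj {
    const char *orig;     /* original string (assumed to outlive this object) */
    unsigned long start;  /* starting index into orig */
    char *bytes;          /* string rep, generated lazily; NULL until requested */
    int refCount;         /* > 1 means the object is shared */
} TailObj;

/* Create a tail pointing at orig[start..]; nothing is copied here. */
TailObj *TailNew(const char *orig, unsigned long start)
{
    TailObj *t;
    assert(orig != NULL && start <= strlen(orig));
    t = malloc(sizeof *t);
    if (t == NULL) return NULL;
    t->orig = orig;
    t->start = start;
    t->bytes = NULL;
    t->refCount = 1;
    return t;
}

/* Plays the role of updateStringProc: the substring is only extracted
 * (copied) when the string rep is actually requested. */
const char *TailGetString(TailObj *t)
{
    if (t->bytes == NULL) {
        size_t len = strlen(t->orig + t->start);
        t->bytes = malloc(len + 1);
        if (t->bytes == NULL) return NULL;
        memcpy(t->bytes, t->orig + t->start, len + 1);
    }
    return t->bytes;
}

/* "String tail"-aware consumer: skip n bytes. An unshared object is updated
 * in place (iterator behaviour); a shared one is left alone and a new tail
 * is returned instead. */
TailObj *TailAdvance(TailObj *t, unsigned long n)
{
    assert(t->start + n <= strlen(t->orig));
    if (t->refCount > 1) {
        return TailNew(t->orig, t->start + n);
    }
    free(t->bytes);       /* any cached string rep is now stale */
    t->bytes = NULL;
    t->start += n;
    return t;
}

void TailFree(TailObj *t)
{
    if (t != NULL) {
        free(t->bytes);
        free(t);
    }
}
```

With something along these lines, a command like the proposed "binary dscan" would call the equivalent of TailAdvance after each extraction, so the remaining data is never copied or shifted unless some other code asks for the string rep.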