#2404 Tcl_UtfPrev not as documented ?

obsolete: 8.4.3
closed-fixed
9
2003-07-18
2003-07-11
Don Porter
No

Docs for Tcl_UtfPrev(src, start) say:

Given src, a pointer to some location in a UTF-8 string,
Tcl_UtfPrev returns a pointer to the previous UTF-8 character
in the string. This function will not back up to a position
before start, the start of the UTF-8 string. If src was
already at start, the return value will be start.

The source code comments just before Tcl_UtfPrev
also say:

* Given a pointer to some current location in a UTF-8 string,
* move backwards one character. This works correctly when the
* pointer is in the middle of a UTF-8 character.

So the claim is that the src argument can be a pointer
to a trailing byte of a multi-byte character.

However, that does not appear to be what the
routine really does. Instead, the routine starts
a search at one byte before the src argument,
searches backward for the beginning of a UTF-8
byte sequence, and returns that.

So, for example,

CONST char *word="ab\303\251";
Tcl_UtfPrev(word+3,word);

will return word+2 (pointer to the é character in UTF-8)
and not word+1 (pointer to the 'b') as I would expect.

One feature of the current Tcl_UtfPrev()
implementation is that it is safe to pass
in a 'src' argument that points one byte
past the end of the allocated buffer. That
can be useful, even though it seems
counter to the documentation.

Either the implementation or the documentation
ought to be changed to bring them into
agreement.
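
For reference, the behavior described in this report can be
modeled with a small standalone C function. This is only a
sketch and is not the Tcl source: the names utf_prev_model and
is_trail_byte are hypothetical, and the model assumes
well-formed UTF-8 and start <= src. It steps back one byte
from src before reading anything, then keeps backing up while
it sees trail bytes (10xxxxxx).

```c
#include <assert.h>
#include <stddef.h>

/* True when the byte has the form 10xxxxxx, i.e. a UTF-8 trail byte. */
static int is_trail_byte(unsigned char b)
{
    return (b & 0xC0) == 0x80;
}

/*
 * Hypothetical model of the behavior described in the report;
 * NOT the actual Tcl_UtfPrev source. Assumes start <= src and
 * that the bytes in [start, src) are readable, well-formed UTF-8.
 */
static const char *utf_prev_model(const char *src, const char *start)
{
    const char *p;

    assert(src >= start);
    if (src == start) {
        return start;           /* never back up past start */
    }
    p = src - 1;                /* begin one byte BEFORE src */
    while (p > start && is_trail_byte((unsigned char) *p)) {
        p--;                    /* skip trail bytes to reach a lead byte */
    }
    return p;
}
```

With word = "ab\303\251" (the two UTF-8 bytes of é), this
model returns word+2 for src = word+3, matching what the
report observes. Because the byte at src itself is never read,
a src one byte past the end of the buffer is harmless in this
model as well.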

Discussion

  • Don Porter

    Don Porter - 2003-07-11
    • assigned_to: nijtmans --> hobbs
     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2003-07-17

    I think the docs should be changed. IIRC there are uses of
    this, found only in Tk, that will have to be checked.

     
  • Don Porter

    Don Porter - 2003-07-17

    Yes, that is the consensus.
    Clarify the docs to accurately
    describe existing behavior.
    claiming...

     
  • Don Porter

    Don Porter - 2003-07-17
    • priority: 5 --> 9
    • assigned_to: hobbs --> dgp
     
  • Don Porter

    Don Porter - 2003-07-18
    • priority: 9 --> 8
     
  • Donal K. Fellows

    • priority: 8 --> 9
    • assigned_to: dgp --> dkf
     
  • Donal K. Fellows

    • status: open --> closed-fixed
     
  • Donal K. Fellows

    Minor tightening of Tcl_UtfNext() done at same time.