From: SourceForge.net <no...@so...> - 2003-07-11 21:43:34
Bugs item #769895, was opened at 2003-07-11 17:43
Message generated for change (Settings changed) made by dgp
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110894&aid=769895&group_id=10894

Category: 43. UTF-8 Strings
Group: 8.4.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Don Porter (dgp)
>Assigned to: Jeffrey Hobbs (hobbs)
Summary: Tcl_UtfPrev not as documented ?

Initial Comment:
Docs for Tcl_UtfPrev(src, start) say:

    Given src, a pointer to some location in a UTF-8 string,
    Tcl_UtfPrev returns a pointer to the previous UTF-8 character
    in the string. This function will not back up to a position
    before start, the start of the UTF-8 string. If src was
    already at start, the return value will be start.

The source code comments just before Tcl_UtfPrev also say:

    * Given a pointer to some current location in a UTF-8 string,
    * move backwards one character. This works correctly when the
    * pointer is in the middle of a UTF-8 character.

So the claim is that the src argument can be a pointer to a
trailing byte of a multi-byte character.

However, that does not appear to be what the routine really does.
Instead, the routine starts its search one byte before the src
argument, searches backward for the beginning of a UTF-8
character byte sequence, and returns that.

So, for example,

    CONST char *word="ab\303\251";
    Tcl_UtfPrev(word+3,word);

will return word+2 (pointer to the é character in UTF-8) and not
word+1 (pointer to the 'b') as I would expect.

One feature of the current Tcl_UtfPrev() implementation is that
it is currently safe to pass in a 'src' argument that points one
byte past the end of the allocated buffer. That can be useful,
even though it seems counter to the documentation.

Either the implementation or the documentation ought to be
changed to bring them into agreement.
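
To make the discrepancy concrete, here is a minimal sketch of the two behaviors described above. The names prev_as_observed and prev_as_documented are hypothetical, and neither function is Tcl's actual source. Both assume well-formed UTF-8 and ignore any validation or back-up limits the real Tcl_UtfPrev may apply.

```c
/* Nonzero when b is a UTF-8 trail byte (bit pattern 10xxxxxx). */
static int is_trail(unsigned char b)
{
    return (b & 0xC0) == 0x80;
}

/*
 * Hypothetical model of the behavior reported above: step to src-1,
 * then walk backward to the nearest byte that is not a trail byte.
 * It never reads *src, so src may point one past the end of the buffer.
 */
static const char *prev_as_observed(const char *src, const char *start)
{
    const char *p = src;

    if (p <= start) {
        return start;
    }
    p--;
    while (p > start && is_trail((unsigned char) *p)) {
        p--;
    }
    return p;
}

/*
 * Hypothetical model of the documented behavior: if src is in the
 * middle of a character, first back up to that character's lead byte,
 * then move back one whole character. This version reads *src, so src
 * must point at a readable byte.
 */
static const char *prev_as_documented(const char *src, const char *start)
{
    const char *p = src;

    while (p > start && is_trail((unsigned char) *p)) {
        p--;
    }
    return prev_as_observed(p, start);
}
```

With word = "ab\303\251", prev_as_observed(word+3, word) yields word+2, matching the report, while prev_as_documented(word+3, word) yields word+1. The two agree whenever src already points at a character boundary, such as word+4. The sketch also shows the trade-off noted above: only the first variant avoids dereferencing src itself.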