Soft hyphens ('\u00AD') are not handled properly in text
widgets.
They are always shown as '-' (hyphen), but a soft hyphen
should only appear if it is the last character in a line.
For example,
"abc\u00AD123" should be displayed as
abc123
or
abc-
123
but is always displayed as
abc-123
Tested on Windows and Linux with Tk
8.4.6 / 8.4.7 / 8.5a2
Logged In: YES
user_id=79902
Issue originally raised on comp.lang.tcl
The word-break algorithm is also wrong ('-wrap word'
does not treat \u00AD as a potential break point).
This is probably also broken in multiline labels/buttons.
Logged In: YES
user_id=32170
The bug in question is around here:
    if (wrapMode != TEXT_WRAPMODE_WORD) {
        chunkPtr->breakIndex = chunkPtr->numBytes;
    } else {
        for (count = bytesThatFit, p += bytesThatFit - 1; count > 0;
                count--, p--) {
            if (isspace(UCHAR(*p))) {
                chunkPtr->breakIndex = count;
                break;
            }
        }
in tkTextDisp.c (line 6668 onwards). That 'isspace(UCHAR(*p))'
test needs to be made UTF-8 aware and look for special characters
like \u00AD.
Code contributions (and new tests) appreciated. This is not
on my immediate to-do list.
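
For anyone picking this up, here is a minimal sketch of what a
UTF-8-aware version of that backward scan might look like. It is not
a patch against tkTextDisp.c: find_break_index and is_ascii_space are
hypothetical names, and the code only assumes that the text is valid
UTF-8, where U+00AD is encoded as the two bytes 0xC2 0xAD. The return
value mirrors the breakIndex convention in the excerpt (number of
bytes up to and including the break character).

```c
#include <stddef.h>

/* Hypothetical helper: locale-independent ASCII whitespace test. */
int is_ascii_space(unsigned char c)
{
    return c == ' ' || c == '\t' || c == '\n'
        || c == '\r' || c == '\v' || c == '\f';
}

/*
 * Hypothetical sketch, not Tk code. Scans backwards over the first
 * bytesThatFit bytes of s and returns the number of bytes up to and
 * including the last break opportunity, or -1 if there is none.
 * A break opportunity is ASCII whitespace or a soft hyphen (U+00AD),
 * which valid UTF-8 encodes as 0xC2 0xAD.
 */
int find_break_index(const char *s, int bytesThatFit)
{
    const unsigned char *p = (const unsigned char *) s;
    int count;

    for (count = bytesThatFit; count > 0; count--) {
        unsigned char c = p[count - 1];

        if (is_ascii_space(c)) {
            return count;
        }
        /* 0xAD is the trail byte; check that the lead byte fits too. */
        if (c == 0xAD && count >= 2 && p[count - 2] == 0xC2) {
            return count;
        }
    }
    return -1;
}
```

With the string from the report, find_break_index("abc\xC2\xAD" "123", 8)
stops at the soft hyphen, so the line would end after "abc" plus the
two soft-hyphen bytes.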
Logged In: YES
user_id=32170
I should add that my previous comment refers to where the
fix for word-wrapping at a soft-hyphen should go.
Hiding soft hyphens that are not at the end of a line will
require more changes, both earlier in the same
TkTextCharLayoutProc (or, more likely, in the helper
routine MeasureChars) and in CharDisplayProc. These
additional changes are non-trivial.
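
To make the intended display rule concrete, here is a rough sketch of
it in isolation. It is not how CharDisplayProc or MeasureChars work;
render_line is a hypothetical function that takes the bytes of one
already-wrapped display line (assumed to be valid UTF-8) and copies
them to a buffer, dropping every soft hyphen except one that is the
last character of the line, which becomes a visible '-'. A real fix
would also have to account for the width of that '-' during layout,
which this sketch ignores.

```c
#include <stddef.h>

/*
 * Hypothetical sketch, not Tk code. Copies the display line src[0..len)
 * into dst (capacity dstSize, always NUL-terminated when dstSize > 0).
 * Soft hyphens (0xC2 0xAD) are skipped unless they end the line, in
 * which case a single '-' is emitted. Returns the bytes written,
 * excluding the NUL. If dst is too small the output is truncated
 * byte-wise, which may split a multi-byte character.
 */
size_t render_line(const char *src, size_t len, char *dst, size_t dstSize)
{
    size_t i = 0, out = 0;

    if (dstSize == 0) {
        return 0;
    }
    while (i < len && out + 1 < dstSize) {
        unsigned char c = (unsigned char) src[i];

        if (c == 0xC2 && i + 1 < len
                && (unsigned char) src[i + 1] == 0xAD) {
            if (i + 2 == len) {
                dst[out++] = '-';   /* soft hyphen ends the line */
            }
            i += 2;                 /* otherwise it stays invisible */
            continue;
        }
        dst[out++] = src[i++];
    }
    dst[out] = '\0';
    return out;
}
```

Applied to the report's example, the unwrapped line "abc\u00AD123"
renders as "abc123", while a line that was broken after the soft
hyphen ("abc\u00AD") renders as "abc-".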
Logged In: YES
user_id=32170
Note, even beyond this bug, shouldn't "word wrapping"
actually wrap at ordinary hyphens as well? Currently all of
Tk's word wrapping wraps only at whitespace.
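
If ordinary hyphens were to become break points, one thing to watch
is that not every '-' is a good place to wrap (think "-5" or
"--option"). The sketch below shows one possible heuristic, which is
an assumption of this example and not anything Tk does: break after
a '-' only when it sits between two ASCII letters. The function name
hyphen_break_index is hypothetical, and the code assumes s is
NUL-terminated with bytesThatFit no larger than its length.

```c
#include <ctype.h>

/*
 * Hypothetical sketch, not Tk code. Returns the number of bytes up to
 * and including the last break opportunity within the first
 * bytesThatFit bytes of s, or -1 if there is none. Break opportunities
 * are a space, tab, or newline, or an ASCII '-' that has a letter on
 * both sides. Requires bytesThatFit <= strlen(s), since the byte after
 * the hyphen is inspected.
 */
int hyphen_break_index(const char *s, int bytesThatFit)
{
    int count;

    for (count = bytesThatFit; count > 0; count--) {
        unsigned char c = (unsigned char) s[count - 1];

        if (c == ' ' || c == '\t' || c == '\n') {
            return count;
        }
        if (c == '-' && count >= 2
                && isalpha((unsigned char) s[count - 2])
                && isalpha((unsigned char) s[count])) {
            return count;   /* the hyphen stays on the first line */
        }
    }
    return -1;
}
```

With this rule, "well-known" can wrap after "well-", but the minus
sign in "x = -5" is left alone and the scan falls back to the
preceding space.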