Hello everyone,
we are using Scintilla indirectly via ScintillaNET, and it works very well.
But now we have noticed some curious behaviour. If we try to get the
word at a position that is enclosed in guillemets, we get the wrong result.
Example:
The cursor is at »Mons|ieur«; we then call the function and the result is »Monsieur«.
It returns the word plus the guillemets, but only the word was expected.
I also ran a test with Notepad++ and got the same result.
So we think it is a bug, because guillemets are not word characters.
Thanks, looking forward to hearing from you.
Lars
I forgot to say, the test scenario with Notepad++ was:
The cursor is again placed at »Mons|ieur«, then I open the Find dialog and
the input field is auto-filled with »Monsieur« instead of Monsieur.
Last edit: Lars Voigt 2016-05-26
If the document is in the Windows-1252 code page then you should set the word characters with APIs like http://www.scintilla.org/ScintillaDoc.html#SCI_SETWORDCHARS and http://www.scintilla.org/ScintillaDoc.html#SCI_SETPUNCTUATIONCHARS . If the document is in the UTF-8 encoding, you will have to write your own code as Scintilla does not handle multi-byte word characters.
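For the Windows-1252 case, the word character set might be built along these lines. This is only a sketch: the function name cp1252_word_chars is hypothetical, the choice of "letters, ASCII digits and underscore" as word characters is an assumption, and actually sending the result with SCI_SETWORDCHARS is left to the host application.

```python
def cp1252_word_chars() -> bytes:
    """Collect the Windows-1252 bytes to treat as word characters.

    Assumption: a word character is any letter, an ASCII digit, or '_'.
    The guillemets (0xAB and 0xBB) are punctuation, so they are left out.
    """
    chars = bytearray()
    for b in range(0x21, 0x100):
        try:
            ch = bytes([b]).decode("cp1252")
        except UnicodeDecodeError:
            # A few byte values are undefined in Python's cp1252 codec.
            continue
        if ch.isalpha() or ch in "0123456789_":
            chars.append(b)
    return bytes(chars)


# The resulting byte string is what the application would pass as the
# 'characters' argument of SCI_SETWORDCHARS.
WORD_CHARS = cp1252_word_chars()
```

With this set, a byte such as 0xE9 (é) counts as part of a word while 0xAB («) and 0xBB (») do not.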
This report does not reveal which functions are being called or the encoding of the file. No matter what Scintilla's API provides with word-oriented functions, you can always write your own code that deals with the document as a byte sequence and behaves in whatever way you want.
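As a rough illustration of the "write your own code" route for a UTF-8 document, the sketch below finds the word around a byte position. The names is_word_char and word_at are hypothetical, and the word definition (alphanumeric or underscore) is an assumption that an application would adjust to its own needs.

```python
def is_word_char(ch: str) -> bool:
    # Assumed word definition: letters, digits and underscore.
    return ch.isalnum() or ch == "_"


def word_at(data: bytes, byte_pos: int) -> str:
    """Return the word surrounding byte_pos in a UTF-8 encoded document."""
    text = data.decode("utf-8")
    # Convert the byte offset to a character index. If byte_pos falls inside
    # a multi-byte character, the incomplete tail is ignored.
    idx = len(data[:byte_pos].decode("utf-8", errors="ignore"))
    start = idx
    while start > 0 and is_word_char(text[start - 1]):
        start -= 1
    end = idx
    while end < len(text) and is_word_char(text[end]):
        end += 1
    return text[start:end]


# Cursor at »Mons|ieur«: '»' takes two bytes in UTF-8, so the offset is 6.
document = "»Monsieur«".encode("utf-8")
word = word_at(document, 6)
```

Here word holds the text between the guillemets, because '»' and '«' fail the is_word_char test and stop the scan in both directions.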
Scintilla now treats '»' and '«' as non-word characters in UTF-8 mode. In single-byte character sets, the application chooses their character class. The behaviour of Notepad++ depends on the Notepad++ implementation and may or may not be influenced by this change.