From: Kazutoshi S. <k_s...@f2...> - 2011-02-16 23:17:31
Matthieu Casanova wrote:
> 2011/2/16 Kazutoshi Satoda<k_s...@f2...>:
>> If you want to avoid possible surprise that "abc(" is not found in
>> "abc(x", there may be a possible solution: Putting "\b" only if the
>> start/end character is a word character, where ...
>>   boolean isWordCharacter(Char c)
>>   {
>>     return true if regex search for "\b" in ("w" + c) found a
>>     boundary at the next of the "w".
>>   }

Sorry, the logic of isWordCharacter above was inverted. The correct
version is:
  boolean isWordCharacter(char c)
  {
    return true if a regex search for "\b" in (" " + c) finds a
    boundary immediately after the " ".
    // " " can be replaced by any other character that is certainly
    // a non-word character.
  }

> I think I agree with the idea of removing any \b if the beginning or
> the end of the word is not a word character.
> For example if I search for abc( and have in the buffer abc((int)x) I
> don't want to get abc((

That's another good example. And since you agreed, my final proposal is
the conditional use of "\b", decided by the isWordCharacter() local
method above.

> To find a word character it is finally simple :
> In the Pattern documentation there is a character class for word characters :
>
> \w A word character: [a-zA-Z_0-9]
>
> So a word character is a letter or digit or _
> Do you agree with that ?

As I already noted in a previous post, "\w" doesn't work for non-ASCII
characters, while "\b" recognizes some non-ASCII word boundaries.

-- 
k_satoda
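
For reference, the conditional use of "\b" proposed above could be sketched in Java roughly as follows. This is only a minimal sketch assuming java.util.regex; the class name WordBoundarySketch and the helper wholeWordPattern() are hypothetical names for illustration, not taken from the actual code base. It asks the regex engine itself whether "\b" treats a character as a word character, instead of relying on "\w". How "\b" and "\w" classify non-ASCII characters may depend on the Java version and on flags such as Pattern.UNICODE_CHARACTER_CLASS, so the sketch makes no claim there.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; names are illustrative only.
class WordBoundarySketch {
    private static final Pattern BOUNDARY = Pattern.compile("\\b");

    // Returns true if "\b" reports a boundary between a known non-word
    // character (a space) and c, i.e. the engine treats c as a word character.
    // Only handles a single char; supplementary code points are out of scope.
    static boolean isWordCharacter(char c) {
        Matcher m = BOUNDARY.matcher(" " + c);
        return m.find() && m.start() == 1;
    }

    // Builds a "whole word" pattern that adds "\b" only on the sides
    // where the search string starts or ends with a word character.
    static Pattern wholeWordPattern(String word) {
        StringBuilder sb = new StringBuilder();
        if (!word.isEmpty() && isWordCharacter(word.charAt(0))) {
            sb.append("\\b");
        }
        sb.append(Pattern.quote(word)); // match the search string literally
        if (!word.isEmpty() && isWordCharacter(word.charAt(word.length() - 1))) {
            sb.append("\\b");
        }
        return Pattern.compile(sb.toString());
    }

    public static void main(String[] args) {
        Pattern p = wholeWordPattern("abc(");
        System.out.println(p.pattern());
        System.out.println(p.matcher("abc(x").find());
    }
}
```

With this, a search for "abc(" gets a leading "\b" but no trailing one, so it can still be found in "abc(x", while a search for "abc" keeps "\b" on both sides.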