This folder contains my latest regex code (as of May 2013) for Notepad++, which is not yet in the release version.
The SciLexer.dll can directly replace the one from the latest version of Notepad++, but not all features are accessible because the user interface has not been updated to support them.
It passes all automated tests that were written for the "new regex code" in the current release, and adds:
- correctly supports code points outside the BMP (search is done with 32-bit code points instead of UTF-16 code units);
- both search and replace strings can contain embedded null characters and/or escape sequences for null characters;
- lookbehinds are correctly handled in search and replace, even those overlapping with the end of the previous match;
- a new [[:inval:]] character class, to find invalid UTF-8 sequences;
- invalid UTF-8 sequences can be kept in replace (e.g. replacing "(.*)" by "ab\1cd" will keep invalid UTF-8 sequences).
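
The points above about code points outside the BMP and invalid UTF-8 sequences can be illustrated with a small Python sketch. This is only an illustration of the concepts, not the code in this folder: the helper name find_invalid_utf8 is hypothetical, and Python's "surrogateescape" error handler is used here as a stand-in for whatever mechanism the DLL uses to carry invalid bytes through a replace.

```python
import re

# Illustration only: this is not the Notepad++ implementation.
text = "a\U0001F600b"  # U+1F600 lies outside the BMP

# Code point view (what a 32-bit search operates on): 3 units.
code_points = [ord(ch) for ch in text]

# UTF-16 view: the same text needs 4 code units, because U+1F600
# is encoded as the surrogate pair D83D DE00. An engine working on
# UTF-16 units could match only half of that character.
utf16 = text.encode("utf-16-le")
code_units = [int.from_bytes(utf16[i:i + 2], "little")
              for i in range(0, len(utf16), 2)]


def find_invalid_utf8(data: bytes) -> list:
    """Return byte offsets where UTF-8 decoding fails (hypothetical helper)."""
    offsets = []
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode("utf-8")
            break  # the rest of the buffer is valid
        except UnicodeDecodeError as err:
            # err.start and err.end are relative to the slice data[pos:]
            offsets.append(pos + err.start)
            pos += err.end
    return offsets


# Keeping invalid bytes through a replace: decode them to placeholder
# code points, run the regex, then encode them back unchanged.
raw = b"x\xffy"  # 0xFF can never appear in valid UTF-8
as_text = raw.decode("utf-8", errors="surrogateescape")
replaced = re.sub(r"(.*)", r"ab\1cd", as_text, count=1).encode(
    "utf-8", errors="surrogateescape")
```

Here count=1 is needed only because Python's re.sub would otherwise also replace the empty match at the end of the string; that is a detail of the Python engine and says nothing about the behaviour of the code in this folder.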
The following new features are not accessible in the current Notepad++ user interface:
- a new SCFIND_REGEXP_LOCALEORDER option, to have character ranges in locale order instead of code point order ('à' is between 'a' and 'b' in locale order, at least in French, but comes after 'b' in code point order; with this option [a-b] also matches 'à' and other characters that would fall between 'a' and 'b' in a dictionary);
- the error message is now available when the regex is invalid (e.g. regex "(" will report "Unmatched marking parenthesis", while current Notepad++ only knows it is an "Invalid regular expression").
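
The difference between code point order and locale order for a range such as [a-b] can be sketched in Python as follows. This is a rough illustration under stated assumptions, not how SCFIND_REGEXP_LOCALEORDER is implemented: the function names are hypothetical, and the "collation key" is a deliberately simplified stand-in (base letter first, accents second) rather than a real locale collation.

```python
import re
import unicodedata


def simplified_collation_key(ch: str) -> tuple:
    """Hypothetical stand-in for a locale collation key.

    Sorts by the unaccented base letter first, then by the fully
    decomposed form, so that accented letters sort next to their base.
    """
    decomposed = unicodedata.normalize("NFD", ch)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    return (base, decomposed)


def in_range_code_point_order(ch: str, low: str, high: str) -> bool:
    """Range test by numeric code point, as a default [low-high] does."""
    return ord(low) <= ord(ch) <= ord(high)


def in_range_locale_order(ch: str, low: str, high: str) -> bool:
    """Range test by the simplified collation key defined above."""
    key = simplified_collation_key
    return key(low) <= key(ch) <= key(high)


# 'à' is U+00E0 (224), far above 'b' (98), so it is outside [a-b]
# in code point order but inside it in dictionary-like order.
code_point_result = in_range_code_point_order("à", "a", "b")
locale_result = in_range_locale_order("à", "a", "b")

# Python's own re module uses code point order for ranges.
python_match = re.fullmatch("[a-b]", "à")
```

With these definitions, code_point_result is False and locale_result is True, which mirrors the 'à' example given in the list above.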