I find Document::BraceMatch() can be optimized while try to fix https://github.com/zufuliu/notepad4/issues/911
NextPosition() (takes much time than style comparison) can be eliminated:position += direction; is enough.First approach is making faster version (only position += direction;) for UTF-8 and single byte code pages like Document::FindText().
I'm using following safe char approach for Notepad4, for UTF-8, it has no obvious performance lost.
safeChar = 0xff;
if (dbcsCodePage != 0 && dbcsCodePage != CpUtf8) {
safeChar = (direction >= 0) ? 0x80 : 0x30 /*get min trailByte - 1*/;
}
if (chAtPos <= safeChar) {
position += direction;
} else {
// NextPosition(position, direction)
}
This approach (which different safe char values) can be used to simplify following block in Document::FindText():
// current code
if (forward && UTF8IsAscii(leadByte)) {
pos++;
} else {
if (dbcsCodePage) {
if (!NextCharacter(pos, increment)) {
break;
}
} else {
pos += increment;
}
}
// safe char
if (leadByte <= safeChar) {
pos += increment;
} else if (!NextCharacter(pos, increment)) {
break;
}
First patch to delay style comparison:
Full change for my safe char approach:
https://github.com/zufuliu/notepad4/commit/7e449bcb1e5a02096ba38eb7f1246199de147beb
Diff:
I measured in the wrong place, results (from fast to slow) after moving
ElapsedPeriodcode intocase Message::BraceMatch:inEditor::WndProc()to measureDocument::BraceMatch()method call:1. only a single block (e.g. treat DBCS as ASCII safe) with
position += direction;.2. two blocks, first one is ASCII safe, move DBCS safe char approach into separate function and call it.
3. two blocks, first one is ASCII safe, DBCS safe char approach as later block.
4. single safe char approach.
Moving the style check (BraceMatch-1112.diff) is good. I have a 4.3 GB file used for testing containing repeated copies of
scite/src/SciTEBase.cxxand typing '{' at the start to produce a match failure with the '{' coloured red takes 22 seconds with current code as it scans the whole file. This change reduces the time to 17.4 seconds.Using
safeCharreduces this further to 10.8 seconds, so half the current code's time. Not checking the character and always doingposition += direction;improves the time to 8.6 seconds. It would be fine to use specialized loops, checking for DBCS and usingsafeCharin its loop but not for the more common encodings.However, I've recently been looking at the case-sensitive search code which does a similar scan and is much faster due to calling memchr which is optimized and checks 32 bytes at a time in its inner loop. Searching for each instance of an identifier that starts with an unpopular character takes 0.39 seconds which is 85% of the memory bandwidth on the test machine which has base-speed DDR4. An implementation that looked through the search string for the least common character (
SaveAsXML-> search forX) could use this to improve performance.A variant of
memchrthat looked for either of thechBraceandchSeekwould be great here as it would work correctly in UTF-8 and 8-bit encodings with the set of supported brace characters. However, for portability, I don't want to add SSE code like Microsoft'smemchrimplementation. This could be implemented by calling memchr for both characters and returning the minimum.Calling
mem2chron the whole document may run into bad worst cases where there aren't any more of one or not for a long distance. A limit could be placed on the distance checked for both then moved on if there isn't a match. Something like this which hasn't been tested much so may contain major bugs.Performance is sensitive to the
findEitherBlockvalue but was best at 1024 with 2.8 seconds. If this code handled all cases then I'd want it even though it is starting to get bulky but it doesn't handle backwards scanning which is more complex as Visual C++ doesn't implementmemrchr. DBCS isn't handled but that may be fixable with an inside-character check on the result. DBCS is also a small proportion of use so usingsafeCharfor DBCS and accepting much better optimization of the common encodings is fine.There are further potential optimizations: the 'losing' value from the
minis forgotten and will eventually be needed so could be remembered.Since waiting even a second for the brace to turn red would make brace matching more annoying than useful and 22 seconds is dreadful, there should be some limit to how far brace matching will scan.
Last edit: Neil Hodgson 2024-11-14
Even for DBCS,
(and)are not trail byte in all DBCS code pages,<and>are only trail byte in Johab.if (chBrace <= asciiBackwardSafeChar)then use specialized loops.I was thought adding a timeout (e.g. 200ms) instead of byte limit, after checked every 1MB bytes, if timeout and remaining text lager than a threshold, then returns
-positionto indicate timeout instead of miss match.Using
SplitFindEitherChar()is indeed faster, so I'm planning to add SSE version (faster thanmem2chr, not teststrpbrk) for Notepad4, something like following (lacks boundary check for now):Committed the style optimization as [7a8105]. A version of the safe char approach committed as [1ce781] with new
DBCSMinTrailBytemethod.Related
Commit: [1ce781]
Commit: [7a8105]
Shorter and around 6% faster version of
SplitFindEitherChar. Speed due to inlining 2 argument version ofstd::minwhere initializer list version is a call. This isn't being committed but may be worked on further.Just micro-optimization, I'm currently use something like following to avoid
NextPosition()for backward DBCS match when brace is found:NextCharacter()is used to avoid MSVC treatreturn -1;as hot path and reorder it before style comparison block, so this change may not worth it.Will try the short version
SplitFindEitherChar().Single
std::minseems is also working:std::string_view::find_first_of()likeSplitFindEitherChar(), it's faster than current code, slower and less magic thanmem2chr(), can be made to work for backward search withoutmemrchr().inlined
SplitFindEitherChar():Last edit: Zufu Liu 2024-11-24
Optimize forward brace match is worth for following satiations:
1. Open large JSON file and match top level
{or[at file begging.2. Open large XML file and match
<(<?xml) at file begging, lead to whole document scanning due to?>currently colored with different style.Patch optimized DBCS brace match:
maxSafeCharcompare to only when brace with same style is found, so should make UTF-8 and single byte code pages a lettle faster.MovePositionOutsideChar()to check whether current brace is trail byte, if it is then skip it. This is faster (especially for backward match) than usingNextPosition(), as there are only a few CJK characters using braces as trail byte, and in general programming text, when matching brace operators, it's unlikely preceding character is dual-byte. with [feature-requests:#1535] it's even faster, still needs to figure out why backwardNextPosition()is much slow.Here is my script to make large JSON test file:
Related
Feature Requests:
#1535A slightly different approach is to separate the search for the terminating match
chSeekfrom counting the depth-increasingchBraceas this allows use of the optimizedmemchrthroughSplitFindCharor reverse casestd::string_view::find_last_of. This reduces forward searches from 2.8 to 2.0 seconds for the unmatched '{' example above. The reverse search in the same file takes approximately 4 seconds but this varies depending on some implementation details I'm unclear on (from 3.4 to 4.4 seconds).First define a reverse version of
SplitFindChar. This takes anendargument instead of alengthas there were negative-length termination condition issues at one time but that may be worked on more as the functions should match.The main loop was unfused into forwards and backwards variants as they use different calls.
This is just the implementation when there is no potential for character fragment confusion: for DBCS seeking non-safe braces, a variant of the current code would be used.
Since this is quite long, a lambda was tried for the counting part. This penalized the forwards case 5% but it improved the reverse case 20%. Its better than inlining the same code which I don't understand unless there is a mistake.
Find next chSeekblock may entered into infinite loop when the foundchSeekhas different style. Alos it may be worth to cacheGetEndStyled()andLengthNoExcept()on stack.main loop with DBCS and infinite loop fixed:
That variant has trouble with
{{}}. It's easy to have off-by-one problems with empty constructs. My test file is:This went back to working with the setup of position returning to
for the main part and the
position +=adjustment removed from reverse. The initially-adjusted case is moved into the DBCS hard case.It still has off-by-one. following is my change fixed off-by-one:
test case for
BraceMatchNext()is not good, probably can be removed.Committed test with [4b011b].
Related
Commit: [4b011b]