Scintilla / Feature Requests / #1238 Improve performance of PositionCache::MeasureWidths()

Zufu Liu - 2018-10-26

with

if (size_ != 0) { size_ = NextPowerOfTwo(size_); }

changed to

if (size_ & (size_ - 1)) { size_ = NextPowerOfTwo(size_); }

maybe if ((probe != 0) && (len < 30)) { can be changed to if ((len < 30)) {.

pc-1.diff
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Neil Hodgson - 2018-11-18

Using a power of two for a hash table can actually lead to worse distribution with a simple hash function. Some authors recommend always using a prime size.

I would want to see benchmark evidence before changing anything here and its unlikely that performing a modulo will be particularly noticeable compared with other aspects such as the loop in PositionCacheEntry::Hash.

Its possible that there are big performance wins in PositionCache but its likely the biggest win would be to simply use a larger default size since memory is much more available than it was 10 years ago when this code was written. If someone wants to work on PositionCache performance then the first stage should be to build a test program (possibly with a mocked but expensive Surface::MeasureWidths) that just exercises this feature. Run a profiler to get a feel for where the time goes then look at the times with different cache sizes on some representative files.

The check for pces.empty() was likely to allow turning off the position cache with SCI_SETPOSITIONCACHE(0) although I don't know if that path actually works.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Zufu Liu - 2018-11-19

Screenshots using VS2017 15.9.1 CPU Usage Profiler (SC_TECHNOLOGY_DIRECTWRITE on Win10 i5-4570), with text from https://github.com/zufuliu/notepad2/issues/79.

and.png

mod.png

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Neil Hodgson - 2018-11-20

That's a very unusual file with 4.5 million NUL bytes followed by a small amount of readable text. Its not a case that represents common usage so does not deserve much special treatment.

If fast performance with files like this is an important use of your application, then the line layout cache level can be increased with SCI_SETLAYOUTCACHE(SC_CACHE_PAGE). This could either be unconditional or be triggered by very wide lines. Due to changes to allow lines to overlap, SC_CACHE_CARET is no longer as effective as it was.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Zufu Liu - 2019-04-15

Find a redundent loop in EditView::LayoutLine() (line 428-431):

for (Sci::Position styleInLine = 0; styleInLine < numCharsInLine; styleInLine++) { const unsigned char styleByte = ll->styles[styleInLine]; ll->styles[styleInLine] = styleByte; }
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
- Neil Hodgson - 2019-04-25
  
  Removed with [40f158].
  
  Related
  
  Commit: [40f158]
  
  If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Zufu Liu - 2019-04-16

Improve readability for EditView::LayoutLine() .

LayoutLine.diff

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
- Neil Hodgson - 2019-04-17
  
  I think a better simplification of the case forcing code would be to unify the two portions (checking and assignment) into a separate function that returned the case forced value then call that in both situations.
  
  If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
- Neil Hodgson - 2019-04-25
  
  That diff exits the loop early when a mismatch found so style_byte isn't the last value when compared just after the loop.
  
  The current code checks for membership in the set of current word characters for camel case forcing but this appears excessive for the application: displaying ASCII-letter-only keywords in a preferred way. Patch 7474 simplifies this by replacing Document::IsASCIIWordByte with new IsUpperOrLowerCase function. Document::IsASCIIWordByte is only used for this feature so is removed.
  
  Patch 7475 hoists the case forcing code into its own function, simplifying the two loops.
  
  Further simplifications and optimizations could be performed but its unlikely that much additional complexity would be justified by performance increases. Most changes change the line length so don't even go into this code.
  
  scintilla_7474.patch
  
  scintilla_7475.patch
  
  If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
  - Neil Hodgson - 2019-04-27
    
    Committed these with minor changes and documentation as [cdb7af] and [93d24d].
    
    Related
    
    Commit: [93d24d]
    Commit: [cdb7af]
    
    If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Zufu Liu - 2019-04-26

If the loop exits early, allSame will be false, so the result LineLayout::llInvalid not changed.

Other idea (since force casing is rarely used):

if (vstyle.someStylesForceCase) { ... } else { Sci::Position charInDoc = posLineStart; while (numCharsInLine < lineLength) { styleByte = model.pdoc->StyleIndexAt(charInDoc); if ((ll->styles[numCharsInLine] != styleByte) || (model.pdoc->CharAt(charInDoc) != ll->chars[numCharsInLine])) { allSame = false; break; } ++numCharsInLine; ++charInDoc; } }
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
- Zufu Liu - 2019-04-26
  
  Full code is at https://github.com/zufuliu/notepad2/blob/master/scintilla/src/EditView.cxx#L401
  
  Last edit: Zufu Liu 2019-04-26
  
  1238-case.cxx
  
  If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
- Neil Hodgson - 2019-04-27
  
  Its difficult to find a case where the checking code shows up in a profile. Best example so far is to open a 1 Megabyte C++ file, set layout caching to SC_CACHE_DOCUMENT, turn on wrap (to style/relayout whole file after each change), set identifiers to case:upper, then replace all '=' with '!' to affect many lines without changing their length, then undo and redo this until there is a minute of profile data.
  
  The result is the allSame loop taking up 0.3% of the whole profile and 5% of EditView::LayoutLine. The profile:
  
  Last edit: Neil Hodgson 2019-04-27
  
  If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Zufu Liu - 2019-04-28

It maybe has no difference. I also found the difference (early exiting) seems only happens on current edited line and depends on the edit position.
with SC_CACHE_PAGE, I only got less than 4% of whole lineLength == ll->numCharsInLineif block where allSame=false.

Not related to this issue, in ScintillaBase.h, only the first method is implemented:

#ifdef SCI_LEXER LexState *DocumentLexState(); void SetLexer(uptr_t wParam); void SetLexerLanguage(const char *languageName); void Colourise(int start, int end); #endif
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
- Neil Hodgson - 2019-04-28
  
  LayoutLine is a very central method and does have big performance effects in some modes but the allSame loop doesn't appear to be a significant contributor to performance so should be written as clearly as possible. At a high level, its just checking whether the characters and styles have changed since the positions were calculated. Case forcing makes it more complicated.
  
  Checking the characters isn't that dissimilar to
  
  bool allSame = std::equal(ll->chars.get(), &ll->chars[lineLength], model.pdoc->RangePointer(posLineStart, lineLength));
  
  which works but doesn't account for case forcing. It may also move the gap which isn't so nice for a read-only operation although an iterator could be built to make this more natural. Due to the need to access the corresponding style byte for case forcing and the previous character for camel case, it doesn't even fit with a lambda:
  
  bool allSame = std::equal(ll->chars.get(), &ll->chars[lineLength], model.pdoc->RangePointer(posLineStart, lineLength), [] (char a, char b) { return a == CaseForce(?, b, ?); } );
  
  An actual loop appears to be the clearest option. Checking allSame at loop head doesn't complicate the code that much and will likely be a tiny win.
  
  At more than 200 lines, it may be worthwhile moving some of the LayoutLine code out into separate methods and checking whether the characters and styles are different would be a reasonable candidate. The wrapping code (or the core wrapping loop without the preparation) might also be worth its own method particularly if it becomes more complex.
  
  in ScintillaBase.h, only the first method is implemented
  
  These were moved to LexInterface / LexState at some point. This code will be modified much more as part of isolating lexing functionality into an optionally separate Lexilla library. I don't want to make other changes wile working on this isolation as it moves the code away from when the isolation work branched.
  
  If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Neil Hodgson - 2019-05-24

labels: --> scintilla, layout

status: open --> closed
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Neil Hodgson - 2019-05-24

Closing this as the power-of-two change that initiated the issue doesn't appear worth pursuing.

There were some other concerns bought up but they don't seem sufficiently important to keep this issue open.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Improve performance of PositionCache::MeasureWidths()

Group

Searches

Help

#1238 Improve performance of PositionCache::MeasureWidths()

Discussion

Related

Related