Menu

#1238 Improve performance of PositionCache::MeasureWidths()

Completed
closed
5
2019-05-24
2018-10-26
Zufu Liu
No

Improve performance of PositionCache::MeasureWidths() (for large file)
by ensure the size of PositionCache is always power of two, the hash then can be masked instead of do a modular.

1 Attachments

Discussion

  • Zufu Liu

    Zufu Liu - 2018-10-26

    with

            if (size_ != 0) {
                size_ = NextPowerOfTwo(size_);
            }
    

    changed to

            if (size_ & (size_ - 1)) {
                size_ = NextPowerOfTwo(size_);
            }
    

    maybe if ((probe != 0) && (len < 30)) { can be changed to if ((len < 30)) {.

     
  • Neil Hodgson

    Neil Hodgson - 2018-11-18

    Using a power of two for a hash table can actually lead to worse distribution with a simple hash function. Some authors recommend always using a prime size.

    I would want to see benchmark evidence before changing anything here and its unlikely that performing a modulo will be particularly noticeable compared with other aspects such as the loop in PositionCacheEntry::Hash.

    Its possible that there are big performance wins in PositionCache but its likely the biggest win would be to simply use a larger default size since memory is much more available than it was 10 years ago when this code was written. If someone wants to work on PositionCache performance then the first stage should be to build a test program (possibly with a mocked but expensive Surface::MeasureWidths) that just exercises this feature. Run a profiler to get a feel for where the time goes then look at the times with different cache sizes on some representative files.

    The check for pces.empty() was likely to allow turning off the position cache with SCI_SETPOSITIONCACHE(0) although I don't know if that path actually works.

     
  • Zufu Liu

    Zufu Liu - 2018-11-19

    Screenshots using VS2017 15.9.1 CPU Usage Profiler (SC_TECHNOLOGY_DIRECTWRITE on Win10 i5-4570), with text from https://github.com/zufuliu/notepad2/issues/79.

     
  • Neil Hodgson

    Neil Hodgson - 2018-11-20

    That's a very unusual file with 4.5 million NUL bytes followed by a small amount of readable text. Its not a case that represents common usage so does not deserve much special treatment.

    If fast performance with files like this is an important use of your application, then the line layout cache level can be increased with SCI_SETLAYOUTCACHE(SC_CACHE_PAGE). This could either be unconditional or be triggered by very wide lines. Due to changes to allow lines to overlap, SC_CACHE_CARET is no longer as effective as it was.

     
  • Zufu Liu

    Zufu Liu - 2019-04-15

    Find a redundent loop in EditView::LayoutLine() (line 428-431):

    for (Sci::Position styleInLine = 0; styleInLine < numCharsInLine; styleInLine++) {
        const unsigned char styleByte = ll->styles[styleInLine];
        ll->styles[styleInLine] = styleByte;
    }
    
     
    • Neil Hodgson

      Neil Hodgson - 2019-04-25

      Removed with [40f158].

       

      Related

      Commit: [40f158]

  • Zufu Liu

    Zufu Liu - 2019-04-16

    Improve readability for EditView::LayoutLine() .

     
    • Neil Hodgson

      Neil Hodgson - 2019-04-17

      I think a better simplification of the case forcing code would be to unify the two portions (checking and assignment) into a separate function that returned the case forced value then call that in both situations.

       
    • Neil Hodgson

      Neil Hodgson - 2019-04-25

      That diff exits the loop early when a mismatch found so style_byte isn't the last value when compared just after the loop.

      The current code checks for membership in the set of current word characters for camel case forcing but this appears excessive for the application: displaying ASCII-letter-only keywords in a preferred way. Patch 7474 simplifies this by replacing Document::IsASCIIWordByte with new IsUpperOrLowerCase function. Document::IsASCIIWordByte is only used for this feature so is removed.

      Patch 7475 hoists the case forcing code into its own function, simplifying the two loops.

      Further simplifications and optimizations could be performed but its unlikely that much additional complexity would be justified by performance increases. Most changes change the line length so don't even go into this code.

       
      • Neil Hodgson

        Neil Hodgson - 2019-04-27

        Committed these with minor changes and documentation as [cdb7af] and [93d24d].

         

        Related

        Commit: [93d24d]
        Commit: [cdb7af]

  • Zufu Liu

    Zufu Liu - 2019-04-26

    If the loop exits early, allSame will be false, so the result LineLayout::llInvalid not changed.

    Other idea (since force casing is rarely used):

    if (vstyle.someStylesForceCase) {
        ...
    } else {
        Sci::Position charInDoc = posLineStart;
        while (numCharsInLine < lineLength) {
            styleByte = model.pdoc->StyleIndexAt(charInDoc);
            if ((ll->styles[numCharsInLine] != styleByte) || (model.pdoc->CharAt(charInDoc) != ll->chars[numCharsInLine])) {
                allSame = false;
                break;
            }
            ++numCharsInLine;
            ++charInDoc;
        }
    }
    
     
    • Zufu Liu

      Zufu Liu - 2019-04-26
       

      Last edit: Zufu Liu 2019-04-26
    • Neil Hodgson

      Neil Hodgson - 2019-04-27

      Its difficult to find a case where the checking code shows up in a profile. Best example so far is to open a 1 Megabyte C++ file, set layout caching to SC_CACHE_DOCUMENT, turn on wrap (to style/relayout whole file after each change), set identifiers to case:upper, then replace all '=' with '!' to affect many lines without changing their length, then undo and redo this until there is a minute of profile data.

      The result is the allSame loop taking up 0.3% of the whole profile and 5% of EditView::LayoutLine. The profile: image of the profile

       

      Last edit: Neil Hodgson 2019-04-27
  • Zufu Liu

    Zufu Liu - 2019-04-28

    It maybe has no difference. I also found the difference (early exiting) seems only happens on current edited line and depends on the edit position.
    with SC_CACHE_PAGE, I only got less than 4% of whole lineLength == ll->numCharsInLineif block where allSame=false.

    Not related to this issue, in ScintillaBase.h, only the first method is implemented:

    #ifdef SCI_LEXER
        LexState *DocumentLexState();
        void SetLexer(uptr_t wParam);
        void SetLexerLanguage(const char *languageName);
        void Colourise(int start, int end);
    #endif
    
     
    • Neil Hodgson

      Neil Hodgson - 2019-04-28

      LayoutLine is a very central method and does have big performance effects in some modes but the allSame loop doesn't appear to be a significant contributor to performance so should be written as clearly as possible. At a high level, its just checking whether the characters and styles have changed since the positions were calculated. Case forcing makes it more complicated.

      Checking the characters isn't that dissimilar to

      bool allSame = std::equal(ll->chars.get(),
          &ll->chars[lineLength],
          model.pdoc->RangePointer(posLineStart, lineLength));
      

      which works but doesn't account for case forcing. It may also move the gap which isn't so nice for a read-only operation although an iterator could be built to make this more natural. Due to the need to access the corresponding style byte for case forcing and the previous character for camel case, it doesn't even fit with a lambda:

      bool allSame = std::equal(ll->chars.get(),
          &ll->chars[lineLength],
          model.pdoc->RangePointer(posLineStart, lineLength),
          [] (char a, char b) {
              return a == CaseForce(?, b, ?);
          }
      );
      

      An actual loop appears to be the clearest option. Checking allSame at loop head doesn't complicate the code that much and will likely be a tiny win.

      At more than 200 lines, it may be worthwhile moving some of the LayoutLine code out into separate methods and checking whether the characters and styles are different would be a reasonable candidate. The wrapping code (or the core wrapping loop without the preparation) might also be worth its own method particularly if it becomes more complex.

      in ScintillaBase.h, only the first method is implemented

      These were moved to LexInterface / LexState at some point. This code will be modified much more as part of isolating lexing functionality into an optionally separate Lexilla library. I don't want to make other changes wile working on this isolation as it moves the code away from when the isolation work branched.

       
  • Neil Hodgson

    Neil Hodgson - 2019-05-24
    • labels: --> scintilla, layout
    • status: open --> closed
     
  • Neil Hodgson

    Neil Hodgson - 2019-05-24

    Closing this as the power-of-two change that initiated the issue doesn't appear worth pursuing.

    There were some other concerns bought up but they don't seem sufficiently important to keep this issue open.

     

Log in to post a comment.