Menu

#1582 Optimize `Document::SetStyles()` and `Document::SetStyleFor()`

Committed
open
5
2026-03-31
2026-03-09
Zufu Liu
No

Using idea of SplitView, Document::SetStyles() and Document::SetStyleFor() can be optimized to update style on each segment, this eliminated four boundary check (style.ValueAt() + style.SetValueAt()) for each byte. when precise range for ModificationFlags::ChangeStyle notification is not needed (like Document::SetStyleFor()), memcmp, memcpy, and memset can be used to set new styles, attached patch does this (use single method for CellBuffer to reduce code size, Document::SetStyleFor() and Document::SetStyles() can be changed to call a new method to reduce duplication).

1 Attachments

Discussion

1 2 > >> (Page 1 of 2)
  • Neil Hodgson

    Neil Hodgson - 2026-03-09

    This appears to produce a larger range in the ChangeStyle notification that could require more work for the application to process.

     
    • Zufu Liu

      Zufu Liu - 2026-03-09

      It's same as Document::SetStyleFor(), which report entail range as changed.

       
      • Neil Hodgson

        Neil Hodgson - 2026-03-13

        SetStyles is the main method used by lexers so is the dominant cause of ChangeStyle notifications.

        It may be useful to report the truly changed range for SetStyleFor.

        Better performance might be possible by pushing more of SetStylesdeeper into CellBuffer or SplitVector but this should only be done if there is a significant benefit.

         
        • Zufu Liu

          Zufu Liu - 2026-03-13

          Styling time is reduced by 1/3 (measured with Notepad4) with the patch, not measured byte comparison/updating/precise range version.

           

          Last edit: Zufu Liu 2026-03-16
  • Zufu Liu

    Zufu Liu - 2026-03-16

    The result for https://dumps.wikimedia.org/enwiki/20260301/enwiki-20260301-pages-articles-multistream11.xml-p6899367p7054859.bz2 (unpack, then add empty line at begging to avoid brace match) on my system:

    SciTE Old SciTE this patch Notepad4
    2940 2720 1850

    Styling performance for SciTE is measured with following changes:

    SciTEGlobal.properties
    file.size.large=2147483647
    file.size.no.styles=2147483647
    
    diff -r 1bab1e108d98 src/SciTEIO.cxx
    --- a/src/SciTEIO.cxx
    +++ b/src/SciTEIO.cxx
    @@ -56,6 +56,9 @@
     #include "Searcher.h"
     #include "SciTEBase.h"
    
    +#include <chrono>
    +#include "ElapsedPeriod.h"
    +
     #if defined(GTK)
     const GUI::gui_char propUserFileName[] = GUI_TEXT(".SciTEUser.properties");
     #elif defined(__APPLE__)
    @@ -443,6 +446,12 @@
    
        CurrentBuffer()->CompleteLoading();
    
    
    +   wEditor.SetModEventMask(SA::ModificationFlags::InsertText | SA::ModificationFlags::DeleteText);
    +   Scintilla::Internal::ElapsedPeriod period;
    +   wEditor.Colourise(0, -1);
    +   const double duration = period.Duration()*1e3;
    +   printf("Colourise duration=%.6f\n", duration);
    +
        Redraw();
     }
    
    diff -r 1bab1e108d98 win32/SciTEWin.cxx
    --- a/win32/SciTEWin.cxx
    +++ b/win32/SciTEWin.cxx
    @@ -2371,6 +2371,11 @@
     #endif
    
     int WINAPI WinMain(HINSTANCE hInstance, HINSTANCE, LPSTR, int) {
    
    +   if (AttachConsole(ATTACH_PARENT_PROCESS)) {
    +       freopen("CONOUT$", "w", stdout);
    +       freopen("CONOUT$", "w", stderr);
    +       fprintf(stdout, "\n%s:%d %s\n", __FILE__, __LINE__, __FUNCTION__);
    +   }
    
        RestrictDLLPath();
    

    Following is changes for Notepad4 (tested x64 build):

    diff --git a/scintilla/src/Document.cxx b/scintilla/src/Document.cxx
    index 18fe3da7..1ea24ff7 100644
    --- a/scintilla/src/Document.cxx
    +++ b/scintilla/src/Document.cxx
    @@ -90,7 +90,7 @@ void LexInterface::Colourise(Sci::Position start, Sci::Position end) {
                instance->Lex(start, len, styleStart, pdoc);
                instance->Fold(start, len, styleStart, pdoc);
                if (enableUrlHighlight) {
    
    -               pdoc->HighlightUrl(start, len, urlIgnoreStyle);
    +               // pdoc->HighlightUrl(start, len, urlIgnoreStyle);
                }
            }
    
    diff --git a/src/Notepad4.cpp b/src/Notepad4.cpp
    index ccf551ae..9266e765 100644
    --- a/src/Notepad4.cpp
    +++ b/src/Notepad4.cpp
    @@ -464,7 +464,7 @@ BOOL WINAPI ConsoleHandlerRoutine(DWORD dwCtrlType) noexcept {
     int WINAPI wWinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, LPWSTR lpCmdLine, int nShowCmd) {
        UNREFERENCED_PARAMETER(hPrevInstance);
        UNREFERENCED_PARAMETER(lpCmdLine);
    -#if 0 // used for Clang UBSan or printing debug message on console.
    +#if 1 // used for Clang UBSan or printing debug message on console.
        if (AttachConsole(ATTACH_PARENT_PROCESS)) {
            SetConsoleCtrlHandler(ConsoleHandlerRoutine, TRUE);
            freopen("CONOUT$", "w", stdout);
    @@ -1714,9 +1714,9 @@ void EditCreate(HWND hwndParent) noexcept {
        SciCall_SetAdditionalCaretsBlink(true);
        SciCall_SetAdditionalCaretsVisible(true);
        // style both before and after the visible text in the background
    
    -   SciCall_SetIdleStyling(SC_IDLESTYLING_ALL);
    +   // SciCall_SetIdleStyling(SC_IDLESTYLING_ALL);
        // profile lexer performance
    -   //SciCall_SetIdleStyling(SC_IDLESTYLING_NONE);
    +   SciCall_SetIdleStyling(SC_IDLESTYLING_NONE);
    
        SciCall_AssignCmdKey((SCK_NEXT + (SCMOD_CTRL << 16)), SCI_PARADOWN);
        SciCall_AssignCmdKey((SCK_PRIOR + (SCMOD_CTRL << 16)), SCI_PARAUP);
    diff --git a/src/Styles.cpp b/src/Styles.cpp
    index c78af899..3d67dfd4 100644
    --- a/src/Styles.cpp
    +++ b/src/Styles.cpp
    @@ -1838,7 +1838,7 @@ void Style_SetLexer(PEDITLEXER pLexNew, BOOL bLexerChanged) noexcept {
            // SC_CACHE_PAGE depends on line height (i.e. styles in current lexer) and edit window height.
            SciCall_SetLayoutCache(SC_CACHE_PAGE);
    
    -#if 0
    +#if 1
            // profile lexer performance
            StopWatch watch;
            watch.Start();
    

    Not figured out why SciTE is about 1 second slow than Notepad4 (LexHTML in Notepad4 does not handles PHP, Python and other templates).

     
  • Neil Hodgson

    Neil Hodgson - 2026-03-19

    Providing a more precise change range can enable large optimizations when the modification drives an expensive operation like linting, spell checking or remote view updating.

    Only updating the necessary region has been an ongoing direction for other features like the ReplaceType::minimal option to ReplaceTarget.

    Something like the following could be a basis for a faster implementation. It would need to be expanded to handle the buffer split instead of compacting the buffer. It uses two simple functions firstDifference and lastDifference that takes 2 equally long blocks and returns a pointer to the first or last char in the first buffer that is different to the corresponding char in the second buffer.

    bool CellBuffer::SetStyles(Sci::Position position, const char *styles, Sci::Position length,
        Sci::Position &startMod, Sci::Position &endMod) {
        char *buf = style.BufferPointer();  // May move the gap so not wanted in final version
        char *start = buf + position;
        char *first = firstDifference(start, styles, length);
        if (!first) {
            // No difference
            return false;
        }
        char *last = lastDifference(start, styles, length);
        startMod = first - buf;
        endMod = last - buf;
        memcpy(first, styles + (first - start), last - first + 1);
        return true;
    }
    
     

    Last edit: Neil Hodgson 2026-03-19
  • Zufu Liu

    Zufu Liu - 2026-03-19

    A different implementation for precise range changes:

    bool SetStyleRange(char *data, char styleValue, Sci::Position length, Sci::Position &startMod, Sci::Position &endMod) noexcept {
        Sci::Position index = 0;
        while (index < length && data[index] == styleValue) {
            ++index;
        }
        --length;
        while (length > index && data[length] == styleValue) {
            --length;
        }
        ++length;
        if (index < length) {
            memset(data + index, static_cast<unsigned char>(styleValue), length - index);
            startMod = index;
            endMod = length;
            return true;
        }
        return false;
    }
    
    bool SetStylesRange(char *data, const char *styles, Sci::Position length, Sci::Position &startMod, Sci::Position &endMod) noexcept {
        Sci::Position index = 0;
        while (index < length && data[index] == styles[index]) {
            ++index;
        }
        --length;
        while (length > index && data[length] == styles[length]) {
            --length;
        }
        ++length;
        if (index < length) {
            memcpy(data + index, styles + index, length - index);
            startMod = index;
            endMod = length;
            return true;
        }
        return false;
    }
    
     
    • Zufu Liu

      Zufu Liu - 2026-03-19

      It can be changed to return a pair/range to simplify function signature, if the range is empty (end position is zero), then no style changes.

       
      • Zufu Liu

        Zufu Liu - 2026-03-19

        Return style changed range version of above changes.

         
        • Neil Hodgson

          Neil Hodgson - 2026-03-21

          The true path in the templates is always taken before the false and the range always starts out empty so the segment1 template argument and hence the template isn't needed. Maybe add a range extending method to StyleChangeRange as this is common to the functions.

          A similar reduction in modification range for unchanged values is currently implemented in Document::TrimReplacement so it may be useful in other contexts to include this in SplitVector. There is currently a SplitVector::GetRange to retrieve a range of values so a SplitVector::SetRange may be useful here and also in other situations. Then CellBuffer::SetStyles is mostly a combination of the two.

           
          • Zufu Liu

            Zufu Liu - 2026-03-21

            segment1 template argument is attempt to make compiler inline the function (as it only instantiated once), without it msvc will generate a function call for second segment (though removed range.Empty() check for first segment).

             
            • Neil Hodgson

              Neil Hodgson - 2026-03-21

              Inlining was one of the reasons for separating the range trimming from the memory copying. Functions like firstDifference are simple and likely to inline.

                  for (size_t i = 0; i < length; i++) {
                      if (a[i] != b[i]) {
                          return a + i;
                      }
                  }
              

              It can be widened to consider 4 or 8 bytes at a time if that performs better. Couldn't convince Visual C++ to auto-vectorize though.

              Many user actions are small and localized: typing a character in a comment or adding to the end of an identifier often only changes a single style byte.

               
              • Neil Hodgson

                Neil Hodgson - 2026-03-22

                std::mismatch is the standard library version of a check for first difference. Its likely less well optimized than a loop although it does take an std::execution so could potentially be parallelized or vectorized.

                 
                • Zufu Liu

                  Zufu Liu - 2026-03-22

                  parallelized (also find_first_not_of and find_last_not_of) looks overkill for a 4KB range.

                   
            • Neil Hodgson

              Neil Hodgson - 2026-03-23

              Attached my current version. The generated code seems good.

              struct ChangedRange {
                  Sci::Position start = -1;
                  Sci::Position end = -1;
                  ChangedRange() noexcept = default;
                  ChangedRange(Sci::Position start_, Sci::Position end_) noexcept : start(start_), end(end_) {}
                  void Offset(Sci::Position offset) noexcept {
                      if (start >= 0) {
                          start += offset;
                          end += offset;
                      }
                  }
                  bool Empty() const noexcept {
                      return start < 0;
                  }
                  void Merge(const ChangedRange &cr2) noexcept {
                      if (cr2.start >= 0) {
                          if (start < 0) {
                              *this = cr2;
                          } else {
                              end = cr2.end;
                          }
                      }
                  }
              };
              
              ChangedRange CopyBytes(char *p, const char *styles, Sci::Position length) noexcept {
                  for (Sci::Position start = 0; start < length; start++) {
                      if (p[start] != styles[start]) {
                          for (Sci::Position end = length - 1; end >= 0; end--) {
                              if (p[end] != styles[end]) {
                                  memcpy(p+start, styles+start, end-start+1);
                                  return { start, end };
                              }
                          }
                      }
                  }
                  return {};
              }
              
              ChangedRange CellBuffer::SetStyles(Sci::Position position, const char *styles, Sci::Position length) {
                  if (!hasStyles || (position < 0) || (length <= 0) || ((position + length) > style.Length())) {
                      return {};
                  }
              
                  // Divide length into two parts if it overlaps the gap
                  const Sci::Position part1Length = style.GapPosition();
                  const Sci::Position length1 = (position < part1Length) ?
                      std::min(length, part1Length - position) : 0;
                  const Sci::Position length2 = length - length1;
              
                  ChangedRange cr = CopyBytes(&style[position], styles, length1 ? length1 : length2);
                  cr.Offset(position);
                  if (length1 && length2) {
                      ChangedRange cr2 = CopyBytes(&style[position + length1], styles + length1, length2);
                      cr2.Offset(position + length1);
                      cr.Merge(cr2);
                  }
                  return cr;
              }
              
               
              • Zufu Liu

                Zufu Liu - 2026-03-23

                (position < 0) || (length <= 0) || ((position + length) > style.Length())

                assuming initial (after call Document::StartStyling()) endStyled is within document position range, this changed previously truncation behavior, though updated endStyled may larger then document length.

                Setting styles for positions outside the range of the buffer is safe and has no effect.

                 
                • Zufu Liu

                  Zufu Liu - 2026-03-24

                  truncation can be used to simplify lexer:

                  1. StyleContext could use styler.ColourTo(currentPos - 1, state); instead of styler.ColourTo(currentPos - ((currentPos > lengthDocument) ? 2 : 1), state);.
                  2. trailing word classify block (e.g. in LexHTML) could be omitted by increase end position/length like StyleContext does. for LexHTML it's only needed for PHP where ?> can be omitted and is preferred to avoid extra content.
                   
                • Neil Hodgson

                  Neil Hodgson - 2026-03-24

                  OK. Truncating length in SetStyles and SetStyleFor seems reasonable. Moving this and subdividing over the gap into a common SplitUpdate. When length1 is 0, SplitUpdate could move length2 to length1 similar to the length1 ? length1 : length2 logic. There's also similar splitting logic in GetRange.

                  ChangedRange CopyBytes(char *p, const char *styles, Sci::Position length, Sci::Position offset) noexcept {
                      for (Sci::Position start = 0; start < length; start++) {
                          if (p[start] != styles[start]) {
                              for (Sci::Position end = length - 1; end >= 0; end--) {
                                  if (p[end] != styles[end]) {
                                      memcpy(p+start, styles+start, end-start+1);
                                      return { start+offset, end+offset };
                                  }
                              }
                          }
                      }
                      return {};
                  }
                  
                  struct Lengths {
                      Sci::Position length1;
                      Sci::Position length2;
                  };
                  
                  Lengths SplitUpdate(const SplitVector<char> &style, Sci::Position position, Sci::Position length) noexcept {
                      length = std::min(length, style.Length() - position);
                  
                      // Divide length into two parts if it overlaps the gap
                      const Sci::Position part1Length = style.GapPosition();
                      const Sci::Position length1 = (position < part1Length) ?
                          std::min(length, part1Length - position) : 0;
                      return { length1, length - length1 };
                  }
                  
                  ChangedRange CellBuffer::SetStyles(Sci::Position position, const char *styles, Sci::Position length) {
                      if (!hasStyles || (position < 0) || (length <= 0)) {
                          return {};
                      }
                  
                      const auto [length1, length2] = SplitUpdate(style, position, length);
                      ChangedRange cr = CopyBytes(&style[position], styles, length1 ? length1 : length2, position);
                      if (length1 && length2) {
                          const ChangedRange cr2 = CopyBytes(&style[position + length1], styles + length1, length2, position + length1);
                          cr.Merge(cr2);
                      }
                      return cr;
                  }
                  

                  Edit: added CopyBytes since it added offset argument.
                  Edit 2: adding the offset argument to CopyBytes stops inlining so dropping that. The single range code runs for initial styling and the two range code runs for small edits since styling restarts at the beginning of a line and the editing mostly occurs after the line start, moving the gap to the caret.

                   

                  Last edit: Neil Hodgson 2026-03-24
  • Zufu Liu

    Zufu Liu - 2026-03-23

    Source code with your CopyBytes is shorter (msvc cl /utf-8 /W4 /c /EHsc /std:c++20 /O2 /GS- /GR- /FAcs /DNDEBUG /I../include CellBuffer.cxx not inline CopyBytes), but msvc generated asm with my SetStylesRange is shorter.

    changes compared to SetStyles0319Range.diff.

    -   ++length;
    -   if (index < length) {
    +   if (index <= length) {
    +       ++length;
    
    template <bool segment1>
    void SetStyleRange(char *data, char styleValue, Sci::Position length, StyleChangeRange &range, Sci::Position position) noexcept {
        Sci::Position index = 0;
        while (index < length && data[index] == styleValue) {
            ++index;
        }
        --length;
        while (length > index && data[length] == styleValue) {
            --length;
        }
        if (index <= length) {
            ++length;
            memset(data + index, static_cast<unsigned char>(styleValue), length - index);
            if (segment1 || range.Empty()) {
                range.startPos = index + position;
            }
            range.endPos = length + position;
        }
    }
    
    template <bool segment1>
    void SetStylesRange(char *data, const char *styles, Sci::Position length, StyleChangeRange &range, Sci::Position position) noexcept {
        Sci::Position index = 0;
        while (index < length && data[index] == styles[index]) {
            ++index;
        }
        --length;
        while (length > index && data[length] == styles[length]) {
            --length;
        }
        if (index <= length) {
            ++length;
            memcpy(data + index, styles + index, length - index);
            if (segment1 || range.Empty()) {
                range.startPos = index + position;
            }
            range.endPos = length + position;
        }
    }
    
     
    • Neil Hodgson

      Neil Hodgson - 2026-03-24

      Source code with your CopyBytes is shorter (msvc cl /utf-8 /W4 /c /EHsc /std:c++20 /O2 /GS- /GR- /FAcs /DNDEBUG /I../include CellBuffer.cxx not inline CopyBytes), but msvc generated asm with my SetStylesRange is shorter.

      Scintilla commonly builds with link time code generation which enables more inlining. That is cl /GL and link -LTCG.

       
      • Neil Hodgson

        Neil Hodgson - 2026-03-25

        Depending on which function gets more of the code, the inliner is bouncing between inlining CopyBytes into CellBuffer::SetStyles or inlining CellBuffer::SetStyles into Document::SetStyles. It is unlikely saving one call (the potential second CopyBytes) in this context is important..

         
        • Zufu Liu

          Zufu Liu - 2026-03-25

          OK.

           
  • Neil Hodgson

    Neil Hodgson - 2026-03-30

    Committed with [4f2ea7] and [281e29].

     

    Related

    Commit: [281e29]
    Commit: [4f2ea7]

  • Neil Hodgson

    Neil Hodgson - 2026-03-30
    • Group: Initial --> Committed
     
  • Zufu Liu

    Zufu Liu - 2026-03-31

    Update for endStyled += length; isn't consistent: one before the notification one after.

     
1 2 > >> (Page 1 of 2)

Log in to post a comment.

MongoDB Logo MongoDB