Menu

#1557 Simplify and optimize WrapLine()

Committed
closed
5
2025-06-17
2025-05-19
Zufu Liu
No

Split out from [feature-requests:#1417].

  1. following block is redundant:
if (!foundBreak) {
    // Try moving to start of last character
    if (p > 0) {
        lastGoodBreak = CharacterBoundary(p, -1);
    }
}

after

// backtrack to find lastGoodBreak
Sci::Position lastGoodBreak = p;
if (p > 0) {
    lastGoodBreak = CharacterBoundary(p, -1);
}

lastLineStart <= lastGoodBreak <= numCharsBeforeEOL, following code only update lastGoodBreak to value larger then lastLineStart.

  1. remaining use of CharacterBoundary() can be changed to use pdoc->NextPosition().

Related

Feature Requests: #1417

Discussion

1 2 > >> (Page 1 of 2)
  • Zufu Liu

    Zufu Liu - 2025-05-19

    Patch change CharacterBoundary to use pdoc->NextPosition(), remove the redundant block, and change afterWrap = CharacterBoundary(lastGoodBreak, 1).

     
  • Neil Hodgson

    Neil Hodgson - 2025-05-20

    Committed as [1b7610]. Also removed last write of foundBreak which was never read.

     

    Related

    Commit: [1b7610]

    • Zufu Liu

      Zufu Liu - 2025-05-20

      I just want to revert the change for CharacterBoundary() as currently MovePositionOutsideChar() is faster than NextPosition() (see [feature-requests:#1558]).

       

      Related

      Feature Requests: #1558

      • Zufu Liu

        Zufu Liu - 2025-05-20

        Patch revert CharacterBoundary() to MovePositionOutsideChar(), current NextPosition() version looks more readable though.

         
  • Neil Hodgson

    Neil Hodgson - 2025-05-20
    • Group: Initial --> Committed
     
  • Neil Hodgson

    Neil Hodgson - 2025-05-20

    I'm not all that worried about the performance degradation and expect that NextPosition can be made faster.

     
  • Zufu Liu

    Zufu Liu - 2025-05-21

    The performance degradation should not be observed for general UTF-8 file.

    For file contains very long lines, LineLayout::AddLineStart() is the bottleneck, following changes will improve the speed a lot:
    1. Change growth factor, e.g. to 1.5. + 4 here is to ensure the code works when lines=1 && lenLineStarts=0.
    2. minimum lenLineStarts (Wrap::Char for ASCII) can be estimated by divide line width with wrap width, optimal lenLineStarts can be estimated by multiple a factor (e.g. 1.25, 1.5).

    rough changes:

    @@ -203,7 +203,7 @@
     void LineLayout::AddLineStart(Sci::Position start) {
        lines++;
        if (lines >= lenLineStarts) {
    -       const int newMaxLines = lines + 20;
    +       const int newMaxLines = lines + 4 + lines/2;
            std::unique_ptr<int[]> newLineStarts = std::make_unique<int[]>(newMaxLines);
            if (lenLineStarts) {
                std::copy(lineStarts.get(), lineStarts.get() + lenLineStarts, newLineStarts.get());
    @@ -341,6 +341,11 @@
        auto CharacterBoundary = [=](Sci::Position i, int moveDir) noexcept -> Sci::Position {
            return pdoc->NextPosition(i + posLineStart, moveDir) - posLineStart;
        };
    +   const int newMaxLines = static_cast<int>(positions[numCharsInLine]/(wrapWidth - wrapIndent)) + 2;
    +   if (newMaxLines > lenLineStarts) {
    +       lineStarts = std::make_unique<int[]>(newMaxLines);
    +       lenLineStarts = newMaxLines;
    +   }
        lines = 0;
        // Calculate line start positions based upon width.
        Sci::Position lastLineStart = 0;
    
     

    Last edit: Zufu Liu 2025-05-21
    • Zufu Liu

      Zufu Liu - 2025-05-22

      I end up with just const int newMaxLines = lines + 20 + lines/2;, as long line is very rare, reduce initial 10~20 allocations does not speed up a lot.

       
  • Neil Hodgson

    Neil Hodgson - 2025-05-22

    I was thinking that the total width / line width approximation would avoid over-allocation for a file containing short paragraphs. It is only 20*4 bytes though so maybe not significant when there has to be an allocation and the rest of LineLayout is taking so much space.

     
  • Zufu Liu

    Zufu Liu - 2025-05-23

    I was thinking that the total width / line width approximation would avoid over-allocation for a file containing short paragraphs.

    The computation is unnecessary for manual written code file (already breaks into short lines), only benefits program generated data file with huge lines, and change growth factor will defect the slower.

    I end up with just const int newMaxLines = lines + 20 + lines/2;, as long line is very rare, reduce initial 10~20 allocations does not speed up a lot.

    Use 2x growth factor (libc++, libstdc++) yields simple code than 1.5x (MSVC), also reduces total allocations.

    @@ -203,12 +203,12 @@
     void LineLayout::AddLineStart(Sci::Position start) {
        lines++;
        if (lines >= lenLineStarts) {
    -       const int newMaxLines = lines + 20;
    +       const int newMaxLines = lines + 4 + lines;
            std::unique_ptr<int[]> newLineStarts = std::make_unique<int[]>(newMaxLines);
            if (lenLineStarts) {
                std::copy(lineStarts.get(), lineStarts.get() + lenLineStarts, newLineStarts.get());
            }
    -       lineStarts = std::move(newLineStarts);
    +       lineStarts.swap(newLineStarts);
            lenLineStarts = newMaxLines;
        }
        lineStarts[lines] = static_cast<int>(start);
    @@ -389,8 +389,8 @@
                        lastGoodBreak = CharacterBoundary(lastGoodBreak, 1);
                    }
                }
    +           AddLineStart(lastGoodBreak);
                lastLineStart = lastGoodBreak;
    -           AddLineStart(lastLineStart);
                startOffset = positions[lastLineStart];
                // take into account the space for start wrap mark and indent
                startOffset += wrapWidth - wrapIndent;
    
     

    Last edit: Zufu Liu 2025-05-23
  • Neil Hodgson

    Neil Hodgson - 2025-05-23

    OK, except for the std::swap which appears to be less conventional than std::move. [bc402a]

     

    Related

    Commit: [bc402a]

    • Zufu Liu

      Zufu Liu - 2025-05-23

      use swap() or std::swap() fixes bad code gen (two operator delete[] calls) for = std::move().

       
      • Neil Hodgson

        Neil Hodgson - 2025-05-24

        I don't see double deletion occurring with either debug or release MSVC 2022 17.13.7.

         
  • Zufu Liu

    Zufu Liu - 2025-05-24

    In terminal, compile cl /utf-8 /W4 /c /EHsc /std:c++20 /O2 /FAcs /DNDEBUG /I../include PositionCache.cxx, then find void LineLayout::AddLineStart inside PositionCache.cod:

    _TEXT   SEGMENT
    newLineStarts$1 = 32
    this$ = 80
    start$ = 88
    ?AddLineStart@LineLayout@Internal@Scintilla@@QEAAX_J@Z PROC ; Scintilla::Internal::LineLayout::AddLineStart, COMDAT
    
    ; 203  : void LineLayout::AddLineStart(Sci::Position start) {
    
    $LN82:
      00000 40 57        push    rdi
      00002 41 57        push    r15
      00004 48 83 ec 38  sub     rsp, 56            ; 00000038H
    
    ; 204  :    lines++;
    
      00008 8b 41 6c     mov     eax, DWORD PTR [rcx+108]
      0000b 4c 8b fa     mov     r15, rdx
      0000e ff c0        inc     eax
      00010 48 8b f9     mov     rdi, rcx
      00013 89 41 6c     mov     DWORD PTR [rcx+108], eax
    
    ; 205  :    if (lines >= lenLineStarts) {
    
      00016 3b 41 10     cmp     eax, DWORD PTR [rcx+16]
      00019 0f 8c be 00 00
        00       jl  $LN57@AddLineSta
      0001f 48 89 5c 24 58   mov     QWORD PTR [rsp+88], rbx
      00024 48 89 6c 24 60   mov     QWORD PTR [rsp+96], rbp
      00029 48 89 74 24 68   mov     QWORD PTR [rsp+104], rsi
      0002e 4c 89 74 24 30   mov     QWORD PTR [rsp+48], r14
    
    ; 206  :        const int newMaxLines = lines * 2 + 4;
    
      00033 44 8d 34 45 04
        00 00 00     lea     r14d, DWORD PTR [rax*2+4]
    
    ; 207  :        std::unique_ptr<int[]> newLineStarts = std::make_unique<int[]>(newMaxLines);
    
      0003b 49 63 ce     movsxd  rcx, r14d
    ; File D:\Program Files\Microsoft Visual Studio\2022\VC\Tools\MSVC\14.44.35207\include\memory
    
    ; 3636 :     return unique_ptr<_Ty>(new _Elem[_Size]());
    
      0003e b8 04 00 00 00   mov     eax, 4
      00043 48 f7 e1     mul     rcx
      00046 48 8b d8     mov     rbx, rax
      00049 48 c7 c0 ff ff
        ff ff        mov     rax, -1
      00050 48 0f 42 d8  cmovb   rbx, rax
      00054 48 8b cb     mov     rcx, rbx
      00057 e8 00 00 00 00   call    ??_U@YAPEAX_K@Z        ; operator new[]
      0005c 48 8b e8     mov     rbp, rax
      0005f 48 85 c0     test    rax, rax
      00062 74 0f        je  SHORT $LN8@AddLineSta
      00064 4c 8b c3     mov     r8, rbx
      00067 33 d2        xor     edx, edx
      00069 48 8b c8     mov     rcx, rax
      0006c e8 00 00 00 00   call    memset
      00071 eb 02        jmp     SHORT $LN9@AddLineSta
    $LN8@AddLineSta:
      00073 33 ed        xor     ebp, ebp
    $LN9@AddLineSta:
    ; File D:\notepad4\scintilla\src\PositionCache.cxx
    
    ; 208  :        if (lenLineStarts) {
    
      00075 48 63 47 10  movsxd  rax, DWORD PTR [rdi+16]
      00079 48 8d 77 08  lea     rsi, QWORD PTR [rdi+8]
    ; File D:\Program Files\Microsoft Visual Studio\2022\VC\Tools\MSVC\14.44.35207\include\xmemory
    
    ; 1531 :         : _Ty1(), _Myval2(_STD forward<_Other2>(_Val2)...) {}
    
      0007d 48 8b dd     mov     rbx, rbp
    ; File D:\notepad4\scintilla\src\PositionCache.cxx
    
    ; 208  :        if (lenLineStarts) {
    
      00080 85 c0        test    eax, eax
      00082 74 12        je  SHORT $LN24@AddLineSta
    ; File D:\Program Files\Microsoft Visual Studio\2022\VC\Tools\MSVC\14.44.35207\include\xutility
    
    ; 4751 :     _CSTD memmove(_Dest_ch, _First_ch, _Byte_count);
    
      00084 48 8b 16     mov     rdx, QWORD PTR [rsi]
    ; File D:\notepad4\scintilla\src\PositionCache.cxx
    
    ; 209  :            std::copy(lineStarts.get(), lineStarts.get() + lenLineStarts, newLineStarts.get());
    
      00087 4c 8b c0     mov     r8, rax
      0008a 49 c1 e0 02  shl     r8, 2
    ; File D:\Program Files\Microsoft Visual Studio\2022\VC\Tools\MSVC\14.44.35207\include\xutility
    
    ; 4751 :     _CSTD memmove(_Dest_ch, _First_ch, _Byte_count);
    
      0008e 48 8b cd     mov     rcx, rbp
      00091 e8 00 00 00 00   call    memmove
    $LN24@AddLineSta:
    ; File D:\Program Files\Microsoft Visual Studio\2022\VC\Tools\MSVC\14.44.35207\include\memory
    
    ; 3525 :         if (this != _STD addressof(_Right)) {
    
      00096 48 8d 44 24 20   lea     rax, QWORD PTR newLineStarts$1[rsp]
      0009b 48 3b f0     cmp     rsi, rax
      0009e 74 18        je  SHORT $LN50@AddLineSta
    ; File D:\Program Files\Microsoft Visual Studio\2022\VC\Tools\MSVC\14.44.35207\include\utility
    
    ; 773  :     _Ty _Old_val = static_cast<_Ty&&>(_Val);
    
      000a0 48 8b 0e     mov     rcx, QWORD PTR [rsi]
    
    ; 774  :     _Val         = static_cast<_Other&&>(_New_val);
    
      000a3 33 db        xor     ebx, ebx
      000a5 48 89 2e     mov     QWORD PTR [rsi], rbp
    ; File D:\Program Files\Microsoft Visual Studio\2022\VC\Tools\MSVC\14.44.35207\include\memory
    
    ; 3613 :         if (_Old) {
    
      000a8 48 85 c9     test    rcx, rcx
      000ab 74 0b        je  SHORT $LN50@AddLineSta
    
    ; 3323 :         delete[] _Ptr;
    
      000ad e8 00 00 00 00   call    ??_V@YAXPEAX@Z     ; operator delete[]
      000b2 44 89 77 10  mov     DWORD PTR [rdi+16], r14d
    ; File D:\Program Files\Microsoft Visual Studio\2022\VC\Tools\MSVC\14.44.35207\include\xutility
    
    ; 4751 :     _CSTD memmove(_Dest_ch, _First_ch, _Byte_count);
    
      000b6 eb 11        jmp     SHORT $LN79@AddLineSta
    $LN50@AddLineSta:
    ; File D:\notepad4\scintilla\src\PositionCache.cxx
    
    ; 212  :        lenLineStarts = newMaxLines;
    
      000b8 44 89 77 10  mov     DWORD PTR [rdi+16], r14d
    ; File D:\Program Files\Microsoft Visual Studio\2022\VC\Tools\MSVC\14.44.35207\include\memory
    
    ; 3572 :         if (_Mypair._Myval2) {
    
      000bc 48 85 db     test    rbx, rbx
      000bf 74 08        je  SHORT $LN79@AddLineSta
    
    ; 3323 :         delete[] _Ptr;
    
      000c1 48 8b cb     mov     rcx, rbx
      000c4 e8 00 00 00 00   call    ??_V@YAXPEAX@Z     ; operator delete[]
    $LN79@AddLineSta:
      000c9 48 8b 74 24 68   mov     rsi, QWORD PTR [rsp+104]
      000ce 48 8b 6c 24 60   mov     rbp, QWORD PTR [rsp+96]
      000d3 48 8b 5c 24 58   mov     rbx, QWORD PTR [rsp+88]
      000d8 4c 8b 74 24 30   mov     r14, QWORD PTR [rsp+48]
    $LN57@AddLineSta:
    ; File D:\notepad4\scintilla\src\PositionCache.cxx
    
    ; 214  :    lineStarts[lines] = static_cast<int>(start);
    
      000dd 48 63 4f 6c  movsxd  rcx, DWORD PTR [rdi+108]
      000e1 48 8b 47 08  mov     rax, QWORD PTR [rdi+8]
      000e5 44 89 3c 88  mov     DWORD PTR [rax+rcx*4], r15d
    
    ; 215  : }
    
      000e9 48 83 c4 38  add     rsp, 56            ; 00000038H
      000ed 41 5f        pop     r15
      000ef 5f       pop     rdi
      000f0 c3       ret     0
    ?AddLineStart@LineLayout@Internal@Scintilla@@QEAAX_J@Z ENDP ; Scintilla::Internal::LineLayout::AddLineStart
    _TEXT   ENDS
    
     

    Last edit: Zufu Liu 2025-05-24
  • Zufu Liu

    Zufu Liu - 2025-05-24

    generated code with swap is shorter and contains only one operator delete[]:

    _TEXT   SEGMENT
    this$ = 64
    start$ = 72
    ?AddLineStart@LineLayout@Internal@Scintilla@@QEAAX_J@Z PROC ; Scintilla::Internal::LineLayout::AddLineStart, COMDAT
    
    ; 203  : void LineLayout::AddLineStart(Sci::Position start) {
    
    $LN62:
      00000 40 53        push    rbx
      00002 57       push    rdi
      00003 41 56        push    r14
      00005 48 83 ec 20  sub     rsp, 32            ; 00000020H
    
    ; 204  :    lines++;
    
      00009 8b 41 6c     mov     eax, DWORD PTR [rcx+108]
      0000c 48 8d 59 6c  lea     rbx, QWORD PTR [rcx+108]
      00010 ff c0        inc     eax
      00012 4c 8b f2     mov     r14, rdx
      00015 48 8b f9     mov     rdi, rcx
      00018 89 03        mov     DWORD PTR [rbx], eax
    
    ; 205  :    if (lines >= lenLineStarts) {
    
      0001a 3b 41 10     cmp     eax, DWORD PTR [rcx+16]
      0001d 0f 8c 95 00 00
        00       jl  $LN43@AddLineSta
      00023 48 89 6c 24 48   mov     QWORD PTR [rsp+72], rbp
      00028 48 89 74 24 50   mov     QWORD PTR [rsp+80], rsi
      0002d 4c 89 7c 24 58   mov     QWORD PTR [rsp+88], r15
    
    ; 206  :        const int newMaxLines = lines * 2 + 4;
    
      00032 44 8d 3c 45 04
        00 00 00     lea     r15d, DWORD PTR [rax*2+4]
    
    ; 207  :        std::unique_ptr<int[]> newLineStarts = std::make_unique<int[]>(newMaxLines);
    
      0003a 49 63 cf     movsxd  rcx, r15d
    ; File D:\Program Files\Microsoft Visual Studio\2022\VC\Tools\MSVC\14.44.35207\include\memory
    
    ; 3636 :     return unique_ptr<_Ty>(new _Elem[_Size]());
    
      0003d b8 04 00 00 00   mov     eax, 4
      00042 48 f7 e1     mul     rcx
      00045 48 8b e8     mov     rbp, rax
      00048 48 c7 c0 ff ff
        ff ff        mov     rax, -1
      0004f 48 0f 42 e8  cmovb   rbp, rax
      00053 48 8b cd     mov     rcx, rbp
      00056 e8 00 00 00 00   call    ??_U@YAPEAX_K@Z        ; operator new[]
      0005b 48 8b f0     mov     rsi, rax
      0005e 48 85 c0     test    rax, rax
      00061 74 0f        je  SHORT $LN8@AddLineSta
      00063 4c 8b c5     mov     r8, rbp
      00066 33 d2        xor     edx, edx
      00068 48 8b c8     mov     rcx, rax
      0006b e8 00 00 00 00   call    memset
      00070 eb 02        jmp     SHORT $LN9@AddLineSta
    $LN8@AddLineSta:
      00072 33 f6        xor     esi, esi
    $LN9@AddLineSta:
    ; File D:\notepad4\scintilla\src\PositionCache.cxx
    
    ; 208  :        if (lenLineStarts) {
    
      00074 48 63 47 10  movsxd  rax, DWORD PTR [rdi+16]
      00078 48 8b 6c 24 48   mov     rbp, QWORD PTR [rsp+72]
      0007d 85 c0        test    eax, eax
      0007f 74 17        je  SHORT $LN55@AddLineSta
    ; File D:\Program Files\Microsoft Visual Studio\2022\VC\Tools\MSVC\14.44.35207\include\xutility
    
    ; 4751 :     _CSTD memmove(_Dest_ch, _First_ch, _Byte_count);
    
      00081 48 8b 57 08  mov     rdx, QWORD PTR [rdi+8]
    ; File D:\notepad4\scintilla\src\PositionCache.cxx
    
    ; 209  :            std::copy(lineStarts.get(), lineStarts.get() + lenLineStarts, newLineStarts.get());
    
      00085 4c 8b c0     mov     r8, rax
      00088 49 c1 e0 02  shl     r8, 2
    ; File D:\Program Files\Microsoft Visual Studio\2022\VC\Tools\MSVC\14.44.35207\include\xutility
    
    ; 4751 :     _CSTD memmove(_Dest_ch, _First_ch, _Byte_count);
    
      0008c 48 8b ce     mov     rcx, rsi
      0008f e8 00 00 00 00   call    memmove
      00094 48 8d 5f 6c  lea     rbx, QWORD PTR [rdi+108]
    $LN55@AddLineSta:
    ; File D:\Program Files\Microsoft Visual Studio\2022\VC\Tools\MSVC\14.44.35207\include\utility
    
    ; 139  :     _Ty _Tmp = _STD move(_Left);
    
      00098 48 8b 4f 08  mov     rcx, QWORD PTR [rdi+8]
    
    ; 140  :     _Left    = _STD move(_Right);
    
      0009c 48 89 77 08  mov     QWORD PTR [rdi+8], rsi
    ; File D:\notepad4\scintilla\src\PositionCache.cxx
    
    ; 213  :        lenLineStarts = newMaxLines;
    
      000a0 48 8b 74 24 50   mov     rsi, QWORD PTR [rsp+80]
      000a5 44 89 7f 10  mov     DWORD PTR [rdi+16], r15d
      000a9 4c 8b 7c 24 58   mov     r15, QWORD PTR [rsp+88]
    ; File D:\Program Files\Microsoft Visual Studio\2022\VC\Tools\MSVC\14.44.35207\include\memory
    
    ; 3572 :         if (_Mypair._Myval2) {
    
      000ae 48 85 c9     test    rcx, rcx
      000b1 74 05        je  SHORT $LN43@AddLineSta
    
    ; 3323 :         delete[] _Ptr;
    
      000b3 e8 00 00 00 00   call    ??_V@YAXPEAX@Z     ; operator delete[]
    $LN43@AddLineSta:
    ; File D:\notepad4\scintilla\src\PositionCache.cxx
    
    ; 215  :    lineStarts[lines] = static_cast<int>(start);
    
      000b8 48 63 0b     movsxd  rcx, DWORD PTR [rbx]
      000bb 48 8b 47 08  mov     rax, QWORD PTR [rdi+8]
      000bf 44 89 34 88  mov     DWORD PTR [rax+rcx*4], r14d
    
    ; 216  : }
    
      000c3 48 83 c4 20  add     rsp, 32            ; 00000020H
      000c7 41 5e        pop     r14
      000c9 5f       pop     rdi
      000ca 5b       pop     rbx
      000cb c3       ret     0
    ?AddLineStart@LineLayout@Internal@Scintilla@@QEAAX_J@Z ENDP ; Scintilla::Internal::LineLayout::AddLineStart
    _TEXT   ENDS
    
     
  • Zufu Liu

    Zufu Liu - 2025-05-24

    Another patch simplified Editor::WrapLines() and LineLayout::Resize().

    For LineLayout, call Free() is redundant inside destructor and Resize(), removed bidiData->Resize() as bidiData is already freed.

     
    • Neil Hodgson

      Neil Hodgson - 2025-05-24

      Without any destructor code, can be subjected to rule-of-zero.

       
      • Zufu Liu

        Zufu Liu - 2025-05-24

        deleted unused stuffs, but not convert the constructor to use default member init.

         
  • Zufu Liu

    Zufu Liu - 2025-05-29

    Rewrite LineLayout::SubLineFromPosition() to avoid out of bound access when line = lines - 1.

     
    • Neil Hodgson

      Neil Hodgson - 2025-05-31

      That function appears incorrect as lines is ascending and the test is lineStarts[line] <= posInLine which may break early. Example:

      lineStarts=[0,10,20,30]
      posInLine=25
      -> line == 1
      return 0
      

      This method is only active for bidirectional mode so will have limited impact.

       
  • Zufu Liu

    Zufu Liu - 2025-05-31

    swap comparison to posInLine < lineStarts[line] looks would fix the bug.

     
    • Zufu Liu

      Zufu Liu - 2025-06-01

      swap comparison to posInLine < lineStarts[line] looks would fix the bug.

      or lineStarts[line] > posInLine.

       
      • Zufu Liu

        Zufu Liu - 2025-06-16

        earlier return inside LineLayout::SubLineFromPosition() is still not fixed.

         
    • Neil Hodgson

      Neil Hodgson - 2025-06-01

      That seems to need to also invert the posInLine adjustment to

      posInLine += FlagSet(pe, PointEnd::subLineEnd) ? -1 : 0;
      

      The only place this is needed is for the VoiceOver feature on macOS with bidirectional=1 to ensure that the VoiceOver 'cursor' selects whole sublines instead of a narrow rectangle over 2 line starts.

       
1 2 > >> (Page 1 of 2)

Log in to post a comment.