
#1535 Optimize DBCS `MovePositionOutsideChar()` and `NextPosition()`

Group: Committed
Status: closed
Priority: 5
Updated: 2024-12-18
Created: 2024-11-26
Creator: Zufu Liu
Private: No

Calling LineStartPosition(pos) inside both functions is expensive.

In MovePositionOutsideChar(), posStartLine can be removed: when pos is at a line start, the preceding character can only be CR, LF or NUL, none of which is a lead byte, so the step-back loop that follows does nothing.
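To illustrate the argument, here is a minimal standalone sketch of the step-back logic without the line-start anchor. It is not Scintilla's code: `Position`, `CharAt`, `IsLeadByte`, `IsTrailByte`, `IsDualByteAt` and `MoveOutsideChar` are hypothetical stand-ins, and the byte ranges are an assumed Shift-JIS-like code page.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

using Position = std::ptrdiff_t;

// Hypothetical stand-in for cb.CharAt(): returns NUL outside the buffer.
unsigned char CharAt(const std::string &text, Position pos) {
    if (pos < 0 || pos >= static_cast<Position>(text.size()))
        return 0;
    return static_cast<unsigned char>(text[static_cast<std::size_t>(pos)]);
}

// Assumed Shift-JIS-like lead byte ranges (stand-in for IsDBCSLeadByteNoExcept).
bool IsLeadByte(unsigned char ch) {
    return (ch >= 0x81 && ch <= 0x9F) || (ch >= 0xE0 && ch <= 0xFC);
}

// Assumed Shift-JIS-like trail byte ranges.
bool IsTrailByte(unsigned char ch) {
    return (ch >= 0x40 && ch <= 0x7E) || (ch >= 0x80 && ch <= 0xFC);
}

// True when a valid two-byte character starts at pos.
bool IsDualByteAt(const std::string &text, Position pos) {
    return IsLeadByte(CharAt(text, pos)) && IsTrailByte(CharAt(text, pos + 1));
}

// Move pos out of the middle of a two-byte character, in direction moveDir.
Position MoveOutsideChar(const std::string &text, Position pos, int moveDir) {
    // Step back until a non-lead-byte is found; only position 0 bounds the loop.
    // At a line start the previous byte is CR or LF, so the loop exits at once.
    Position posCheck = pos;
    while (posCheck > 0 && IsLeadByte(CharAt(text, posCheck - 1)))
        posCheck--;

    // Walk forward from the known character start.
    while (posCheck < pos) {
        const Position width = IsDualByteAt(text, posCheck) ? 2 : 1;
        if (posCheck + width == pos)
            return pos;                       // pos is on a character boundary
        if (posCheck + width > pos)           // pos is inside a two-byte character
            return (moveDir > 0) ? posCheck + width : posCheck;
        posCheck += width;
    }
    return pos;
}
```

With the six-byte buffer `a`, CR, LF, `0x82`, `0xA0`, `b`, position 3 is a line start and is returned unchanged because the byte before it is LF. Position 4 sits inside the two-byte character and moves to 5 or 3 depending on `moveDir`.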

Related

Feature Requests: #1533
Feature Requests: #1558

Discussion

  • Zufu Liu

    Zufu Liu - 2024-11-26

    Patch for MovePositionOutsideChar():

    @@ -833,15 +833,9 @@
                    // Else invalid UTF-8 so return position of isolated trail byte
                }
            } else {
    -           // Anchor DBCS calculations at start of line because start of line can
    -           // not be a DBCS trail byte.
    -           const Sci::Position posStartLine = LineStartPosition(pos);
    -           if (pos == posStartLine)
    -               return pos;
    -
                // Step back until a non-lead-byte is found.
                Sci::Position posCheck = pos;
    -           while ((posCheck > posStartLine) && IsDBCSLeadByteNoExcept(cb.CharAt(posCheck-1)))
    +           while ((posCheck > 0) && IsDBCSLeadByteNoExcept(cb.CharAt(posCheck-1)))
                    posCheck--;
    
                // Check from known start of character.
    

    I have not yet figured out how to eliminate posStartLine in NextPosition().

     
  • Zufu Liu

    Zufu Liu - 2024-11-27

    Patch also removes posStartLine for NextPosition():

    @@ -916,14 +910,11 @@
                    if (pos > cb.Length())
                        pos = cb.Length();
                } else {
    -               // Anchor DBCS calculations at start of line because start of line can
    -               // not be a DBCS trail byte.
    -               const Sci::Position posStartLine = LineStartPosition(pos);
    -               // See http://msdn.microsoft.com/en-us/library/cc194792%28v=MSDN.10%29.aspx
    -               // http://msdn.microsoft.com/en-us/library/cc194790.aspx
    -               if ((pos - 1) <= posStartLine) {
    -                   return pos - 1;
    -               } else if (IsDBCSLeadByteNoExcept(cb.CharAt(pos - 1))) {
    +               // How to Go Backward in a DBCS String
    +               // https://msdn.microsoft.com/en-us/library/cc194792.aspx
    +               // DBCS-Enabled Programs vs. Non-DBCS-Enabled Programs
    +               // https://msdn.microsoft.com/en-us/library/cc194790.aspx
    +               if (IsDBCSLeadByteNoExcept(cb.CharAt(pos - 1))) {
                        // Should actually be trail byte
                        if (IsDBCSDualByteAt(pos - 2)) {
                            return pos - 2;
    @@ -934,7 +925,7 @@
                    } else {
                        // Otherwise, step back until a non-lead-byte is found.
                        Sci::Position posTemp = pos - 1;
    -                   while (posStartLine <= --posTemp && IsDBCSLeadByteNoExcept(cb.CharAt(posTemp)))
    +                   while (--posTemp >= 0 && IsDBCSLeadByteNoExcept(cb.CharAt(posTemp)))
                            ;
                        // Now posTemp+1 must point to the beginning of a character,
                        // so figure out whether we went back an even or an odd
    

    Even with this change, DBCS backward brace matching using NextPosition() is still much slower than using MovePositionOutsideChar().
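The backward step that the patch leaves in place can be sketched as follows. This is a simplified standalone illustration, not Scintilla's code: `Position`, `CharAt`, `IsLeadByte`, `IsTrailByte`, `IsDualByteAt` and `PreviousPosition` are hypothetical stand-ins, and the byte ranges are an assumed Shift-JIS-like code page. It shows the parity idea from the comments in the diff: count the run of lead bytes before pos to decide whether the last character is one or two bytes wide.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

using Position = std::ptrdiff_t;

// Hypothetical stand-in for cb.CharAt(): returns NUL outside the buffer.
unsigned char CharAt(const std::string &text, Position pos) {
    if (pos < 0 || pos >= static_cast<Position>(text.size()))
        return 0;
    return static_cast<unsigned char>(text[static_cast<std::size_t>(pos)]);
}

// Assumed Shift-JIS-like byte ranges, for illustration only.
bool IsLeadByte(unsigned char ch) {
    return (ch >= 0x81 && ch <= 0x9F) || (ch >= 0xE0 && ch <= 0xFC);
}
bool IsTrailByte(unsigned char ch) {
    return (ch >= 0x40 && ch <= 0x7E) || (ch >= 0x80 && ch <= 0xFC);
}
bool IsDualByteAt(const std::string &text, Position pos) {
    return IsLeadByte(CharAt(text, pos)) && IsTrailByte(CharAt(text, pos + 1));
}

// Return the start of the character that ends at pos (one step backward).
Position PreviousPosition(const std::string &text, Position pos) {
    if (pos <= 0)
        return 0;
    if (IsLeadByte(CharAt(text, pos - 1))) {
        // The byte before pos is in the lead range, so it should be a trail byte.
        // An invalid pair is treated as one byte wide.
        return IsDualByteAt(text, pos - 2) ? pos - 2 : pos - 1;
    }
    // Step back until a non-lead-byte is found, bounded only by the buffer start.
    Position posTemp = pos - 1;
    while (--posTemp >= 0 && IsLeadByte(CharAt(text, posTemp)))
        ;
    // posTemp + 1 is a character start; the parity of the distance back
    // tells whether the last character is one or two bytes wide.
    const Position widthLast = ((pos - posTemp) & 1) + 1;
    if (widthLast == 2 && IsDualByteAt(text, pos - widthLast))
        return pos - widthLast;
    return pos - 1;
}
```

Each call rescans the run of lead bytes before pos, which may be part of why repeated backward stepping costs more than a single MovePositionOutsideChar() call.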

     
  • Zufu Liu

    Zufu Liu - 2024-11-29
    • labels: Scintilla, encoding, dbcs --> Scintilla, encoding, dbcs, optimization
     
  • Neil Hodgson

    Neil Hodgson - 2024-11-30
    • Group: Initial --> Committed
     
  • Neil Hodgson

    Neil Hodgson - 2024-11-30

    Committed with [24545b] and [2c0dbb].

     

    Related

    Commit: [24545b]
    Commit: [2c0dbb]

  • Neil Hodgson

    Neil Hodgson - 2024-12-18
    • status: open --> closed
     
