Menu

#1422 Improve measure width for long line with many non-ASCII characters

Initial
open
None
5
2022-01-22
2021-11-26
Zufu Liu
No

There is a test at https://github.com/zufuliu/notepad2/issues/396#issuecomment-977949317
it has a long line with random text (including bidi control characters) at the end of file. open in in SciTE or Notepad2, wait a while or scroll down, editor getting unresponsive regardless of whether word wrap is enabled or not, disable bidi control characters (in Notepad2, View -> Unicode Control Characters) has no help.

90% or more time is spent on pTextLayout->GetClusterMetrics() (inside SurfaceD2D::MeasureWidths()).
Some other editors (tested VSCode and Sublime Text) works very smooth (with word wrap enabled, both rendering bidi control character as hex, both rendering emoji with color, both uses DirectWrite?), so it seems there are better method to measure the width or handle long line text.

Related

Feature Requests: #1427

Discussion

  • Neil Hodgson

    Neil Hodgson - 2021-11-26

    The 7,266,428 byte line takes 40 seconds to layout on my machine with DirectWrite and 28 seconds with GDI. As random non-ASCII text, it doesn't benefit from caching optimizations.

    VS Code displays only the first 10,000 characters of the line in normal (non-wrapped) mode. While there could be a line truncate option, that removes most functionality on long lines. Its wrapped mode display has a ragged right edge and leaves some right hand characters invisible until scrolled into view so it may be wrapping on number of characters (or bytes) instead of actual character positions.

    There are some techniques that could be used here but they may only help in some cases and the line could just be made 10 times wider to again be too slow.

    Early on, there was some work on progressive refinement of line layouts. This was when the call to find all character locations could be slower than finding the total width of the text. Firstly, the line is broken into segments and the display width of each segment is measured but not the position of each character. These segment positions are stored into the layout but the other elements are either 0 or a linear interpolation between the measured locations. Then, when the position of a particular character is needed, its segment is fully laid out. This did not prove to be a significant benefit when it was tried but may be more beneficial on huge lines with the current platforms.

    A more likely technique is to multi-thread the layout of each line. This could only be done when the underlying platform is thread safe and would require some locking of the layout caches. Its likely this would yield a 4-8 times improvement on many machines.

    Non-accurate positioning could be tried. Estimate the positions of characters on long lines then refine the layout as an idle task, updating the display and caret as locations become more accurate. UI could remain responsive but may cause extra display updates and unexpected navigation effects.

     
  • Zufu Liu

    Zufu Liu - 2021-11-28

    measure total width of segment does not improve the speed,

    if (style.monospaceASCII) {
        const size_t length = sv.length();
        const XYPOSITION width = surface->WidthText(style.font.get(), sv) / length;
        for (size_t i = 0; i < length; i++) {
            positions[i] = width * (i + 1);
        }
        return;
    }
    

    I have tried to use East Asian Width https://www.unicode.org/reports/tr11/ to estimate the widths (the code is at https://github.com/zufuliu/notepad2/commit/a7074ba0435501109e37c9cc8c3a91c5d22d619e, ambiguous characters treated as narrow), it's very fast and works well for CJK text using monospaced font with GDI, but has wrong width for CJK characters with D2D.

     
  • Neil Hodgson

    Neil Hodgson - 2021-11-28

    In many cases, text is laid out according to the width property of each character. The code could keep a per-font map of character -> width then use this for a rapid approximation that is refined later or used as is when it is OK to do so.

    Kerning, ligatures, bidirectionality, glyph selection, and other factors refine this simplistic model of lay-out. Newer, more comprehensive text engines are more likely to differ.

    For the example crashtest.log there are only 229 distinct characters.

    import pathlib, unicodedata
    longFile = pathlib.Path(r"C:\u\Patches\Features\1422 Large XML\crashtest\crashtest.log")
    text = longFile.read_text(encoding="utf-16-le", errors="replace")
    setChars = {}
    for ch in text:
            setChars[ch] = setChars.get(ch, 0) + 1
    print(f"length={len(text)} distinct={len(setChars)}\n")
    for ch in sorted(setChars):
            print(f"{setChars[ch]:#6} {ord(ch):X}[{unicodedata.name(ch,'-')}]")
    
     
  • Zufu Liu

    Zufu Liu - 2021-11-29

    Not tried this yet, it seems reasonable as normally there will be only two fonts: base font for Latin-1, and fallback font for user's locale.

    It seems we can avoid UI freeze by allow LineLayout to has partial position state, for long line treat it as many small parts as estimated by durationWrapOneByte: in EditView::LayoutLine() stop process remaining text when ts.end() exceeds the limit, and record it as next start position. word wrap will need some changes to only wrap text from previous lastGoodBreak.

    In many cases (except for printing and goto end of the line), partial position state or only draw begging of a long line like VSCode seems acceptable. This at least will make idle wrap smoother. When word wrap is disabled, whole positions for a long line may not even needed since only part of it is visible.

     
    • Neil Hodgson

      Neil Hodgson - 2021-11-29

      it seems reasonable as normally there will be only two fonts: base font for Latin-1, and fallback font for user's locale.

      And an emoji font (which are often differently sized), and a font for scientific characters and arrows, ... Coverage of normal writing fonts is often very limited so fallback occurs more often than you may expect. In Character Map set a font with good coverage like Lucida Sans Unicode or Deja Vu Sans then choose Character Set: Unicode and Group by: Unicode Subrange. In the Group By list choose types like Arrows or Mathematical Operators; then swap to a writing or UI font like Verdana, Candara, or Segoe UI and the set of characters reduces greatly.

      In many cases (except for printing and goto end of the line), partial position state or

      With the example file, the first thing I did was press Ctrl+End to go to the end of the document which then depends on the layout to determine where the caret should appear. Going to end would be a common operation so the problem has just been moved from "moving onto any part of long line" to "moving towards the end of a long line".

      only draw begging of a long line like VSCode seems acceptable.

      That seems like failure to me: the user is unable to see or edit the text they want to. In VSCode, its not really obvious what has happened: there is no caret and there is a "..." at line end - I initially thought I was seeing the end of the line.

       
    • Zufu Liu

      Zufu Liu - 2021-12-05

      https://github.com/zufuliu/notepad2/commit/c954c4cec64bc661d57188230310491180d4bec0 implemented partial position state, only printing (EditView::FormatRange()) do whole line layout. unlike VSCode, long line is still wrapped in idle time (guarded by a waitable timer). also added a separator cache for long lines (always cached) to avoid relayout on scrolling. There are other problems though.

      P.S. it seems scroll bar update is missing from Editor::SetAnnotationHeights():

      if (changedHeight) {
          SetScrollBars();
          SetVerticalScrollPos();
          Redraw();
      }
      
       
      • Neil Hodgson

        Neil Hodgson - 2021-12-15

        I'm not understanding what the partialLine argument to GetFoldDisplayText achieves. Its returning fold display text for partially laid out lines even when fold display is supposed to be hidden.

        The lifetime of the timer seems clumsy as a field on EditModel. Each timing period is bounded by WrapLines. Is a waitable timer significantly faster than a cross-platform ElapsedPeriod?

        P.S. it seems scroll bar update is missing from Editor::SetAnnotationHeights()

        Yes.

         
        • Zufu Liu

          Zufu Liu - 2021-12-16

          Change to GetFoldDisplayText is just the simplest way in Notepad2 to draw ellipsis for partial line as the default fold display text is configured as ellipsis (no text for other lines).

          It's ugly, but since the waitable timer is always (or close to) signaled when LayoutLine() is called outside WrapLines(), this makes LayoutLine() always finished in reasonable time with same check for idle and non-idle wrapping.

          Though I don't have a reference link, my past test (during fix https://github.com/zufuliu/notepad2/issues/236 and https://sourceforge.net/p/scintilla/feature-requests/1381/ for mark all occurrences) shows waitable timer is faster than QueryPerformanceCounter().

           
      • Neil Hodgson

        Neil Hodgson - 2021-12-16

        Committed SetAnnotationHeights change as [a954ef].

         

        Related

        Commit: [a954ef]

  • Zufu Liu

    Zufu Liu - 2021-11-30

    It seems increase lengthStartSubdivision and lengthEachSubdivision would reduce about half the time.

    // PositionCache.h
    lengthStartSubdivision = 4096
    lengthEachSubdivision = 1024
    
    // SurfaceD2D::MeasureWidths() and SurfaceD2D::MeasureWidthsUTF8()
        UINT32 count = 0;
    #if 0
        constexpr int clusters = stackBufferLength;
        DWRITE_CLUSTER_METRICS clusterMetrics[clusters];
        const HRESULT hrGetCluster = pTextLayout->GetClusterMetrics(clusterMetrics, clusters, &count);
    #else
        VarBuffer<DWRITE_CLUSTER_METRICS, stackBufferLength> dummy(tbuf.tlen);
        DWRITE_CLUSTER_METRICS * const clusterMetrics = dummy.buffer;
        const HRESULT hrGetCluster = pTextLayout->GetClusterMetrics(clusterMetrics, tbuf.tlen, &count);
    #endif
    
     
    • Neil Hodgson

      Neil Hodgson - 2021-12-02

      I'm seeing a 25% time decrease on Win32.

      it seems reasonable as normally there will be only two fonts: base font for Latin-1, and fallback font for user's locale.

      'font' here meant logical font as represented by Font. The font collection implementing this isn't important unless a layer is stripped off to access fonts at a lower level. While possible, that would require per-platform code and may not provide all features implemented by higher level platform layers. Converting from characters to glyphs is a similar topic.

       
  • Zufu Liu

    Zufu Liu - 2021-12-27

    Some small changes related to line layout and wrapping:
    1. Removed unneeded default width value for EditView::LayoutLine() method.
    2. In EditView::LayoutLine(), fill ll->positions with zero once instead of fill it for each segment.
    3. Removed posLineStart field from BreakFinder class as it only used inside the constructor.
    4. Moved/Merged scrollbar update inside Editor::InsertCharacter() to the end of insertion loop.

    I think all the null test for LineLayout can be removed, as it can not be null otherwise make_shared() already thrown.

     

    Last edit: Zufu Liu 2021-12-27
    • Neil Hodgson

      Neil Hodgson - 2022-01-22

      Committed as [7aa394]. Changed 0.0f to 0.0 in fill as XYPOSITION is currently double.

      I think all the null test for LineLayout can be removed, as it can not be null otherwise make_shared() already thrown.

      OK, but I'd still like an assert inside LayoutLine so there is an early failure instead of memory corruption if the assumption is wrong.

       

      Related

      Commit: [7aa394]

  • Neil Hodgson

    Neil Hodgson - 2022-01-07

    I wrote a parallel version of layout that achieves a substantial speed up on this long line: 4.1x faster on a 4 core 8 thread machine.

    Since it is applicable in other scenarios and needs further development, it is a separate feature request. [feature-requests:#1427]

     
    👍
    1

    Related

    Feature Requests: #1427

    • Zufu Liu

      Zufu Liu - 2022-01-07

      Will test the parallel changes later, for short lines, it might not worth to run in parallel, also the code may not works on XP or Vista (see https://github.com/microsoft/STL/issues/2286).

      Another change that I think will improve editing experience for long line (current edited line will be layout twice after Editor::NotifyModified(), one caused by FlagSet(mh.modificationType, ModificationFlags::ChangeStyle), author by CheckModificationForWrap(mh); ) is to find first different position (for ll->styles or ll->chars) in checkTextAndStyle block, then start layout from this position instead of layout whole line.

       
    • Neil Hodgson

      Neil Hodgson - 2022-01-07

      Yes, there will be a lower limit to line length or number of segments where it would be faster to measure sequentially.

      std::execution::par works in Visual C++ 2017 with v141_xp so should work with the XP-specific builds of Scintilla which use an old version of the runtime library. Windows Vista could use the XP build since it is now a minority platform. Just tried Visual C++ 2017 + v141_xp + std::execution::par on Windows 10 and its 30% slower than using current compiler and runtime but that is still a better than 3x speed up over sequential.

      For moderate width lines, wrapping can be slow on long files and it may be faster to measure the lines in a block of lines in parallel. This would require more complex code additions.

      find first different position (for ll->styles or ll->chars) in checkTextAndStyle block, then start layout from this position

      Because of kerning, combining characters, etc., this should go back to the start of segment or a similar 'safe point'. Bidirectional text may also complicate this so could be disabled in bidirectional mode.

       
  • Zufu Liu

    Zufu Liu - 2022-01-08

    A simple approach to achieve "safe point" is if (ts.end() < diffPosition) continue;

    A special case for finding the difference position is viewEOL, where line layout for base text is not changed [feature-requests:#1428].

     

    Related

    Feature Requests: #1428


    Last edit: Zufu Liu 2022-01-08
  • Neil Hodgson

    Neil Hodgson - 2022-01-11

    Working on [feature-requests:#1427] showed that this example isn't slow just because there is a lot of text but because the text is difficult. Measuring widths of the example takes 45 times as long per-character as ASCII text. This could be because of merging multiple fonts, using combining characters, showing emoji or something else.

     

    Related

    Feature Requests: #1427


Log in to post a comment.