
#1373 Improve ActionDuration for idle wrapping for consecutive long lines

Group: Committed
Status: closed
Priority: 5
Updated: 2021-06-24
Created: 2020-08-29
Creator: Zufu Liu
Private: No

std::clamp() inside ActionDuration::AddSample() hid the fact that long lines take much longer than short lines: it always returns maxDuration (1e-4) when that happens.

In computing linesInAllowedTime for ws == WrapScope::wsIdle inside Editor::WrapLines(), it is wrongly assumed that approximately 100 lines will take only 0.01 seconds.

When wrapping consecutive long lines (e.g. a MySQL dump with extended inserts, see https://stackoverflow.com/questions/1293529/how-to-deal-with-enormous-line-lengths-created-by-mysqldump), linesInAllowedTime is always larger than expected, so the application becomes unresponsive until idle wrapping finishes or reaches a block of short lines.
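
For reference, a condensed sketch of the idle-path computation (paraphrased from the Scintilla 4.x sources, not copied verbatim): because the estimated per-line cost is clamped at maxDuration, the batch can never shrink below about 100 lines, however slow the lines actually are.

    // Inside Editor::WrapLines() for ws == WrapScope::wsIdle (simplified):
    constexpr double secondsAllowed = 0.01;
    const Sci::Line linesInAllowedTime = std::clamp<Sci::Line>(
        static_cast<Sci::Line>(secondsAllowed / durationWrapOneLine.Duration()),
        LinesOnScreen() + 50, 0x10000);
    lineToWrapEnd = lineToWrap + linesInAllowedTime;
    // Duration() never reports more than maxDuration = 1e-4 s per line, so
    // linesInAllowedTime is at least 0.01 / 1e-4 = 100 even when each line
    // actually takes far longer than 1e-4 s to wrap.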

Related

Feature Requests: #1381
Feature Requests: #1427

Discussion

  • Zufu Liu

    Zufu Liu - 2020-08-29

    I added draft changes at https://github.com/zufuliu/notepad2/commit/6d81c09cca199b5eb10b7e7b5b2fdc97de711892
    linesInAllowedTime is changed to a minimum of 8 lines (if these still take longer than expected, the unresponsiveness is not fixed).
    std::clamp() in ActionDuration::AddSample() is changed to std::max():
    for short lines, they behave the same.
    for long lines, std::max() better reflects reality: while idle wrapping is still inside a block of consecutive long/short lines, the duration stays at the same level; it then decreases/increases over several runs once idle wrapping reaches a short/long lines block.
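
    A minimal sketch of the clamp-to-max change (the surrounding AddSample() shape is paraphrased from Scintilla 4.x; only the last line is the proposal):

    void ActionDuration::AddSample(size_t numberActions, double durationOfActions) noexcept {
        // Only adjust for multiple actions to avoid instability.
        if (numberActions < 8)
            return;
        const double durationOne = durationOfActions / numberActions;
        const double smoothed = alpha * durationOne + (1.0 - alpha) * duration;
        // Before: duration = std::clamp(smoothed, minDuration, maxDuration);
        // maxDuration (1e-4) hides how slow long lines really are.
        duration = std::max(smoothed, minDuration);
    }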

     
  • Neil Hodgson

    Neil Hodgson - 2020-08-31

    Since all the std::chrono clocks are wall clocks, not CPU time (or effort) clocks, the clamp is there to avoid being perturbed by transient system overloads.

    If just max is used, how rapidly does it converge back to a reasonable value after a large exogenous spike?

    Tweaking the settings here is likely to relieve only a minority of cases, so the particular cases need measuring. How long are the problematic lines and how much time do they take on particular machines?

    The LinesOnScreen term is there to avoid breaking up the visible range into multiple wraps plus (expensive) repaints, which could look worse. linesInAllowedTime removes the choice of policy from code concerned with a particular feature and the constraints on that feature.

    It's unlikely any changes can be incorporated before Scintilla 5.0.

     
  • Zufu Liu

    Zufu Liu - 2020-09-02

    With the following changes in ActionDuration::AddSample():

    const double duration_ = alpha * durationOne + (1.0 - alpha) * duration;
    
    printf("%s actions=%.9f / %zd, one=%.9f, value=%.9f, [%.9f, %f, %f]\n", __func__,
        durationOfActions, numberActions, durationOne, duration_, duration, minDuration, maxDuration);
    

    The attachment contains results on an i5 for enwiki-20200801-redirect.sql (557 lines, 491 MiB) and enwiki-20200801-flaggedrevs.sql (280 lines, 210 MiB) from https://dumps.wikimedia.org/enwiki/20200801/

    The change with std::max() still does not fix the case where the first page (from ws == WrapScope::wsVisible) contains only short lines but is immediately followed by long lines (e.g. enwiki-20200801-flaggedrevs.sql). This is more common (at least for MySQL dumps, where the file starts with short comments and DDL statements).

    I added a new way of measuring time to ActionDuration, see https://github.com/zufuliu/notepad2/commit/f7e9ef12702c767e739077ea33038acbe0b0934c
    Instead of measuring the time for one line, measuring the time for 1024 bytes is more reliable and works better. Finding the lines that need wrapping then becomes finding the lines whose bytes fit in the time slot: long lines are wrapped line by line, short lines block by block. Comparing the code in the #if ActionDuration_MeasureTimeByBytes ... #endif blocks, the change to measuring time per 1024 bytes is small; a sketch follows.
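
    A hypothetical sketch of the per-1024-bytes idea (durationWrapOneUnit, bytesWrapped and bytesInAllowedTime are illustrative names, not taken from the linked commit):

    // After wrapping a batch, sample the cost per 1024-byte unit
    // (integer division is coarse but adequate for an estimate).
    durationWrapOneUnit.AddSample(bytesWrapped / 1024, epWrapping.Duration());
    
    // When picking the next idle batch, allow as many bytes as fit in the slot.
    constexpr double secondsAllowed = 0.01;
    const Sci::Position bytesInAllowedTime = static_cast<Sci::Position>(
        1024.0 * secondsAllowed / durationWrapOneUnit.Duration());
    // Accumulate whole lines until bytesInAllowedTime is exceeded; a single line
    // longer than the budget is still wrapped on its own.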

     
  • Neil Hodgson

    Neil Hodgson - 2020-09-02

    I don't really want to spend much time on this, but I'm seeing somewhat different numbers (maybe 40%) from yours on enwiki-20200801-flaggedrevs.sql, and an almost-5-year-old i7 won't be that much faster than an i5 since this runs on a single core. Perhaps you are running in debug mode, or syntax highlighting (which makes PositionCache work better) is turned off.

    AddSample actions=4.681963300 / 83, one=0.056409196, value=0.014177299, [0.000100000, 0.000001, 0.000100]
    AddSample actions=21.796871600 / 132, one=0.165127815, value=0.041356954, [0.000100000, 0.000001, 0.000100]
    AddSample actions=8.828542800 / 65, one=0.135823735, value=0.034030934, [0.000100000, 0.000001, 0.000100]
    

    Per-byte statistics may be a better approach and I'll have a look after Scintilla 5.

    This issue looks at batch size from the context of files with huge lines, but the opposite problem is where the adaptive rate came from: short batches made files with huge numbers of reasonable-length lines slow, as there was a lot of per-batch overhead from redisplay, scroll bar updates, and application notifications (and applications processing those notifications).

     
  • Neil Hodgson

    Neil Hodgson - 2020-09-13

    Aspects of timing that should be further explored include overhead and accuracy.

    A different technique for maintaining smoothness is to give short operations a time budget (likely between 5 and 50 milliseconds), then stop once that budget has been exhausted. A clock read is likely a system call, so it is not something that should be done too often; in common scenarios with short lines and good caching, once per line is probably too often. There may have to be a limit on clock reading, with some (dynamically calculated?) minimum amount of processing (bytes wrapped or styled) between reads, as sketched below.
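
    A minimal sketch of that budget technique with rationed clock reads (all names are hypothetical; this is not Scintilla code):

    #include <chrono>
    
    bool MoreLinesToWrap();   // hypothetical: any pending lines left?
    size_t WrapOneLine();     // hypothetical: wrap the next line, return bytes processed
    
    void WrapWithBudget() {
        using Clock = std::chrono::steady_clock;
        constexpr std::chrono::milliseconds budget{20};     // within the 5-50 ms range
        const Clock::time_point deadline = Clock::now() + budget;
        constexpr size_t minBytesBetweenReads = 16 * 1024;  // ration the clock reads
        size_t bytesSinceRead = 0;
        while (MoreLinesToWrap()) {
            bytesSinceRead += WrapOneLine();
            if (bytesSinceRead >= minBytesBetweenReads) {
                bytesSinceRead = 0;
                if (Clock::now() >= deadline)
                    break;    // budget exhausted; continue on the next idle event
            }
        }
    }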

    MinGW was particularly imprecise, with (IIRC) the 60 Hz system tick clock being used for high_resolution_clock. That may have been fixed, but it made the minimum period (17 ms) close to the desired short operation duration. It can lead to measurement producing many zero values and makes stopping after a budget work poorly. If all platforms had a precision better than 5 milliseconds then the budget technique might be OK. Without precision, indirect budgets like the current scheme can limit operations more effectively.

    Since the benefits of different designs depend on overhead and accuracy, measurements of these are needed for a wide variety of targets. While simple slow machines may seem to be the worst case, it's possible that more complex systems are actually more of a problem. Perhaps a NUMA machine (I don't have access to one) requires complex synchronization between CPUs when reading a clock, or requires waking up other processes or even hardware. Or a virtualized machine only receives a subset of host clock ticks as an optimization.

     
  • Zufu Liu

    Zufu Liu - 2020-09-13

    Changed ElapsedPeriod to use steady_clock when available.
    See the discussion in another project: https://github.com/cocos2d/cocos2d-x/issues/16046
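
    Roughly, the class becomes (a simplified sketch, not the exact committed code; the point is that steady_clock is monotonic, so a wall-clock adjustment cannot make a measured period jump or go negative):

    #include <chrono>
    
    class ElapsedPeriod {
        using ElapsedClock = std::chrono::steady_clock;
        ElapsedClock::time_point tp;
    public:
        ElapsedPeriod() noexcept : tp(ElapsedClock::now()) {}
        // Seconds since construction or the last Reset().
        double Duration() const noexcept {
            const std::chrono::duration<double> diff = ElapsedClock::now() - tp;
            return diff.count();
        }
        void Reset() noexcept {
            tp = ElapsedClock::now();
        }
    };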

     
  • Zufu Liu

    Zufu Liu - 2020-09-13

    By testing enwiki-20200801-langlinks.sql (1491 lines, 1.37 GiB) line by line, I found that Arabic (e.g. lines 287 - 288), Hebrew (e.g. lines 633 - 645) and Hindi (e.g. lines 646 - 651) take much more time than Latin.
    Per-byte statistics look better, as complex scripts take more bytes to encode and more time to decode.

    Log added into Editor::WrapLines():

    const double duration = epWrapping.Duration();
    printf("line=%zd, actions=%.9f / %zd\n", lineToWrap, duration, actions);
    durationWrapOneLine.AddSample(actions, duration);
    

    Script to sort the log lines by time (slowest first):

    import sys
    import os.path
    
    def sort_line_time(path):
        # Bucket log lines by wrap duration, rounded to 0.1 s.
        time_map = {}
        with open(path) as fd:
            for line in fd:
                if line.startswith('line='):
                    start = line.index('actions=') + len('actions=')
                    end = line.index('/', start)
                    duration = float(line[start:end])
                    key = round(duration, 1)
                    time_map.setdefault(key, []).append(line)
    
        # Slowest buckets first; lines within a bucket in stable sorted order.
        lines = []
        for key, value in sorted(time_map.items(), reverse=True):
            lines.extend(sorted(value))
    
        # Write next to the input, e.g. wrap.log -> wrap-sorted.log
        path, ext = os.path.splitext(path)
        path = path + '-sorted' + ext
        with open(path, 'w') as fd:
            fd.write(''.join(lines))
    
    sort_line_time(sys.argv[1])
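
    For example, python sort_line_time.py wrap.log (script and log file names are placeholders) writes the sorted copy as wrap-sorted.log.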
    
     
  • Zufu Liu

    Zufu Liu - 2020-09-15

    Changed _LIBCPP_HAS_NO_MONOTONIC_CLOCK to NO_CXX11_STEADY_CLOCK.
    Added an incomplete table for "Preprocessors affect building" in ScintillaDoc.html.
    Simplified the loop inside LineLayout::SetLineStart() since newLineStarts is already zeroed.

     
  • Neil Hodgson

    Neil Hodgson - 2021-05-08

    Committed with per-byte measurement and estimation as [8b128a] and [89dd67].

    The behaviour with this change shows that there are significant differences in per-byte rates of progress depending on factors like whether the styled text contains many identical lexemes that are cached well by PositionCache, or has most layout data cached in LineLayoutCache. This makes it more important that the current rate remain clamped so it doesn't get too extreme when hitting an easy or difficult patch.

     

    Related

    Commit: [89dd67]
    Commit: [8b128a]

  • Neil Hodgson

    Neil Hodgson - 2021-05-08
    • labels: --> scintilla, idle
    • Group: Initial --> Committed
     
  • Zufu Liu

    Zufu Liu - 2021-05-09

    How about changing Document::StyleToAdjustingLineDuration() to the following:

    const Sci::Position bytesBeingStyled = GetEndStyled() - stylingStart;
    durationStyleOneByte.AddSample(bytesBeingStyled, epStyling.Duration());
    

    EnsureStyledTo() might be a no-op; see the sketch below.
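
    In context, the suggestion might read like this (a sketch of the whole method under that change, not the committed code):

    void Document::StyleToAdjustingLineDuration(Sci::Position pos) {
        const Sci::Position stylingStart = GetEndStyled();
        ElapsedPeriod epStyling;
        EnsureStyledTo(pos);
        // Count only the bytes actually styled: when EnsureStyledTo() is a no-op,
        // bytesBeingStyled is 0 rather than a misleading pos - stylingStart.
        const Sci::Position bytesBeingStyled = GetEndStyled() - stylingStart;
        durationStyleOneByte.AddSample(bytesBeingStyled, epStyling.Duration());
    }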

     
    • Neil Hodgson

      Neil Hodgson - 2021-05-09

      What scenarios does this affect? Reentrance (enteredStyling != 0) is rare so unlikely to play an important role.

       
  • Neil Hodgson

    Neil Hodgson - 2021-06-24
    • status: open --> closed
     
