#1514 jump to far line number where it has never been before takes 1 minute

Bug
closed-rejected
Neil Hodgson
None
5
2013-08-31
2013-08-14
Jim Michaels
No

On files that have 100,000 lines (about the limit for a 32-bit application, unfortunately), SCI_GOTOLINE takes about 1 minute on an i7-3970X (3.5GHz, 6C 12 thread) with 64GB RAM. It takes up to 30 minutes on an old Pentium 4 HT 2.8GHz (1C 2 thread) with 3GB RAM. This only happens when the jump spans a large number of lines and the editor has not been near that part of the file before.

In my other editor the jump is instant.

Discussion

  • Jim Michaels
    2013-08-14

    You aren't perchance reading/writing through the entire file with every change, are you? That's what it feels like. Hmm, no, actually it's slower than that, because I can write a C++ program that writes 100k lines very fast. In fact, I have processed a 3GB email inbox using fopen/fclose etc. in 6 minutes. Using CreateFile/ReadFile/WriteFile/CloseHandle according to the MSDN 64-bit filesystem strategy was extremely slow (6 hours) for some reason I can't explain, but fopen/fread/fwrite/fsetpos/fgetpos are 64-bit and work fine for large files past the 32-bit 2GiB or 4GiB limitation. Beware, though, of conversions to/from int or long, and of using int or long for array indexes or for a file position; use int64_t or uint64_t instead. I use std::vector with #include <initializer_list> instead of arrays, and std::map for associative arrays where you need to do abc["hello"]=12;.

     
    Last edit: Jim Michaels 2013-08-14
  • I guess it's due either to highlighting or line wrapping. Both do incremental work, so they don't process the parts that aren't needed yet; after the seek it may therefore have to highlight all 100k lines at once. And line wrapping is really slow, maybe particularly in Scintilla, not sure. If you have it enabled, try disabling it just to see the difference.

    And no, Scintilla doesn't touch files.
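
The incremental-work explanation above can be illustrated with a toy model. This is only a sketch of the idea, not Scintilla's actual code: the `LazyStyler` class, its `goto_line` method, and the use of line length as a stand-in for lexing cost are all hypothetical. It shows why the first jump to a far line pays for every line in between, while later jumps to the same region cost nothing.

```python
class LazyStyler:
    """Toy model of incremental styling (hypothetical, not Scintilla's code).

    Lines are styled only up to the furthest line viewed so far, because
    the lexer state at a line depends on the text before it.
    """

    def __init__(self, lines):
        self.lines = lines
        self.styled_up_to = 0  # number of leading lines already styled

    def goto_line(self, line):
        """Jump to a 0-based line; return the characters styled by this jump."""
        target = min(line + 1, len(self.lines))
        work = 0
        # Style every line between the last styled point and the target.
        for i in range(self.styled_up_to, target):
            work += len(self.lines[i])  # stand-in for the real lexing cost
        self.styled_up_to = max(self.styled_up_to, target)
        return work


document = ["x" * 50] * 100_000
styler = LazyStyler(document)
first_jump = styler.goto_line(99_999)   # styles all 100,000 lines
second_jump = styler.goto_line(99_999)  # nothing left to style
print(first_jump, second_jump)
```

In this model the cost of a jump grows with the amount of unstyled text before the target, which matches the report that the delay only appears when the editor has not been near that part of the file before.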

     
  • Neil Hodgson
    2013-08-14

    Source code lines normally average somewhere between 30 and 100 characters so 100,000 lines will commonly be between 3 and 10 MB so should be well below the limit for a 32-bit application.

    Scintilla takes about half a second to display line 100,000 of a 400 MB C++ file on an i7 870. Perhaps you have a file that contains many wide lines, so that lexical analysis up to the displayed area takes more time. If this is the case then features such as syntax styling, folding, and line wrapping should be turned off until you are happy with the performance.
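
The size estimate in this comment is simple arithmetic and can be checked with a short sketch. The helper name `estimated_size_mb` is made up for illustration, and the calculation assumes one byte per character and ignores line-end bytes.

```python
def estimated_size_mb(line_count, avg_chars_per_line):
    """Estimate file size in MB (10**6 bytes), assuming one byte per character."""
    return line_count * avg_chars_per_line / 1_000_000


# 100,000 lines at the two average line lengths quoted in the comment.
low = estimated_size_mb(100_000, 30)
high = estimated_size_mb(100_000, 100)
print(low, high)
```

Both figures are a few megabytes, far below what a 32-bit process can address.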

     
  • Jim Michaels
    2013-08-15

    It takes about 1/2 second on my i7-3970X for 100k lines (SciTE's max). It should be instantaneous.

    The memory usage of SciTE causes an out-of-memory error at 200,000 lines. I have 64GB of memory, but because it's 32-bit, this is a problem. It has large overhead, I guess.

     
  • Jim Michaels
    2013-08-15

    A test file was 162313278 bytes, had 2,115,219 lines, and consumed 358720KiB peak according to Task Manager (so I was wrong about the line numbers; it's more about size, I think), but I could not double it before getting an out of memory error.

    A max filesize test showed me that a 7.8MB HTML file can be loaded and multiplied 31 times and then it will give the out of memory error, which is 241.8MB, so that's approximately the max filesize. The memory consumed was 831372KiB, so the overhead ratio was 831372*1024/(31*7839789)=3.502912:1, so I think the overhead ratio is not too bad considering.

    I was wondering if I can get the goto line number to be instant. I was going to say close this bug, but I forgot the original title and got sidetracked.

     
    Last edit: Jim Michaels 2013-08-15
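
The overhead ratio quoted in this comment can be reproduced as follows. This is a sketch under one assumption: the expression in the comment is read as 831372 × 1024 / (31 × 7839789), that is, peak memory in KiB converted to bytes and divided by 31 copies of a 7839789-byte file. The function name `overhead_ratio` is made up for illustration.

```python
KIB = 1024  # bytes per KiB


def overhead_ratio(peak_memory_kib, file_size_bytes):
    """Ratio of peak process memory (given in KiB) to file size in bytes."""
    return peak_memory_kib * KIB / file_size_bytes


# Second test: HTML file repeated 31 times, 831372 KiB consumed.
ratio_html = overhead_ratio(831372, 31 * 7839789)

# First test: 162313278-byte file, 358720 KiB peak.
ratio_first = overhead_ratio(358720, 162313278)

print(round(ratio_html, 6), round(ratio_first, 3))
```

With those inputs the second test gives the 3.502912 figure from the comment, and the first test comes out lower, at roughly 2.26 bytes of memory per byte of file.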
  • Neil Hodgson
    2013-08-16

    • status: open --> open-rejected
    • assigned_to: Neil Hodgson
     
  • Neil Hodgson
    2013-08-31

    • status: open-rejected --> closed-rejected