
#428 Negative index fix in SegmentBuffer for files beyond 1GB

Status: closed-later
Owner: nobody
Labels: None
Priority: 5
Updated: 2012-05-26
Created: 2012-05-25
Creator: Thomas Meyer
Private: No

Large file support is going to be really hard in Java/jEdit...

Discussion

  • Thomas, when I open a 200MB file, jEdit allocates 800MB of heap (Win7 64-bit, Java 6); this is plausible since Java chars take 2 bytes each, so the text alone occupies roughly 400MB as a char[], before any copies made while growing the buffer. So if you want to edit a 1GB file you need roughly 4GB of heap. I don't have enough memory to experiment with files over 1GB. I bet 99% of jEdit users don't need to edit 1GB files, so for me there's no need to pick up this subject.

    You may of course submit a bug report. With files over 1GB, other exceptions will probably start to arise, and you may quote them. If you submit a bug report, please link back to this ticket, so that a dev interested in large file support may reopen it.

    I can't find the additional info (that Thomas provided) in our mailing list archives, so I copy it here:

    QUOTE:
    okay, I see. The problem arises with files greater than Integer.MAX_VALUE.
    I guess it will be really hard to implement handling of files > 2GB,
    since, for example, array indexing only uses an "int" type, as do all the
    Segment and Buffer classes. Is this a bug, or is it documented somewhere
    that jEdit only supports files up to 2GB? If you think this is a bug I'll
    open a bug report for it; otherwise feel free to reject this patch.
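
    To make the int limit described in the quote concrete, here is a minimal
    sketch (the 3GB figure and the class name are made up for illustration):

        // Java array lengths and indices are ints, so a char[] can hold at
        // most Integer.MAX_VALUE (2_147_483_647) elements; casting a larger
        // long file size down to int wraps around to a negative value.
        public class ArrayLimitDemo {
            public static void main(String[] args) {
                long fileSize = 3L * 1024 * 1024 * 1024; // a hypothetical 3GB file
                int asIndex = (int) fileSize;            // wraps to -1073741824
                System.out.println(asIndex);
                // new char[asIndex] would throw NegativeArraySizeException.
            }
        }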

    • status: open --> closed-later
    • summary: Negative index fix for SegmentBuffer --> Negative index fix in SegmentBuffer for files beyond 1GB
  • large file generation in c
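
    The C program itself isn't preserved above; as a stand-in, here is a
    sketch of an equivalent generator in Java (the file name and 1.5GB size
    are arbitrary):

        import java.io.IOException;
        import java.io.RandomAccessFile;

        // Creates a ~1.5GB zero-filled test file for exercising jEdit's
        // large-file handling.
        public class BigFileGen {
            public static void main(String[] args) throws IOException {
                try (RandomAccessFile f = new RandomAccessFile("big-test.txt", "rw")) {
                    f.setLength(1536L * 1024 * 1024); // ~1.5GB
                }
            }
        }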

  • In fact, part of the problem in jEdit here is the ensureCapacity method
    (see the sketches after this comment):

    When opening a big file, the data is stored in a char[] inside the
    ContentManager. To keep it simple: if the file is 100MB, the char[]
    length is 100,000,000. When you type anything, ensureCapacity() is
    called, like this:

    ensureCapacity(length + len + 1024);

    where length is the current size and len is the size of the inserted
    data. But in the ensureCapacity() method, if the new length is bigger
    than the current length, we create a new char[] with twice the asked
    length:

    2*(length+len+1024)

    Once the buffer holds more than about 1GB of chars, that doubled value
    exceeds Integer.MAX_VALUE and wraps to a negative int, which is where
    the negative index in this ticket's title comes from.

    So for a short time we have a char[] of 100,000,000 chars and another
    one of more than twice that size, i.e. we need 3x the file size in
    memory. Maybe we could try removing that 2x multiplier and see what
    happens. It should use less memory but do more array copies; I don't
    know whether that would cause performance problems or not.

    Another problem is that the buffer never releases chars until it is
    closed. So if you open a big file and then delete half of it, the
    memory usage remains the same.

    Another option (but more complex) would be to split that char[] into
    smaller ones, but inserting and removing text would then be a complex
    operation.
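
    A minimal sketch of the overflow described above (the class name and the
    1.1-billion-char starting size are made up for illustration; this is not
    jEdit's actual ContentManager code):

        // Past ~1GB of chars, doubling the asked capacity overflows int
        // and yields a negative "length".
        public class EnsureCapacityOverflow {
            public static void main(String[] args) {
                int length = 1_100_000_000;      // current buffer size in chars
                int len = 10;                    // a small insertion
                int asked = length + len + 1024; // 1_100_001_034, still a valid int
                int doubled = 2 * asked;         // exceeds Integer.MAX_VALUE, wraps negative
                System.out.println(doubled);     // prints -2094965228
                // new char[doubled] would throw NegativeArraySizeException.
            }
        }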
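    And a hedged sketch of one possible alternative growth policy (the
    newCapacity method is hypothetical, not jEdit's API): grow by ~1.5x the
    way java.util.ArrayList does, computing in long so the intermediate
    value cannot wrap, and clamping at Integer.MAX_VALUE:

        public class SafeGrowth {
            // Returns a capacity >= needed, growing the current capacity by
            // ~1.5x; long arithmetic avoids the int overflow shown above.
            static int newCapacity(int current, int needed) {
                long grown = Math.max((long) needed, current + ((long) current >> 1));
                return grown > Integer.MAX_VALUE ? Integer.MAX_VALUE : (int) grown;
            }

            public static void main(String[] args) {
                System.out.println(newCapacity(1_100_000_000, 1_100_001_034));
                // prints 1650000000 instead of wrapping negative
            }
        }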

  • Thomas Meyer
    2012-05-29

    "I can't find the additional info (that Thomas provided) in our mailing
    list archives. So I copy it here:" -> This is probably because I posted as a non-subscriber to the mailing list, but this should be fixed for future postings.