#3 OutOfMemoryException with Large files

Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2005-01-12
Created: 2005-01-12
Creator: Anonymous
Private: No

Here is the code I use to create the delta file for patching
two files:

Rdiff rdiff = new Rdiff();

...

List sigs = rdiff.makeSignatures(InputStream);

List deltas = rdiff.makeDeltas(sigs, InputStream);
rdiff.writeDeltas(deltas, OutputStream);

When I use a file larger than 450MB, it reports:
Exception in thread "main" java.lang.OutOfMemoryError:
Java heap space

Is this an environment setting problem or an actual
problem in jarsync itself? Or am I supposed to configure
some settings before doing the patching?

Regards
Rossouw

Discussion

  • Casey Marshall

    Casey Marshall - 2007-01-20

    (not sure if you're still interested, in the intervening years...)

    There were some bugs in the ChecksumPair code, which prevented the hash search from working properly.

    The other issue is that the list of signatures may be very large, as may the list of deltas (the deltas may comprise the entire file, for example). It's much better to work with temporary files if you are processing very large files. Take a look at these methods:

    Rdiff.makeSignatures(InputStream, OutputStream);
    Rdiff.makeDeltas(List<ChecksumPair>, InputStream, OutputStream);

    These use the "streaming" API in a simple fashion, and should use little memory. You still need to keep the whole list of signatures in memory, of course, but you can mitigate the problems that causes by setting the 'blockSize' value to something larger for larger files.
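To make the blockSize trade-off concrete, here is a rough sketch of the arithmetic: the number of signatures is the file size divided by the block size, rounded up. The class name and the 64-byte per-signature figure are assumptions made for illustration only; they are not taken from jarsync, and the real per-signature cost depends on the JVM and on how ChecksumPair is implemented.

```java
// Hypothetical estimator, not part of jarsync.
final class SignatureBudget {

    // Assumed in-memory cost of one signature entry; illustrative only.
    static final long BYTES_PER_SIGNATURE = 64;

    /** Number of blocks (and therefore signatures) for a file, rounding up. */
    static long blockCount(long fileSize, int blockSize) {
        if (fileSize < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("fileSize must be >= 0 and blockSize > 0");
        }
        return (fileSize + blockSize - 1) / blockSize;
    }

    /** Rough heap estimate for holding every signature in a list. */
    static long estimatedBytes(long fileSize, int blockSize) {
        return blockCount(fileSize, blockSize) * BYTES_PER_SIGNATURE;
    }

    public static void main(String[] args) {
        long fileSize = 450L * 1024 * 1024; // the 450MB case from the report
        for (int blockSize : new int[] {512, 2048, 8192, 32768}) {
            System.out.printf("blockSize=%d -> %d signatures, ~%d KiB%n",
                    blockSize,
                    blockCount(fileSize, blockSize),
                    estimatedBytes(fileSize, blockSize) / 1024);
        }
    }
}
```

Each quadrupling of the block size cuts the signature count to a quarter, at the cost of coarser matching, so the deltas may grow when the two files differ in many small places.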

    (I suggest getting the newest code from Subversion, by the way)

     
  •  Kostis Anagnostopoulos

    The algorithm generates excessive GC due to:
    1) excessive array cloning,
    2) excessive object and class construction (e.g. event objects, TwoKeys inner classes),
    3) storing results in intermediate lists.

    Using the -pipe option can only mitigate the 3rd problem.

    To mitigate all problems would require a partial re-write.
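As a rough illustration of what points 1 and 3 could look like after such a rewrite, the sketch below reads a stream block by block into a single reused buffer and hands each block to a callback instead of cloning it into a list. This is not jarsync code; BlockStreamer, BlockHandler and forEachBlock are hypothetical names invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of jarsync.
final class BlockStreamer {

    /** Called once per block; buf is reused, so copy it if it must be kept. */
    interface BlockHandler {
        void handle(byte[] buf, int len) throws IOException;
    }

    /** Streams the input in blocks of blockSize bytes; returns the block count. */
    static long forEachBlock(InputStream in, int blockSize, BlockHandler handler)
            throws IOException {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        byte[] buf = new byte[blockSize]; // one allocation for the whole run
        long blocks = 0;
        while (true) {
            int filled = 0;
            // read() may return fewer bytes than requested, so keep filling.
            while (filled < blockSize) {
                int n = in.read(buf, filled, blockSize - filled);
                if (n < 0) break; // end of stream
                filled += n;
            }
            if (filled == 0) break;      // nothing left to process
            handler.handle(buf, filled); // no clone, no intermediate list
            blocks++;
            if (filled < blockSize) break; // short final block
        }
        return blocks;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10];
        long[] total = {0};
        long blocks = forEachBlock(new ByteArrayInputStream(data), 4,
                (buf, len) -> total[0] += len);
        System.out.println(blocks + " blocks, " + total[0] + " bytes");
    }
}
```

The handler sees each block exactly once, so heap use stays near one block regardless of file size. Point 2 (per-event object construction) is not addressed by this sketch.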

     
