Sergey Koren - 2015-11-27

Typically a human-sized genome requires 2-4 TB of disk space to run. The requirement can be higher for more repetitive genomes, since these have more overlaps.

There really isn't anything you can change to reduce the space usage. Most of it goes to storing the overlaps between raw sequences and the layouts for corrected reads. You can check that there are no *.dat files left in the temporary folder/1-overlapper directory and no *.ovb files in temporary folder/1-overlapper/001/*. You can estimate how much space you need based on your current usage. The asm.layout.err step converts the overlap store into a read-based layout, which is then used to generate corrected sequences, so it needs approximately double the asm.ovlStore's space to finish. Once the layouts are generated the asm.ovlStore can be removed, and the pipeline will remove it for you when it finishes generating corrected sequences.
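
As a rough way to do that estimate, here is a minimal Python sketch. It is not part of the pipeline. It adds up the size of the asm.ovlStore directory you pass it and compares twice that size against the free space on the same filesystem. The function names and the factor of 2 are my own assumptions based on the "approximately double" rule of thumb above, read here as the store plus about the same amount again for the layouts.

```python
import os
import shutil
import sys


def dir_size_bytes(path):
    """Sum the sizes of all regular files under path (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def estimate_layout_step(store_bytes, free_bytes, factor=2.0):
    """Return (total_needed, extra_needed, fits).

    factor=2.0 is an assumption taken from the "approximately double" rule
    of thumb; the store already occupies store_bytes, so only the remainder
    has to fit into the free space.
    """
    total_needed = int(store_bytes * factor)
    extra_needed = total_needed - store_bytes
    return total_needed, extra_needed, extra_needed <= free_bytes


if __name__ == "__main__":
    # Usage: python estimate_space.py /path/to/asm.ovlStore
    store = sys.argv[1]
    store_bytes = dir_size_bytes(store)
    free_bytes = shutil.disk_usage(store).free
    total, extra, fits = estimate_layout_step(store_bytes, free_bytes)
    gib = 1024 ** 3
    print(f"store size:        {store_bytes / gib:.1f} GiB")
    print(f"estimated total:   {total / gib:.1f} GiB")
    print(f"additional needed: {extra / gib:.1f} GiB")
    print(f"free on device:    {free_bytes / gib:.1f} GiB")
    print("looks sufficient" if fits else "likely to run out of space")
```

Treat the result as a ballpark figure only. The actual layout size depends on the data, so leave some headroom beyond what the script reports.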