Hello Aaron/Andrew,
I've been using A5 for a while, mostly without setbacks. However, I sometimes get an error from the A5qc.jar command that is usually solved by increasing the available memory or by subsampling the data.
If it is not too much to ask, could you explain the difference between the errors I usually get with this command: "OutOfMemoryError: GC overhead limit exceeded", "OutOfMemoryError: Java heap space" and "ArrayIndexOutOfBoundsException"? Also, would it be possible to modify this command to lower its memory use?
Here's the error portion of the log file. I've also attached the full log files in case you need something from them. By the way, in this case, OutOfMemory and ArrayIndexOutOfBoundsException errors come from different projects.
FIRST ERROR
[bam_sort_core] merging from 12 files...
[a5] java -Xmx31870m -jar A5qc.jar Pool1.s4/Pool1.qc.libraw1.sam Pool1.crude.scaffolds.fasta Pool1.s4/Pool1.qc.libraw1.broken.fasta 1 > Pool1.s4/Pool1.qc.libraw1.qc.out
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.TreeMap.put(Unknown Source)
at java.util.TreeSet.add(Unknown Source)
at org.halophiles.assembly.qc.MatchPoint.addNeighbor(MatchPoint.java:57)
at org.halophiles.assembly.qc.SpatialClusterer.locateNeighbors(SpatialClusterer.java:254)
at org.halophiles.assembly.qc.SpatialClusterer.buildReadPairClusters(SpatialClusterer.java:182)
at org.halophiles.assembly.qc.MisassemblyBreaker.main(MisassemblyBreaker.java:208)
[a5] Error in detecting misassemblies.
SECOND ERROR
...
[samopen] SAM header is present: 284 sequences.
[bam_sort_core] merging from 13 files...
[a5] java -Xmx45000m -jar A5qc.jar Pool1.s4/Pool1.qc.libraw1.sam Pool1.crude.scaffolds.fasta Pool1.s4/Pool1.qc.libraw1.broken.fasta 1 > Pool1.s4/Pool1.qc.libraw1.qc.out
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Unknown Source)
at java.util.Vector.grow(Unknown Source)
at java.util.Vector.ensureCapacityHelper(Unknown Source)
at java.util.Vector.addAll(Unknown Source)
at org.halophiles.assembly.qc.SpatialClusterer.expandClusters(SpatialClusterer.java:303)
at org.halophiles.assembly.qc.SpatialClusterer.runDBSCAN(SpatialClusterer.java:279)
at org.halophiles.assembly.qc.SpatialClusterer.buildReadPairClusters(SpatialClusterer.java:183)
at org.halophiles.assembly.qc.MisassemblyBreaker.main(MisassemblyBreaker.java:208)
[a5] Error in detecting misassemblies.
THIRD ERROR
...
[samopen] SAM header is present: 123 sequences.
[bam_sort_core] merging from 15 files...
[a5] java -Xmx96679m -jar A5qc.jar AM.s4/AM.qc.libraw1.sam AM.crude.scaffolds.fasta AM.s4/AM.qc.libraw1.broken.fasta 1 > AM.s4/AM.qc.libraw1.qc.out
java.lang.ArrayIndexOutOfBoundsException
at org.halophiles.assembly.qc.MisassemblyBreaker.loadData(MisassemblyBreaker.java:650)
at org.halophiles.assembly.qc.MisassemblyBreaker.main(MisassemblyBreaker.java:193)
[a5] Error in detecting misassemblies.
Thank you very much in advance for your time.
Sincerely,
Santiago
Hi Santiago, can you please let us know exactly which version of A5 you are using? I ask because the ArrayIndexOutOfBoundsException looks like something that may have been fixed in recent releases.
Hi Aaron, I'm using the latest version, a5pipeline-20141120.
Hi Aaron, do you have any news about this? Thanks.
Hello!
First of all thanks a lot for this amazing software.
We have experienced some problems with A5qc.jar (A5-miseq version 20150522):
[a5] java -Xmx103125m -jar A5qc.jar Set13aA8_S8.s4/Set13aA8_S8.qc.libraw1.sam Set13aA8_S8.crude.scaffolds.fasta Set13aA8_S8.s4/Set13aA8_S8.qc.libraw1.broken.fasta 1 > Set13aA8_S8.s4/Set13aA8_S8.qc.libraw1.qc.out
java.lang.ArrayIndexOutOfBoundsException: -1
at org.halophiles.assembly.qc.MisassemblyBreaker.loadData(MisassemblyBreaker.java:650)
at org.halophiles.assembly.qc.MisassemblyBreaker.main(MisassemblyBreaker.java:193)
[a5] Error in detecting misassemblies.
Could you please help us with this? Do you have any idea of why this is happening?
<offtopic>:
We also observed that, when assembling a given set of reads, A5-miseq version 20150522 generally gives us significantly fewer scaffolds than a5_miseq_linux_20140521. Is this expected?
</offtopic>
Last edit: Luca 2016-03-14
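For anyone reading the "ArrayIndexOutOfBoundsException: -1" trace above: this exception is thrown when an array is accessed with an index that is negative or not smaller than its length. It is unrelated to heap size, so raising -Xmx is not expected to help, unlike the two OutOfMemoryError cases. On older Java versions the "-1" in the message is the offending index itself. One common way to end up with -1 is a lookup that fails and returns -1, which is then used as an index. Whether that is what happens at MisassemblyBreaker.java:650 is not known from this thread. The sketch below only illustrates the general pattern; the class name, scaffold names and lengths are hypothetical and not taken from A5.

```java
import java.util.Arrays;
import java.util.List;

public class IndexLookupDemo {
    // Hypothetical scaffold names and lengths; not taken from A5 or the logs above.
    static final List<String> NAMES = Arrays.asList("scaffold_1", "scaffold_2", "scaffold_3");
    static final int[] LENGTHS = {1200, 850, 430};

    // Returns true if looking up `name` triggers ArrayIndexOutOfBoundsException.
    static boolean lookupFails(String name) {
        int idx = NAMES.indexOf(name); // indexOf returns -1 when the name is absent
        try {
            int length = LENGTHS[idx]; // throws when idx is -1
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(lookupFails("scaffold_2"));     // prints false
        System.out.println(lookupFails("unknown_contig")); // prints true
    }
}
```

If something similar is going on here, a mismatch between the sequence names in the .sam file and those in the scaffolds FASTA would be one thing to check, but that is only a guess.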