From: Tim F. <tfe...@br...> - 2010-02-19 19:12:16
This isn't strictly true. If the SAM file you are validating contains paired-end (PE) data, then ValidateSamFile may use more memory for larger files. Since each record has information about its mate and vice versa, ValidateSamFile caches some information about each paired-end read until it finds the other end of the pair. It does this in a Map using the read name as the key and a small class as the value; the value uses approximately 30 bytes per read. If the read names are on average 20-30 characters long, this could use up ~60 bytes per read being tracked.

If your data contains either lots of chimeric pairs (where the two records are very far apart in the file) or lots of paired-end data with only one end mapped, then this could chew up quite a bit of memory, and would presumably peak about halfway through the file or thereabouts.

If I've done my math right, assuming some small overhead for other data, this would mean that it could track about 32-35 million reads in 2GB of RAM before running out.

-t

On Feb 19, 2010, at 1:50 PM, Alec Wysoker wrote:

> Hi Holly,
>
> ValidateSamFile should not require RAM proportional to BAM file size.
> Most Picard programs will work fine with -Xmx2g passed to java (2GB of
> Java heap space). Please try that and let me know if it still isn't
> working.
>
> -Alec
>
> Holly Zheng Bradley wrote:
>> Hi there,
>>
>> I am using picard ValidateSamFile function and would like to know what
>> is the upper limit for the input BAM file size. A 63 BAM file choked
>> the run and threw the following error; I wonder if the error is
>> totally caused by the big input file size or there are other reasons.
>>
>> Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
>>     at net.sf.samtools.BinaryCigarCodec.decode(BinaryCigarCodec.java:70)
>>     at net.sf.samtools.BAMRecord.getCigar(BAMRecord.java:228)
>>     at net.sf.samtools.SAMRecord.validateCigar(SAMRecord.java:1159)
>>     at net.sf.picard.sam.SamFileValidator.validateCigar(SamFileValidator.java:241)
>>     at net.sf.picard.sam.SamFileValidator.validateSamRecords(SamFileValidator.java:186)
>>     at net.sf.picard.sam.SamFileValidator.validateSamFile(SamFileValidator.java:162)
>>     at net.sf.picard.sam.SamFileValidator.validateSamFileVerbose(SamFileValidator.java:122)
>>     at net.sf.picard.sam.ValidateSamFile.doWork(ValidateSamFile.java:122)
>>     at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:131)
>>     at net.sf.picard.sam.ValidateSamFile.main(ValidateSamFile.java:76)
>>
>> Thanks.
>>
>> Holly Zheng Bradley
>> EBI
>
> _______________________________________________
> Samtools-help mailing list
> Sam...@li...
> https://lists.sourceforge.net/lists/listinfo/samtools-help
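
For anyone who wants to see why the mate cache described at the top of this reply grows with unpaired or chimeric reads, here is a minimal sketch of the idea. It is not Picard's actual code: the class name MateCacheSketch, the PendingMate fields, and the helper estimateTrackableReads are all made up for illustration, and the 60 bytes per read figure is just the rough estimate from the reply.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a read-name-keyed mate cache; NOT Picard's implementation. */
class MateCacheSketch {

    /** Small value object standing in for the per-read info kept until the mate shows up. */
    static final class PendingMate {
        final int referenceIndex;
        final int alignmentStart;
        final boolean negativeStrand;

        PendingMate(int referenceIndex, int alignmentStart, boolean negativeStrand) {
            this.referenceIndex = referenceIndex;
            this.alignmentStart = alignmentStart;
            this.negativeStrand = negativeStrand;
        }
    }

    // Read name -> info about the end seen so far. Entries live until the other end arrives.
    private final Map<String, PendingMate> pending = new HashMap<>();
    private int peak = 0;

    /** Returns true if this record completed a pair (its mate was already cached). */
    boolean observe(String readName, int referenceIndex, int alignmentStart, boolean negativeStrand) {
        PendingMate mate = pending.remove(readName);
        if (mate != null) {
            // A real validator would cross-check the mate fields here.
            return true;
        }
        pending.put(readName, new PendingMate(referenceIndex, alignmentStart, negativeStrand));
        peak = Math.max(peak, pending.size());
        return false;
    }

    int pendingCount() {
        return pending.size();
    }

    int peakPendingCount() {
        return peak;
    }

    /** Back-of-envelope bound: how many cached reads fit in a heap, ignoring all other overhead. */
    static long estimateTrackableReads(long heapBytes, long bytesPerRead) {
        return heapBytes / bytesPerRead;
    }

    public static void main(String[] args) {
        MateCacheSketch cache = new MateCacheSketch();
        cache.observe("read1", 0, 100, false); // first end of read1: cached
        cache.observe("read2", 0, 150, false); // first end of read2: cached
        cache.observe("read1", 0, 300, true);  // second end of read1: entry released
        System.out.println("pending=" + cache.pendingCount() + " peak=" + cache.peakPendingCount());

        long twoGiB = 2L * 1024 * 1024 * 1024;
        System.out.println(estimateTrackableReads(twoGiB, 60));
    }
}
```

Entries are released only when the second end of a pair is seen, so reads whose mate is far away in the file (or never appears) stay in the map. Dividing a 2GB heap by roughly 60 bytes per read gives an upper bound of about 35 million tracked reads, which is in line with the 32-35 million estimate once some overhead is allowed for.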