From: Alireza K. <ali...@gm...> - 2013-03-05 17:01:21
Thanks for your answers. I did try sorting on queryname, but as Jacob had
guessed, "the unpaired mates will pile up in memory anyway", so I again got
an out-of-memory error ("Java heap space").

Does anyone have a guess as to how a BAM file could be generated that is
incompatible with SamToFastq in this way?

Alireza

On Tue, Mar 5, 2013 at 6:37 AM, Jacob Grydholt <jgr...@cl...> wrote:

> Hi,
>
> I wonder if this could be related to the other issue Alireza reported,
> where a lot of reads were missing their mate. From what I can gather from
> SamToFastq.java, the program holds on to a mate until its companion is
> found. When you have >5M missing mates, you could run out of memory.
>
> According to the SAM format, two reads must have identical names in order
> to be regarded as paired. I have come across BAM files where the pairing
> has been recorded by suffixes added to the read names in the style of:
>
> READ_NAME/1 163 ref 7 30 8M2I4M1D3M = 37 39 TTAGATAAAGGATACTG * # WRONG
> READ_NAME/2 83 ref 37 30 9M = 7 -39 CAGCGCCAT *
>
> where the correct style would be:
>
> READ_NAME 163 ref 7 30 8M2I4M1D3M = 37 39 TTAGATAAAGGATACTG * # CORRECT
> READ_NAME 83 ref 37 30 9M = 7 -39 CAGCGCCAT *
>
> If this is the root cause, I don't think sorting on queryname will help,
> since the unpaired mates will pile up in memory anyway. The ValidateSamFile
> command that Alec described in the previous mail can be used to diagnose
> the issue.
>
> Kind regards,
>
> Jacob Grydholt, Senior Developer
> CLC bio
>
> On 03/04/13 19:24, Alec Wysoker wrote:
>
> Hi Alireza,
>
> MAX_RECORDS_IN_RAM doesn't have any effect in SamToFastq. If the input
> file is queryname sorted, the memory footprint should be quite low. Try
> queryname sorting the input BAM and then running SamToFastq on the sorted
> file.
>
> -Alec
>
> You probably need to reduce MAX_RECORDS_IN_RAM
>
> On Mar 4, 2013, at 11:24 AM, Alireza Kashani wrote:
>
> Hi there,
>
> While converting a fairly large BAM file (9 GB) to FASTQ format, I got
> "GC overhead limit exceeded". I tried different approaches, such as
> assigning a temporary directory to Java, adding -XX:-UseGCOverheadLimit,
> or reducing MAX_RECORDS_IN_RAM. All failed, and I got different errors.
>
> The original error:
>
> INFO 2013-03-04 09:18:41 SamToFastq Processed 74,000,000 records. Elapsed time: 00:46:33s. Time for last 1,000,000: 336s. Last read position: chr10:112,337,223
> [Mon Mar 04 09:28:16 EST 2013] net.sf.picard.sam.SamToFastq done. Elapsed time: 56.16 minutes.
> Runtime.totalMemory()=3196256256
> FAQ: http://sourceforge.net/apps/mediawiki/picard/index.php?title=Main_Page
> Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at net.sf.samtools.SAMUtils.phredToFastq(SAMUtils.java:355)
>     at net.sf.samtools.SAMUtils.phredToFastq(SAMUtils.java:343)
>     at net.sf.samtools.SAMRecord.getBaseQualityString(SAMRecord.java:247)
>     at net.sf.picard.sam.SamToFastq.writeRecord(SamToFastq.java:250)
>     at net.sf.picard.sam.SamToFastq.doWork(SamToFastq.java:160)
>     at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:177)
>     at net.sf.picard.sam.SamToFastq.main(SamToFastq.java:119)
>
> The error after running with -XX:-UseGCOverheadLimit or reducing
> MAX_RECORDS_IN_RAM:
>
> INFO 2013-03-03 23:09:17 SamToFastq Processed 77,000,000 records. Elapsed time: 01:38:13s. Time for last 1,000,000: 2,235s. Last read position: chr11:46,772,136
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> # SIGSEGV (0xb) at pc=0x00002b05033435ec, pid=835, tid=47300522781568
> #
> # JRE version: 6.0-b16
> # Java VM: OpenJDK 64-Bit Server VM (14.0-b16 mixed mode linux-amd64 )
> # Distribution: Custom build (Thu May 13 08:22:34 EDT 2010)
> # Problematic frame:
> # V [libjvm.so+0x5875ec]
> #
> # An error report file with more information is saved as:
> # /medpop/mpg-psrl/Parabase/Projects/Yales/hs_err_pid835.log
> #
> # If you would like to submit a bug report, please include
> # instructions how to reproduce the bug and visit:
> # http://icedtea.classpath.org/bugzilla
> #
> /local/scratch/1362360291.259808: line 8: 835 Aborted (core dumped)
> java -Xmx8g -XX:-UseGCOverheadLimit -jar
> /seq/software/picard/current/bin/SamToFastq.jar
> INPUT=/medpop/mpg-psrl/Parabase/BAM_files/EX_c1005LEBa_1ln_hg19.bam
> FastQ=/medpop/mpg-psrl/Parabase/Projects/Yales/EX_c1005LEBa_1ln_hg19/EX_c1005LEBa_1ln_hg19.fq1
> SECOND_END_FASTQ=/medpop/mpg-psrl/Parabase/Projects/Yales/EX_c1005LEBa_1ln_hg19/EX_c1005LEBa_1ln_hg19.fq2
> NON_PF=true RE_REVERSE=true VALIDATION_STRINGENCY=SILENT
> MAX_RECORDS_IN_RAM=300000
>
> Any comments?
>
> Thanks
>
> _______________________________________________
> Samtools-help mailing list
> Sam...@li...
> https://lists.sourceforge.net/lists/listinfo/samtools-help
>
> --
> /grydholt
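
Jacob's point about mismatched read names can be illustrated with a small sketch. This is not the SamToFastq code; it is a simplified Python model of "hold a read until its mate shows up", and the names `strip_mate_suffix` and `pair_reads` are made up for the illustration. It assumes pairing is done purely by read name, as Jacob describes.

```python
import re
from typing import Dict, Iterable, List, Tuple

# Hypothetical helper (not part of Picard): matches a trailing "/1" or "/2".
_MATE_SUFFIX = re.compile(r"/[12]$")


def strip_mate_suffix(name: str) -> str:
    """Return the read name without a trailing /1 or /2 mate suffix."""
    return _MATE_SUFFIX.sub("", name)


def pair_reads(
    names: Iterable[str], normalize: bool = False
) -> Tuple[List[Tuple[str, str]], int]:
    """Pair reads by name in a single pass.

    A read is kept in `pending` until a read with the same key arrives.
    Returns the pairs found and the number of reads still waiting,
    which is what would stay in memory in this simplified model.
    """
    pending: Dict[str, str] = {}
    pairs: List[Tuple[str, str]] = []
    for raw in names:
        key = strip_mate_suffix(raw) if normalize else raw
        if key in pending:
            pairs.append((pending.pop(key), raw))
        else:
            pending[key] = raw
    return pairs, len(pending)


if __name__ == "__main__":
    reads = ["READ_NAME/1", "READ_NAME/2"]
    # With suffixed names the two mates never match, so both stay pending.
    print(pair_reads(reads))
    # After stripping the suffix the mates share a key and are paired.
    print(pair_reads(reads, normalize=True))
```

In this model every read with a suffixed name stays pending until the end of the input, regardless of sort order, which matches Jacob's remark that queryname sorting would not help if the names differ between mates.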