From: Alec W. <al...@br...> - 2011-08-18 19:16:58
Hi Ming,

It certainly sounds like a corrupt file. You can try running samtools view on it to see whether samtools also chokes on it. You could also run

    gunzip -c s_3_export.sorted.bam > /dev/null

and see whether gzip chokes on it as well.

-Alec

On 8/18/11 3:09 PM, Yi, Ming (NIH/NCI) [C] wrote:
> Hi, Sean and Alec:
>
> Thanks to both of you for the suggestions. In fact, I just ran picard ValidateSamFile on this bam file, along with other "normal and good" bam files that run through picard AddOrReplaceReadGroups without problems.
>
> It turns out this bam behaves oddly with ValidateSamFile.
>
> I used the following command (paths removed for simplicity):
>
> java -Xms16g -Xmx16g -jar picard-tools-1.44/ValidateSamFile.jar INPUT=s_3_export.sorted.bam
> OUTPUT=s_3_export.sorted.bam_validReport.out
> IGNORE=MISSING_TAG_NM
>
> It gave an error message similar to the one from AddOrReplaceReadGroups:
>
> Exception in thread "main" java.lang.RuntimeException: java.util.zip.DataFormatException: invalid distance too far back
> at net.sf.samtools.util.BlockGunzipper.unzipBlock(BlockGunzipper.java:112)
> at net.sf.samtools.util.BlockCompressedInputStream.inflateBlock(BlockCompressedInputStream.java:320)
> at net.sf.samtools.util.BlockCompressedInputStream.readBlock(BlockCompressedInputStream.java:302)
> ....
>
> But it still writes output to the s_3_export.sorted.bam_validReport.out file, which contains only one line:
> ERROR: Read groups is empty
>
> Whereas if I run the same ValidateSamFile command on another "good" bam file that has no problem with picard AddOrReplaceReadGroups, I get normal output in its report file, as below:
> ERROR: Read groups is empty
> ERROR: Record 49818802, Read name NCI-GA3_39:2:120:4177:7990, CIGAR M operator maps off end of reference
> ERROR: Record 49818803, Read name NCI-GA3_39:2:83:12825:12531, CIGAR M operator maps off end of reference
> ERROR: Record 49818804, Read name NCI-GA3_39:2:9:1503:4509, CIGAR M operator maps off end of reference
> ERROR: Record 49818805, Read name NCI-GA3_39:2:96:18191:13964, CIGAR M operator maps off end of reference
> ERROR: Record 49818806, Read name NCI-GA3_39:2:7:19792:8943, CIGAR M operator maps off end of reference
> ERROR: Record 49818807, Read name NCI-GA3_39:2:17:10477:10243, CIGAR M operator maps off end of reference
> .....
>
> Does this mean this bam file (s_3_export.sorted.bam) is bad or corrupted?
> It gives a very similar error for both the picard ValidateSamFile and AddOrReplaceReadGroups commands:
>
> java.util.zip.DataFormatException: invalid distance too far back
> ......
>
> Thanks again,
>
> Ming
>
>
> -----Original Message-----
> From: Alec Wysoker [mailto:al...@br...]
> Sent: Thursday, August 18, 2011 12:10 PM
> To: Davis, Sean (NIH/NCI) [E]
> Cc: Yi, Ming (NIH/NCI) [C]; sam...@li...
> Subject: Re: [Samtools-help] invalid distance too far back error for picard AddOrReplaceReadGroups
>
> Also try ValidateSamFile.
>
> -Alec
>
> On 8/18/11 12:06 PM, Sean Davis wrote:
>> Hi, Ming.
>>
>>
>> Are you able to generate an index using samtools index on this file?
>> The .bam file may be corrupt and making an index is a quick way to
>> check. I'm definitely not saying that I know the cause of the
>> problem, though.
>>
>> Sean
>>
>>
>> On Thu, Aug 18, 2011 at 11:55 AM, Yi, Ming (NIH/NCI) [C]
>> <yi...@ma...> wrote:
>>> Hi, Dear list:
>>>
>>> I ran into an issue with picard AddOrReplaceReadGroups, as described below.
>>>
>>> The command I used is below (file paths simplified for clarity):
>>>
>>> java -Xms15g -Xmx15g -jar /opt/nasapps/stow/picard-tools-1.49/AddOrReplaceReadGroups.jar
>>> INPUT=s_3_export.sorted.bam
>>> OUTPUT=F14_w_RG.bam
>>> RGID=708BRAAXX_Sample_F14 RGLB=F14_Illumina RGPL=Illumina RGPU=708BRAAXX.lane_3 RGSM=F14 RGCN=NCI-CCR_SF VALIDATION_STRINGENCY=SILENT
>>>
>>>
>>> And I got the following error message:
>>>
>>> Exception in thread "main" java.lang.RuntimeException: java.util.zip.DataFormatException: invalid distance too far back
>>> at net.sf.samtools.util.BlockGunzipper.unzipBlock(BlockGunzipper.java:112)
>>> at net.sf.samtools.util.BlockCompressedInputStream.inflateBlock(BlockCompressedInputStream.java:320)
>>> at net.sf.samtools.util.BlockCompressedInputStream.readBlock(BlockCompressedInputStream.java:302)
>>> at net.sf.samtools.util.BlockCompressedInputStream.available(BlockCompressedInputStream.java:106)
>>> at net.sf.samtools.util.BlockCompressedInputStream.read(BlockCompressedInputStream.java:175)
>>> at java.io.DataInputStream.read(DataInputStream.java:149)
>>> at net.sf.samtools.util.BinaryCodec.readBytesOrFewer(BinaryCodec.java:394)
>>> at net.sf.samtools.util.BinaryCodec.readBytes(BinaryCodec.java:371)
>>> at net.sf.samtools.util.BinaryCodec.readByteBuffer(BinaryCodec.java:480)
>>> at net.sf.samtools.util.BinaryCodec.readInt(BinaryCodec.java:491)
>>> at net.sf.samtools.BAMRecordCodec.decode(BAMRecordCodec.java:159)
>>> at net.sf.samtools.BAMFileReader$BAMFileIterator.getNextRecord(BAMFileReader.java:486)
>>> at net.sf.samtools.BAMFileReader$BAMFileIterator.advance(BAMFileReader.java:460)
>>> at net.sf.samtools.BAMFileReader$BAMFileIterator.next(BAMFileReader.java:450)
>>> at net.sf.samtools.BAMFileReader$BAMFileIterator.next(BAMFileReader.java:417)
>>> at net.sf.samtools.SAMFileReader$AssertableIterator.next(SAMFileReader.java:629)
>>> at net.sf.samtools.SAMFileReader$AssertableIterator.next(SAMFileReader.java:607)
>>> at net.sf.picard.sam.AddOrReplaceReadGroups.doWork(AddOrReplaceReadGroups.java:91)
>>> at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:169)
>>> at net.sf.picard.cmdline.CommandLineProgram.instanceMainWithExit(CommandLineProgram.java:119)
>>> at net.sf.picard.sam.AddOrReplaceReadGroups.main(AddOrReplaceReadGroups.java:61)
>>> Caused by: java.util.zip.DataFormatException: invalid distance too far back
>>> at java.util.zip.Inflater.inflateBytes(Native Method)
>>> at java.util.zip.Inflater.inflate(Inflater.java:255)
>>> at net.sf.samtools.util.BlockGunzipper.unzipBlock(BlockGunzipper.java:96)
>>> ... 20 more
>>>
>>>
>>> In fact, I have 19 bam files in total; all the others work fine with picard AddOrReplaceReadGroups, and only this bam file has the issue.
>>> Any idea why that is?
>>>
>>> Thanks a lot in advance!
>>>
>>> Myi
>>>
>>> ABCC
>>> National Cancer Institute at Frederick,
>>> Frederick, MD 21702
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> Get a FREE DOWNLOAD! and learn more about uberSVN rich system,
>>> user administration capabilities and model configuration. Take
>>> the hassle out of deploying and managing Subversion and the
>>> tools developers use with it. http://p.sf.net/sfu/wandisco-d2d-2
>>> _______________________________________________
>>> Samtools-help mailing list
>>> Sam...@li...
>>> https://lists.sourceforge.net/lists/listinfo/samtools-help
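
The gunzip check suggested at the top of this message can also be approximated in Python when gunzip is not handy. This is only a minimal sketch, not part of samtools or Picard: the function name check_gzip_stream and its return shape are made up for illustration. It assumes that a BAM file is a series of concatenated gzip blocks (BGZF), which Python's gzip module reads as an ordinary multi-member gzip stream, so it decompresses the whole file and discards the output, much like gunzip -c file > /dev/null.

```python
import gzip
import zlib


def check_gzip_stream(path, chunk_size=1 << 20):
    """Decompress a gzip/BGZF file end to end, discarding the output.

    Hypothetical helper for illustration only.
    Returns (ok, bytes_decompressed, error_message).
    """
    total = 0
    try:
        with gzip.open(path, "rb") as handle:
            while True:
                chunk = handle.read(chunk_size)
                if not chunk:
                    break
                total += len(chunk)
    except (OSError, EOFError, zlib.error) as exc:
        # gzip.BadGzipFile (an OSError) covers bad headers and CRC mismatches,
        # EOFError covers truncated files, and zlib.error covers corrupt
        # deflate data, the likely counterpart of Java's DataFormatException.
        return False, total, str(exc)
    return True, total, ""
```

Calling check_gzip_stream("s_3_export.sorted.bam") on the suspect file and on one of the good bam files should show whether the failure is in the compression layer itself. A failure here would point at the file rather than at Picard, but a pass would not prove the BAM records are valid, so ValidateSamFile is still worth running.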