From: Michael J. C. <mic...@gm...> - 2010-05-16 23:52:08
Hi guys,
I've been trying to get 1102's tumor sample to run through Picard
MarkDuplicates for two weeks now. I resorted to splitting the whole-genome
BAM file by chromosome and running each piece separately. Some of the
pieces worked, but others crashed with the following error:
/usr/java/latest/bin/java -Xmx6G -jar
/share/apps/picard-tools-1.19/MarkDuplicates.jar I=1102T.LMP.chrY.bam
O=1102T.LMP.rmdup.chrY.bam M=1102T.LMP.rmdup.chrY.metrics
TMP_DIR=tmp.files/ REMOVE_DUPLICATES=TRUE VALIDATION_STRINGENCY=SILENT
MAX_RECORDS_IN_RAM=8000000
[Sun May 16 16:32:46 PDT 2010] net.sf.picard.sam.MarkDuplicates
INPUT=1102T.LMP.chrY.bam OUTPUT=1102T.LMP.rmdup.chrY.bam
METRICS_FILE=1102T.LMP.rmdup.chrY.metrics REMOVE_DUPLICATES=true
TMP_DIR=tmp.files VALIDATION_STRINGENCY=SILENT
MAX_RECORDS_IN_RAM=8000000 ASSUME_SORTED=false
MAX_SEQUENCES_FOR_DISK_READ_ENDS_MAP=50000
READ_NAME_REGEX=[a-zA-Z0-9]+:[0-9]:([0-9]+):([0-9]+):([0-9]+).*
OPTICAL_DUPLICATE_PIXEL_DISTANCE=100 VERBOSITY=INFO QUIET=false
COMPRESSION_LEVEL=5
INFO 2010-05-16 16:32:46 MarkDuplicates Start of doWork
freeMemory: 8658104; totalMemory: 9109504; maxMemory: 5726666752
INFO 2010-05-16 16:32:46 MarkDuplicates Reading input file and
constructing read end information.
INFO 2010-05-16 16:32:46 MarkDuplicates Will retain up to
22724868 data points before spilling to disk.
INFO 2010-05-16 16:33:07 MarkDuplicates Read 1000000 records.
Tracking 2526 as yet unmatched pairs. 2526 records in RAM. Last
sequence index: 23
INFO 2010-05-16 16:33:17 MarkDuplicates Read 2000000 records.
Tracking 6482 as yet unmatched pairs. 6482 records in RAM. Last
sequence index: 23
INFO 2010-05-16 16:33:27 MarkDuplicates Read 3000000 records.
Tracking 10160 as yet unmatched pairs. 10160 records in RAM. Last
sequence index: 23
INFO 2010-05-16 16:33:36 MarkDuplicates Read 4000000 records.
Tracking 68510 as yet unmatched pairs. 68510 records in RAM. Last
sequence index: 23
INFO 2010-05-16 16:33:46 MarkDuplicates Read 5000000 records.
Tracking 75782 as yet unmatched pairs. 75782 records in RAM. Last
sequence index: 23
INFO 2010-05-16 16:33:55 MarkDuplicates Read 6000000 records.
Tracking 97889 as yet unmatched pairs. 97889 records in RAM. Last
sequence index: 23
INFO 2010-05-16 16:34:03 MarkDuplicates Read 7000000 records.
Tracking 60997 as yet unmatched pairs. 60997 records in RAM. Last
sequence index: 23
INFO 2010-05-16 16:34:13 MarkDuplicates Read 8000000 records.
Tracking 61125 as yet unmatched pairs. 61125 records in RAM. Last
sequence index: 23
INFO 2010-05-16 16:34:22 MarkDuplicates Read 9000000 records.
Tracking 60597 as yet unmatched pairs. 60597 records in RAM. Last
sequence index: 23
INFO 2010-05-16 16:34:33 MarkDuplicates Read 10000000 records.
Tracking 58306 as yet unmatched pairs. 58306 records in RAM. Last
sequence index: 23
INFO 2010-05-16 16:34:41 MarkDuplicates Read 11000000 records.
Tracking 55412 as yet unmatched pairs. 55412 records in RAM. Last
sequence index: 23
INFO 2010-05-16 16:34:53 MarkDuplicates Read 12000000 records.
Tracking 51336 as yet unmatched pairs. 51336 records in RAM. Last
sequence index: 23
INFO 2010-05-16 16:35:02 MarkDuplicates Read 13000000 records.
Tracking 13051 as yet unmatched pairs. 13051 records in RAM. Last
sequence index: 23
[Sun May 16 16:35:04 PDT 2010] net.sf.picard.sam.MarkDuplicates done.
Runtime.totalMemory()=2987982848
Exception in thread "main" net.sf.samtools.FileTruncatedException:
Premature end of file
at
net.sf.samtools.util.BlockCompressedInputStream.readBlock(BlockCompressedInputStream.java:290)
at
net.sf.samtools.util.BlockCompressedInputStream.available(BlockCompressedInputStream.java:100)
at
net.sf.samtools.util.BlockCompressedInputStream.read(BlockCompressedInputStream.java:169)
at java.io.DataInputStream.read(DataInputStream.java:132)
at
net.sf.samtools.util.BinaryCodec.readBytesOrFewer(BinaryCodec.java:394)
at net.sf.samtools.util.BinaryCodec.readBytes(BinaryCodec.java:371)
at net.sf.samtools.util.BinaryCodec.readBytes(BinaryCodec.java:357)
at net.sf.samtools.BAMRecordCodec.decode(BAMRecordCodec.java:182)
at
net.sf.samtools.BAMFileReader$BAMFileIterator.getNextRecord(BAMFileReader.java:397)
at
net.sf.samtools.BAMFileReader$BAMFileIterator.advance(BAMFileReader.java:373)
at
net.sf.samtools.BAMFileReader$BAMFileIterator.next(BAMFileReader.java:363)
at
net.sf.samtools.BAMFileReader$BAMFileIterator.next(BAMFileReader.java:330)
at
net.sf.picard.sam.MarkDuplicates.buildSortedReadEndLists(MarkDuplicates.java:261)
at net.sf.picard.sam.MarkDuplicates.doWork(MarkDuplicates.java:112)
at
net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:150)
at net.sf.picard.sam.MarkDuplicates.main(MarkDuplicates.java:96)
It crashes immediately after printing "net.sf.picard.sam.MarkDuplicates
done." with this "Premature end of file" error. Does anyone have any ideas
how to get around this and get it to work?
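
One check that might narrow this down, offered as a rough sketch rather than anything from the Picard docs: the SAM/BAM specification defines a fixed 28-byte empty BGZF block as an end-of-file marker, so a per-chromosome BAM that lacks it may have been cut short when the genome was split. The helper name has_bgzf_eof is hypothetical, and files written by older tools may lack the marker even when they are intact, so a False result is only a hint.

```python
import os

# The 28-byte empty BGZF block that the SAM/BAM specification
# defines as the end-of-file marker.
BGZF_EOF = bytes.fromhex(
    "1f8b0804" "00000000" "00ff0600" "42430200"
    "1b000300" "00000000" "00000000"
)


def has_bgzf_eof(path):
    """Return True if the file ends with the BGZF end-of-file marker.

    Hypothetical helper: it only inspects the last 28 bytes and does
    not validate the rest of the file.
    """
    size = os.path.getsize(path)
    if size < len(BGZF_EOF):
        return False
    with open(path, "rb") as handle:
        # Jump to the final 28 bytes and compare them to the marker.
        handle.seek(size - len(BGZF_EOF))
        return handle.read() == BGZF_EOF
```

Calling has_bgzf_eof("1102T.LMP.chrY.bam") on each per-chromosome file would show which inputs end without the marker; those would be the first ones to regenerate before rerunning MarkDuplicates.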
Thanks,
Mike