From: Yi, M. (NIH/N. [C] <yi...@ma...> - 2011-08-03 19:35:11
Dear List,
I have run into an issue with Picard MarkDuplicates, described below.
The command I used was (file paths simplified for clarity):
java -Xms4g -Xmx4g -jar /opt/nasapps/stow/picard-tools-1.31/MarkDuplicates.jar
INPUT=F17_w_RG_reorder.bam
OUTPUT= F17_w_RG_reorder_dedup.bam METRICS_FILE=F17_w_RG_reorder.metricFile VALIDATION_STRINGENCY=SILENT
Here is the error message I got:
Exception in thread "main" net.sf.picard.PicardException: F17_w_RG_reorder.bam is not coordinate sorted.
at net.sf.picard.sam.MarkDuplicates.buildSortedReadEndLists(MarkDuplicates.java:248)
at net.sf.picard.sam.MarkDuplicates.doWork(MarkDuplicates.java:109)
at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:165)
at net.sf.picard.sam.MarkDuplicates.main(MarkDuplicates.java:93)
However, before this step, I used Picard ReorderSam.jar to reorder the BAM file, as shown below (file paths simplified for clarity):
java -Xms4g -Xmx4g -jar /opt/nasapps/stow/picard-tools-1.44/ReorderSam.jar INPUT=F17_w_RG.bam
OUTPUT= F17_w_RG_reorder.bam
REFERENCE= hg19.fa
VALIDATION_STRINGENCY=SILENT
ReorderSam seems to run fine without any error, and it generates the input for MarkDuplicates shown above. I also checked the BAM file, and it is indeed sorted by coordinate. So the error message does not make sense to me.
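To make the kind of check described above concrete: the SAM format stores a declared sort order in the `SO` field of the `@HD` header line, and that declaration is separate from the order in which the records actually appear. The sketch below looks at both. It is only an illustration under stated assumptions, not Picard code: the helper names `declared_sort_order` and `is_coordinate_ordered` are hypothetical, the header text and the (reference index, position) pairs are assumed to have been extracted from the BAM file beforehand, and only mapped records are considered.

```python
from typing import Iterable, Optional, Tuple


def declared_sort_order(header_text: str) -> Optional[str]:
    """Return the SO value from the @HD line of a SAM header, or None if absent."""
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            # Header fields are tab-separated TAG:VALUE pairs.
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
            return None
    return None


def is_coordinate_ordered(records: Iterable[Tuple[int, int]]) -> bool:
    """Check that (reference_index, position) pairs never decrease.

    Assumption: only mapped records are passed in; unmapped reads are
    not handled by this sketch.
    """
    previous = None
    for current in records:
        if previous is not None and current < previous:
            return False
        previous = current
    return True


# Made-up example data: records in coordinate order, header saying otherwise.
header = "@HD\tVN:1.0\tSO:unsorted\n@SQ\tSN:chr1\tLN:249250621\n"
print(declared_sort_order(header))                             # unsorted
print(is_coordinate_ordered([(0, 100), (0, 250), (1, 5)]))     # True
```

In this made-up example the two checks disagree: the records are in coordinate order while the header declares `unsorted`. Whether that is what happens with the file above is not established here; it is only one thing that could be compared.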
I also used ValidateSamFile.jar to validate the BAM file and got the output below, with a number of warnings:
WARNING: Record 1, Read name NCI-GA3_39:7:57:11042:10162, NM tag (nucleotide differences) is missing
WARNING: Record 2, Read name NCI-GA3_39:7:102:7670:15811, NM tag (nucleotide differences) is missing
WARNING: Record 3, Read name NCI-GA3_39:7:97:8390:6286, NM tag (nucleotide differences) is missing
... (100 similar warnings)
Any idea why I have this issue with Picard MarkDuplicates for this BAM file?
Thanks a lot in advance!
Myi
ABCC
National Cancer Institute at Frederick,
Frederick, MD 21702