Re: [svtoolkit-help] GenomeSTRiP error (Mismatched read pair records)
From: Bob H. <han...@br...> - 2011-08-31 16:17:03
Hi, Hyun,

Genome STRiP is seeing 4 reads with the same IDs - they look on cursory inspection like they might be identical (i.e. the same read pair is seen twice). Did you perhaps include the same input bam file twice on the command line?

If that's not it, I would try running Picard's ValidateSamFile on the problem bam file and see what it says.

-Bob

On 8/31/11 11:45 AM, Hyun Ji Noh wrote:
> Hi,
>
> I'm trying to use GenomeSTRiP to call CNVs for targeted sequencing in dogs. I created a mask fasta file using ComputeGenomeMask and now I have been trying to call CNVs using the discovery.sh script that is provided in the GenomeSTRiP package.
>
> Just to describe what I've been trying, I changed the config file as instructed in the wiki page to adjust for targeted sequencing and created a gender map file as well. I have multiple input bam files so I added several -I options, too.
>
> When I ran the modified discovery.sh script, the first type of error message I got was:
>
> ##### ERROR MESSAGE: Fasta file is not indexed: canFam2/work/Canis_lupus_familiaris_assembly2.mask.fasta
>
> so I created a fai file for the mask fasta file using the following code:
>
> #!/bin/bash
>
> outdir=canFam2_1_index
> readLength=101
> reference=/path/Canis_lupus_familiaris_assembly2.mask.fasta
> export SV_DIR=/humgen/cnp04/bobh/svtoolkit/stable
>
> # These executables must be on your path.
> which java > /dev/null || exit 1
> which bwa > /dev/null || exit 1
>
> # The directory containing libbwa.so must be on your LD_LIBRARY_PATH
> export LD_LIBRARY_PATH=${SV_DIR}/bwa:${LD_LIBRARY_PATH}
>
> classpath="${SV_DIR}/lib/SVToolkit.jar:${SV_DIR}/lib/gatk/GenomeAnalysisTK.jar"
>
> mkdir -p ${outdir}/work
>
> localReference=${outdir}/work/`echo ${reference} | awk -F / '{ print $NF }'`
> if [ ! -e ${localReference} ]; then
>     ln ${reference} ${localReference} || exit 1
> fi
>
> java -cp ${classpath} -Xmx4g \
>     org.broadinstitute.sv.apps.IndexFastaFile \
>     -I ${localReference} \
>     -O ${localReference}.fai \
>     || exit 1
> bwa index -a bwtsw ${localReference} || exit 1
>
> Then I ran the modified discovery.sh script again and got the following error message:
>
> ##### ERROR MESSAGE: Mismatched read pair records: [
> {C0196ACXX110720:7:1101:10217:84864 83 chr9 15258469 37 15S86M chr9 15258470 -84 TGTGCTCTGCCGATCTGAGGGAGAAGCAGGCTCCATACAGGGAGCCTGACGCAGGACTCGATCCCAGGTCTCCAGGATCAGGCCCTGGGCTGAAGGTGGCG #####?55&52AABA9<3?5(:@A>5;(;=,3@>66;B@BB==))-4A5@C@DB=8F??8?0B)3GBC;?@GCB??:<;;C::;CEAA:4FBFDDD;D@@@ MD:Z:0A83T1 PG:Z:bwa RG:Z:C0196.7 AM:i:37 NM:i:2 SM:i:37 MQ:i:37 UQ:i:55 XT:i:86},
> {C0196ACXX110720:7:1101:10217:84864 83 chr9 15258469 37 15S86M chr9 15258470 -84 TGTGCTCTGCCGATCTGAGGGAGAAGCAGGCTCCATACAGGGAGCCTGACGCAGGACTCGATCCCAGGTCTCCAGGATCAGGCCCTGGGCTGAAGGTGGCG #####?55&52AABA9<3?5(:@A>5;(;=,3@>66;B@BB==))-4A5@C@DB=8F??8?0B)3GBC;?@GCB??:<;;C::;CEAA:4FBFDDD;D@@@ MD:Z:0A83T1 PG:Z:bwa RG:Z:C0196.7 AM:i:37 NM:i:2 SM:i:37 MQ:i:37 UQ:i:55 XT:i:86},
> {C0196ACXX110720:7:1101:10217:84864 163 chr9 15258470 37 86M15S chr9 15258469 84 GAGGGAGAAGCAGGCTCCATACAGGGAGCCTGACGCAGGACTCGATCCCAGGTCTCCAGGATCAGGCCCTGGGCTGAAGGTGGCGAGATCGGAAGAGCGTC ;1=BADDDFD:?C;@GHGHAEDHGGIAEDEDDHHDHDF@FCFBAFHIJE7;@C=C?ACD>6;;AEC@;=;AA?<<=B@@:,488<5<BDC298?@?A9952 MD:Z:83T1T0 PG:Z:bwa RG:Z:C0196.7 AM:i:37 NM:i:2 SM:i:37 MQ:i:37 UQ:i:43 XT:i:86},
> {C0196ACXX110720:7:1101:10217:84864 163 chr9 15258470 37 86M15S chr9 15258469 84 GAGGGAGAAGCAGGCTCCATACAGGGAGCCTGACGCAGGACTCGATCCCAGGTCTCCAGGATCAGGCCCTGGGCTGAAGGTGGCGAGATCGGAAGAGCGTC ;1=BADDDFD:?C;@GHGHAEDHGGIAEDEDDHHDHDF@FCFBAFHIJE7;@C=C?ACD>6;;AEC@;=;AA?<<=B@@:,488<5<BDC298?@?A9952 MD:Z:83T1T0 PG:Z:bwa RG:Z:C0196.7 AM:i:37 NM:i:2 SM:i:37 MQ:i:37 UQ:i:43 XT:i:86} ]
>
> Then I thought the indexing must be the problem, so I used the original fasta file's fai file for the mask fasta file. But then I still got:
>
> ##### ERROR MESSAGE: Mismatched read pair records: [
> {C0196ACXX110720:7:1101:10217:84864 83 chr9 15258469 37 15S86M chr9 15258470 -84 TGTGCTCTGCCGATCTGAGGGAGAAGCAGGCTCCATACAGGGAGCCTGACGCAGGACTCGATCCCAGGTCTCCAGGATCAGGCCCTGGGCTGAAGGTGGCG #####?55&52AABA9<3?5(:@A>5;(;=,3@>66;B@BB==))-4A5@C@DB=8F??8?0B)3GBC;?@GCB??:<;;C::;CEAA:4FBFDDD;D@@@ MD:Z:0A83T1 PG:Z:bwa RG:Z:C0196.7 AM:i:37 NM:i:2 SM:i:37 MQ:i:37 UQ:i:55 XT:i:86},
> {C0196ACXX110720:7:1101:10217:84864 83 chr9 15258469 37 15S86M chr9 15258470 -84 TGTGCTCTGCCGATCTGAGGGAGAAGCAGGCTCCATACAGGGAGCCTGACGCAGGACTCGATCCCAGGTCTCCAGGATCAGGCCCTGGGCTGAAGGTGGCG #####?55&52AABA9<3?5(:@A>5;(;=,3@>66;B@BB==))-4A5@C@DB=8F??8?0B)3GBC;?@GCB??:<;;C::;CEAA:4FBFDDD;D@@@ MD:Z:0A83T1 PG:Z:bwa RG:Z:C0196.7 AM:i:37 NM:i:2 SM:i:37 MQ:i:37 UQ:i:55 XT:i:86},
> {C0196ACXX110720:7:1101:10217:84864 163 chr9 15258470 37 86M15S chr9 15258469 84 GAGGGAGAAGCAGGCTCCATACAGGGAGCCTGACGCAGGACTCGATCCCAGGTCTCCAGGATCAGGCCCTGGGCTGAAGGTGGCGAGATCGGAAGAGCGTC ;1=BADDDFD:?C;@GHGHAEDHGGIAEDEDDHHDHDF@FCFBAFHIJE7;@C=C?ACD>6;;AEC@;=;AA?<<=B@@:,488<5<BDC298?@?A9952 MD:Z:83T1T0 PG:Z:bwa RG:Z:C0196.7 AM:i:37 NM:i:2 SM:i:37 MQ:i:37 UQ:i:43 XT:i:86},
> {C0196ACXX110720:7:1101:10217:84864 163 chr9 15258470 37 86M15S chr9 15258469 84 GAGGGAGAAGCAGGCTCCATACAGGGAGCCTGACGCAGGACTCGATCCCAGGTCTCCAGGATCAGGCCCTGGGCTGAAGGTGGCGAGATCGGAAGAGCGTC ;1=BADDDFD:?C;@GHGHAEDHGGIAEDEDDHHDHDF@FCFBAFHIJE7;@C=C?ACD>6;;AEC@;=;AA?<<=B@@:,488<5<BDC298?@?A9952 MD:Z:83T1T0 PG:Z:bwa RG:Z:C0196.7 AM:i:37 NM:i:2 SM:i:37 MQ:i:37 UQ:i:43 XT:i:86} ]
>
> Now I'm not sure what else I can try to make the discovery module work. Could you give me any idea why this is happening? If you need more detailed information, please just let me know.
>
> Thanks for your help.
>
> Bests,
> Hyun Ji
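
The two checks suggested in the reply (the same bam file passed twice, or the same read pair present twice in the data) can be sketched in bash roughly as follows. This is only an illustration, not part of Genome STRiP or Picard: the function names `find_repeated_inputs` and `find_duplicate_records` are hypothetical, the file names in the comments are placeholders, and the second check assumes samtools is installed so that `samtools view` can stream the records as SAM text.

```bash
#!/bin/bash
# Hypothetical helper functions for illustration only; they are not
# provided by Genome STRiP, Picard, or samtools.

# Print every value that follows a -I option more than once
# in the given argument list.
find_repeated_inputs() {
    local prev=""
    local arg
    for arg in "$@"; do
        if [ "${prev}" = "-I" ]; then
            printf '%s\n' "${arg}"
        fi
        prev="${arg}"
    done | sort | uniq -d
}

# Read tab-separated SAM text on stdin and print each
# (read name, flag) combination that occurs more than once.
# Header lines (starting with "@") are skipped.
find_duplicate_records() {
    awk -F '\t' '
        !/^@/ { seen[$1 "\t" $2]++ }
        END   { for (k in seen) if (seen[k] > 1) print k }
    ' | sort
}

# Example usage (placeholder file names, assumes samtools is on the path):
#   find_repeated_inputs -I a.bam -I b.bam -I a.bam
#   samtools view problem.bam | find_duplicate_records | head
```

If the first function prints a file name, that file was listed twice. If the second prints anything, the bam file itself contains repeated records, and the ValidateSamFile run suggested above is probably the next thing to look at.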