svtoolkit-help Mailing List for Structural Variation Toolkit (Page 9)

Status: Beta

Brought to you by: bhandsaker

svtoolkit-help — bug reports and trouble shooting


[svtoolkit-help] STRiP Unrecognized sequence: 1:0-0

From: Jingerbread <ae4...@gm...> - 2011-10-28 13:04:22

Hi, we are very excited about GenomeStrip and have been trying to use
it. However, I got the following error. Could you please give some hints
on how to fix it?


##### ERROR ------------------------------------------------------------------------------------------
##### ERROR stack trace
java.lang.IllegalArgumentException: Unrecognized sequence: 1:0-0
	at org.broadinstitute.sv.queue.ComputeDiscoveryPartitions.computePartitions(ComputeDiscoveryPartitions.java:96)
	at org.broadinstitute.sv.qscript.SVQScript.computeDiscoveryPartitions(SVQScript.q:132)
	at SVDiscovery.script(SVDiscovery.q:19)
	at org.broadinstitute.sting.queue.QCommandLine$$anonfun$execute$1.apply(QCommandLine.scala:46)
	at org.broadinstitute.sting.queue.QCommandLine$$anonfun$execute$1.apply(QCommandLine.scala:43)
	at scala.collection.Iterator$class.foreach(Iterator.scala:631)
	at scala.collection.JavaConversions$JIteratorWrapper.foreach(JavaConversions.scala:549)
	at scala.collection.IterableLike$class.foreach(IterableLike.scala:79)
	at scala.collection.JavaConversions$JListWrapper.foreach(JavaConversions.scala:596)
	at org.broadinstitute.sting.queue.QCommandLine.execute(QCommandLine.scala:43)
	at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:239)
	at org.broadinstitute.sting.queue.QCommandLine$.main(QCommandLine.scala:117)
	at org.broadinstitute.sting.queue.QCommandLine.main(QCommandLine.scala)
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A GATK RUNTIME ERROR has occurred (version 1.0.5039M):
##### ERROR
##### ERROR Please visit to wiki to see if this is a known problem
##### ERROR If not, please post the error, with stack trace, to the GATK forum
##### ERROR Visit our wiki for extensive documentation http://www.broadinstitute.org/gsa/wiki
##### ERROR Visit our forum to view answers to commonly asked questions http://getsatisfaction.com/gsa
##### ERROR
##### ERROR MESSAGE: Unrecognized sequence: 1:0-0
##### ERROR ------------------------------------------------------------------------------------------

Re: [svtoolkit-help] Genome strip - Unrecognized sequence: 1:0-0

From: Ashish K. <as...@we...> - 2011-10-24 23:14:58

Thanks Bob.
Could you please clarify the parallelisation a bit more?

1. When -windowSize is set to, e.g., 3 Mb, does each window run in parallel, or is the setting only used for chunking, so that we could optionally run the windows separately and join the outputs later? Is it the same for SVGenotyper's -parallelJobs option?

2. Over various runs, I've noticed that the optimal number of cores for the program on a multi-core architecture seems to be 2. Is this correct, or can the program use more cores on an 8-core node with different settings?

Best,
Ashish.


From: Bob Handsaker [mailto:han...@br...]
Sent: 21 October 2011 16:49
To: Ashish Kumar
Cc: svt...@li...
Subject: Re: [svtoolkit-help] Genome strip - Unrecognized sequence: 1:0-0

Yes, you can use -L to process just a specific interval (or can pass a file with a list of intervals - the file extension must be .list).
Note that the SVDiscovery queue script also does parallelization internally, based on the -windowSize parameter.
For example, here are typical parameters to process the genome (or just the intervals selected with -L) in 3Mb windows for events between 100bp and 100Kb:

    -windowSize 3000000
    -windowPadding 100000
    -minimumSize 100
    -maximumSize 100000

You can invoke the queue script without '-run' to preview the chunking.

-Bob

On 10/21/11 11:30 AM, Ashish Kumar wrote:
Hi Bob,

On the same issue, if we want to use the -L option, would it be safe to presume that we can chunk up the chromosomes?
So, would something like "-L 20:1250000-2500000" be a valid option, assuming that this sequence exists in my reference genome?

Thanks,
Ashish


From: Bob Handsaker [mailto:han...@br...]
Sent: 06 October 2011 14:40
To: svt...@li...
Subject: Re: [svtoolkit-help] Genome strip - Unrecognized sequence: 1:0-0

This is because the example script uses "-L 1" (only process chromosome 1) to make it faster,
but you likely don't have a sequence named "1" in your reference genome.
To process the whole genome, simply remove the -L argument.
-Bob

On 10/5/11 2:50 PM, Axel Ericsson wrote:
Hi
I get the following error message when I run Genome strip. I have forwarded the modified shell script; I hope you can point me in the right direction.
Best regards Axel

INFO  14:37:52,013 QScriptManager - Compiling 2 QScripts
INFO  14:37:59,172 QScriptManager - Compilation complete
INFO  14:38:02,654 HelpFormatter - ---------------------------------------------------------
INFO  14:38:02,655 HelpFormatter - Program Name: org.broadinstitute.sting.queue.QCommandLine
INFO  14:38:02,655 HelpFormatter - Program Args: -S /seq/vsag/axel/tools/Genome_strip/svtoolkit/qscript/SVDiscovery.q -S /seq/vsag/axel/tools/Genome_strip/svtoolkit/qscript/SVQScript.q -gatk /seq/vsag/axel/tools/Genome_strip/svtoolkit/lib/gatk/GenomeAnalysisTK.jar -cp /seq/vsag/axel/tools/Genome_strip/svtoolkit/lib/SVToolkit.jar:/seq/vsag/axel/tools/Genome_strip/svtoolkit/lib/gatk/GenomeAnalysisTK.jar:/seq/vsag/axel/tools/Genome_strip/svtoolkit/lib/gatk/Queue.jar -configFile conf/genstrip_HCSMA_parameters.txt -tempDir ./tmpdir -R /seq/references/Canis_lupus_familiaris_assembly2/v0/Canis_lupus_familiaris_assembly2.fasta -genomeMaskFile /seq/vsag/hyunji/CCD/capture/genomestrip/canFam2_1_index/work/Canis_lupus_familiaris_assembly2.mask.fasta -genderMapFile data/HCSMA_gender.map -runDirectory HCSMA -md HCSMA/metadata -jobLogDir HCSMA/logs -L 1 -minimumSize 100 -maximumSize 1000000 -I /seq/vsag/axel/bamfiles/HCSMCRealignment.HCSMA_B90_Homo_1.clean.dedup.recal.bam -O HCSMA.discovery.vcf -run
INFO  14:38:02,656 HelpFormatter - Date/Time: 2011/10/05 14:38:02
INFO  14:38:02,656 HelpFormatter - ---------------------------------------------------------
INFO  14:38:02,657 HelpFormatter - ---------------------------------------------------------
INFO  14:38:02,661 QCommandLine - Scripting SVDiscovery
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR stack trace
java.lang.IllegalArgumentException: Unrecognized sequence: 1:0-0
at org.broadinstitute.sv.queue.ComputeDiscoveryPartitions.computePartitions(ComputeDiscoveryPartitions.java:96)
at org.broadinstitute.sv.qscript.SVQScript.computeDiscoveryPartitions(SVQScript.q:132)
at SVDiscovery.script(SVDiscovery.q:19)
at org.broadinstitute.sting.queue.QCommandLine$$anonfun$execute$1.apply(QCommandLine.scala:46)
at org.broadinstitute.sting.queue.QCommandLine$$anonfun$execute$1.apply(QCommandLine.scala:43)
at scala.collection.Iterator$class.foreach(Iterator.scala:631)
at scala.collection.JavaConversions$JIteratorWrapper.foreach(JavaConversions.scala:549)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:79)
at scala.collection.JavaConversions$JListWrapper.foreach(JavaConversions.scala:596)
at org.broadinstitute.sting.queue.QCommandLine.execute(QCommandLine.scala:43)
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:239)
at org.broadinstitute.sting.queue.QCommandLine$.main(QCommandLine.scala:117)
at org.broadinstitute.sting.queue.QCommandLine.main(QCommandLine.scala)
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A GATK RUNTIME ERROR has occurred (version 1.0.5039M):
##### ERROR
##### ERROR Please visit to wiki to see if this is a known problem
##### ERROR If not, please post the error, with stack trace, to the GATK forum
##### ERROR Visit our wiki for extensive documentation http://www.broadinstitute.org/gsa/wiki
##### ERROR Visit our forum to view answers to commonly asked questions http://getsatisfaction.com/gsa
##### ERROR
##### ERROR MESSAGE: Unrecognized sequence: 1:0-0
##### ERROR ------------------------------------------------------------------------------------------

_______________________________________________
svtoolkit-help mailing list
svt...@li...
https://lists.sourceforge.net/lists/listinfo/svtoolkit-help

Re: [svtoolkit-help] FW: problem encountered, can you please help?

From: Bob H. <han...@br...> - 2011-10-24 22:37:25

Hi, Jin,
Forgive me if this is too obvious, but is SVToolkit.jar in the current 
directory?
The -cp argument must list the paths to SVToolkit.jar and 
GenomeAnalysisTK.jar.
If this isn't the problem, can you check the version with "java -jar 
SVToolkit.jar"?
-Bob
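
A NoClassDefFoundError like the one quoted below generally means the JVM could not find the class on the classpath it was given, so a quick first check is whether every -cp entry actually exists relative to the current directory. A minimal sketch, assuming a colon-separated classpath; the function name missing_classpath_entries is made up for this illustration and is not part of SVToolkit:

```shell
# Hypothetical helper (not part of SVToolkit): print every entry of a
# colon-separated classpath that does not exist on disk.
missing_classpath_entries() {
    printf '%s\n' "$1" | tr ':' '\n' | while IFS= read -r entry; do
        # Skip empty entries produced by a leading, trailing, or doubled colon.
        [ -n "$entry" ] || continue
        # Report the entry if it is neither an existing file nor a directory.
        [ -e "$entry" ] || printf '%s\n' "$entry"
    done
}
```

For example, `missing_classpath_entries "SVToolkit.jar:GenomeAnalysisTK.jar"` prints nothing when both jars are in the current directory. If the jars are present and the error persists, `jar tf SVToolkit.jar` (a standard JDK command) lists the classes inside the jar, which shows whether ComputeGenomeMask is really there.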

On 10/24/11 11:55 AM, Szatkiewicz, Jin P wrote:
> To whom it may concern,
>
> I am a new user of GenomeStrip. I am creating the genome mask file as the first step.
>
> However, using the command below, I got the following error message. Does anyone have an idea how to fix it? Many thanks!!
>
> Best,
>
> Jin
>
> COMMAND:
> +++++++++++++++++++++
> export LD_LIBRARY_PATH=${SV_DIR}/bwa:${LD_LIBRARY_PATH}
>
> java -Xmx2g -cp SVToolkit.jar:GenomeAnalysisTK.jar \
>      org.broadinstitute.sv.apps.ComputeGenomeMask \
>       -R hg19pfs.fasta \
>       -O hg19pfs.mask100.chr1.fasta \
>       -readLength 100 \
>       -sequence chr1
> +++++++++++++++++++++++
>
> ERROR:
> ++++++++++++++++++++++++++++
> Exception in thread "main" java.lang.NoClassDefFoundError: org/broadinstitute/sv/apps/ComputeGenomeMask
> Caused by: java.lang.ClassNotFoundException: org.broadinstitute.sv.apps.ComputeGenomeMask
>          at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>          at java.security.AccessController.doPrivileged(Native Method)
>          at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>          at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>          at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>          at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
> Could not find the main class: org.broadinstitute.sv.apps.ComputeGenomeMask.  Program will exit.

[svtoolkit-help] FW: problem encountered, can you please help?

From: Szatkiewicz, J. P <jin...@me...> - 2011-10-24 16:30:14

To whom it may concern, 

I am a new user of GenomeStrip. I am creating the genome mask file as the first step.

However, using the command below, I got the following error message. Does anyone have an idea how to fix it? Many thanks!!

Best,

Jin

COMMAND:
+++++++++++++++++++++
export LD_LIBRARY_PATH=${SV_DIR}/bwa:${LD_LIBRARY_PATH}

java -Xmx2g -cp SVToolkit.jar:GenomeAnalysisTK.jar \
    org.broadinstitute.sv.apps.ComputeGenomeMask \
     -R hg19pfs.fasta \
     -O hg19pfs.mask100.chr1.fasta \
     -readLength 100 \
     -sequence chr1
+++++++++++++++++++++++

ERROR:
++++++++++++++++++++++++++++
Exception in thread "main" java.lang.NoClassDefFoundError: org/broadinstitute/sv/apps/ComputeGenomeMask
Caused by: java.lang.ClassNotFoundException: org.broadinstitute.sv.apps.ComputeGenomeMask
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Could not find the main class: org.broadinstitute.sv.apps.ComputeGenomeMask.  Program will exit.

Re: [svtoolkit-help] Genome strip - Unrecognized sequence: 1:0-0

From: Ashish K. <as...@we...> - 2011-10-21 15:53:12

Hi Bob,

On the same issue, if we want to use the -L option, would it be safe to presume that we can chunk up the chromosomes?
So, would something like "-L 20:1250000-2500000" be a valid option, assuming that this sequence exists in my reference genome?

Thanks,
Ashish


Re: [svtoolkit-help] Genome strip - Unrecognized sequence: 1:0-0

From: Bob H. <han...@br...> - 2011-10-21 15:48:51

Yes, you can use -L to process just a specific interval (or can pass a 
file with a list of intervals - the file extension must be .list).
Note that the SVDiscovery queue script also does parallelization 
internally, based on the -windowSize parameter.
For example, here are typical parameters to process the genome (or just 
the intervals selected with -L) in 3Mb windows for events between 100bp 
and 100Kb:

     -windowSize 3000000
     -windowPadding 100000
     -minimumSize 100
     -maximumSize 100000

You can invoke the queue script without '-run' to preview the chunking.

-Bob
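
To make the effect of -windowSize and -windowPadding concrete, here is a rough sketch of how one sequence might be split into padded windows. This is only an illustration of one plausible reading of the parameters, not the actual partitioning code in SVDiscovery, which is not shown in this thread. The function name print_windows, the 1-based inclusive coordinates, and the clipping at the sequence ends are all assumptions:

```shell
# Hypothetical helper (not part of SVToolkit): print padded windows for one
# sequence as name:start-end, assuming 1-based inclusive coordinates.
# Arguments: sequence name, sequence length, window size, window padding.
print_windows() {
    awk -v name="$1" -v len="$2" -v size="$3" -v pad="$4" 'BEGIN {
        for (start = 1; start <= len; start += size) {
            # Unpadded window, clipped to the end of the sequence.
            last = start + size - 1
            if (last > len) last = len
            # Extend by the padding on both sides, clipped to the sequence.
            pstart = start - pad
            if (pstart < 1) pstart = 1
            pend = last + pad
            if (pend > len) pend = len
            printf "%s:%d-%d\n", name, pstart, pend
        }
    }'
}
```

For instance, `print_windows 20 7000000 3000000 100000` lists three overlapping windows covering a hypothetical 7 Mb sequence named 20. To see the chunking the tool really uses, invoking the queue script without '-run' remains the authoritative preview.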

Re: [svtoolkit-help] Genome strip - Unrecognized sequence: 1:0-0

From: Bob H. <han...@br...> - 2011-10-06 13:40:04

This is because the example script uses "-L 1" (only process chromosome 
1) to make it faster,
but you likely don't have a sequence named "1" in your reference genome.
To process the whole genome, simply remove the -L argument.
-Bob
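
If you do want to keep -L, it helps to check which sequence names the reference actually declares before choosing a value. A minimal sketch, assuming a plain FASTA file in which the sequence name is the first word after '>'; the function names are invented for this illustration and are not part of SVToolkit or GATK:

```shell
# Hypothetical helpers (not part of SVToolkit or GATK).

# Print the sequence names declared in a FASTA file: the first
# whitespace-delimited word of each '>' header line.
fasta_sequence_names() {
    grep '^>' "$1" | sed -e 's/^>//' -e 's/[[:space:]].*$//'
}

# Succeed only if the FASTA file declares a sequence with exactly this name.
fasta_has_sequence() {
    fasta_sequence_names "$1" | grep -Fxq -- "$2"
}
```

A call such as `fasta_has_sequence reference.fasta 1 || echo "no sequence named 1"` would then flag a reference whose chromosomes are named, say, chr1 rather than 1, which is the situation that leads to "Unrecognized sequence: 1:0-0".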

Re: [svtoolkit-help] Unrecognized sequence: 1:0-0 error with ELAND mapped bams

From: Verena T. <ver...@em...> - 2011-09-30 07:24:08

Hi Bob,

thanks a lot for your quick reply.

Here is the command right before the error stack:

INFO  23:32:06,755 QGraph - Deleting intermediate files.
INFO  23:32:06,764 QCommandLine - Done
INFO  23:32:09,047 QScriptManager - Compiling 2 QScripts
INFO  23:32:13,741 QScriptManager - Compilation complete
INFO  23:32:15,767 HelpFormatter - ---------------------------------------------------------
INFO  23:32:15,767 HelpFormatter - Program Name: org.broadinstitute.sting.queue.QCommandLine
INFO  23:32:15,768 HelpFormatter - Program Args: -S /home/tischler/software/svtoolkit/qscript/SVDiscovery.q -S /home/tischler/software/svtoolkit/qscript/SVQScript.q -gatk /home/tischler/software/svtoolkit/lib/gatk/GenomeAnalysisTK.jar -cp /home/tischler/software/svtoolkit/lib/SVToolkit.jar:/home/tischler/software/svtoolkit/lib/gatk/GenomeAnalysisTK.jar:/home/tischler/software/svtoolkit/lib/gatk/Queue.jar -configFile conf/genstrip_parameters.txt -tempDir  /home/tischler/tmp/ -R home/tischler/genomes/GenomeSTRiP_ref/test.fa -genomeMaskFile /home/tischler/genomes/GenomeSTRiP_ref/test.mask.fasta -genderMapFile gender.map -runDirectory test -md test/metadata -jobLogDir test/logs -L 1 -minimumSize 100 -maximumSize 1000000 -I /home/tischler/test1/test1.bam -I /home/tischler/test2/test2.bam -I /home/tischler/test3/test3.bam -I /home/tischler/test4/test4.bam -I /home/tischler/test5/test5.bam -O /home/tischler/GenomeSTRiP/deletions.discovery.vcf -run
INFO  23:32:15,768 HelpFormatter - Date/Time: 2011/09/29 23:32:15
INFO  23:32:15,768 HelpFormatter - ---------------------------------------------------------
INFO  23:32:15,768 HelpFormatter - ---------------------------------------------------------
INFO  23:32:15,769 QCommandLine - Scripting SVDiscovery


I only modified the command from the bash script (provided in the
GenomeSTRiP package) to match my paths:

export SV_DIR=`pwd`
SV_TMPDIR=/home/tischler/tmp/
export LD_LIBRARY_PATH=${SV_DIR}/bwa:${LD_LIBRARY_PATH}
mx="-Xmx8g"
classpath="${SV_DIR}/lib/SVToolkit.jar:${SV_DIR}/lib/gatk/GenomeAnalysisTK.jar:${SV_DIR}/lib/gatk/Queue.jar"
...
...
# Run discovery
java -cp ${classpath} ${mx} \
    org.broadinstitute.sting.queue.QCommandLine \
    -S ${SV_DIR}/qscript/SVDiscovery.q \
    -S ${SV_DIR}/qscript/SVQScript.q \
    -gatk ${SV_DIR}/lib/gatk/GenomeAnalysisTK.jar \
    -cp ${classpath} \
    -configFile conf/genstrip_parameters.txt \
    -tempDir ${SV_TMPDIR} \
    -R /home/tischler/genomes/GenomeSTRiP_ref/test.fa \
    -genomeMaskFile /home/tischler/genomes/GenomeSTRiP_ref/test.mask.fasta \
    -genderMapFile gender.map \
    -runDirectory ${runDir} \
    -md ${runDir}/metadata \
    -jobLogDir ${runDir}/logs \
    -L 1 \
    -minimumSize 100 \
    -maximumSize 1000000 \
    -I ${bam1} \
    -I ${bam2} \
    -I ${bam3} \
    -I ${bam4} \
    -I ${bam5} \
    -O ${sites} \
    -run \
    || exit 1


I hope this helps!


Cheers
Verena



2011/9/27 Verena Tischler <ver...@em...>

> Dear all,
>
> I am using GenomeSTRiP on eland mapped bam files and get the following
> error message:
>
> INFO 04:20:04,705 HelpFormatter - Date/Time: 2011/09/27 04:20:04
> INFO 04:20:04,705 HelpFormatter -
> ---------------------------------------------------------
> INFO 04:20:04,705 HelpFormatter -
> ---------------------------------------------------------
> INFO 04:20:04,706 QCommandLine - Scripting SVDiscovery
> ##### ERROR
> ------------------------------------------------------------------------------------------
>
> ##### ERROR stack trace
> java.lang.IllegalArgumentException: Unrecognized sequence: 1:0-0
> at
> org.broadinstitute.sv.queue.ComputeDiscoveryPartitions.computePartitions(ComputeDiscoveryPartitions.java:96)
>
> at
> org.broadinstitute.sv.qscript.SVQScript.computeDiscoveryPartitions(SVQScript.q:132)
>
> at SVDiscovery.script(SVDiscovery.q:19)
> at
> org.broadinstitute.sting.queue.QCommandLine$$anonfun$execute$1.apply(QCommandLine.scala:46)
>
> at
> org.broadinstitute.sting.queue.QCommandLine$$anonfun$execute$1.apply(QCommandLine.scala:43)
>
> at scala.collection.Iterator$class.foreach(Iterator.scala:631)
> at
> scala.collection.JavaConversions$JIteratorWrapper.foreach(JavaConversions.scala:549)
>
> at scala.collection.IterableLike$class.foreach(IterableLike.scala:79)
> at
> scala.collection.JavaConversions$JListWrapper.foreach(JavaConversions.scala:596)
>
> at
> org.broadinstitute.sting.queue.QCommandLine.execute(QCommandLine.scala:43)
> at
> org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:239)
> at org.broadinstitute.sting.queue.QCommandLine$.main(QCommandLine.scala:117)
>
> at org.broadinstitute.sting.queue.QCommandLine.main(QCommandLine.scala)
> ##### ERROR ----------------------
>
> When I search the input BAM files for this sequence, I only get a hit in the
> read quality string. Here is what a typical line in my BAMs looks like:
> HWI-ST169_185:3:41:17915:79533 89 chr15.fa 98634059 91 103M * 0 0
> CTACAAAGATAAAAAATTAGCTGAGTCTGTTGTCACAGGCCTGTCGTCCCAGCTACTGGAGAGGCTGAGGCATGAGAATCGCTTGAACCCGGGGGCAGACGTT
> BAD*96)4(17-@<=8+9+8D;>1:0-0@1+)'7(<><38=1).089FFD=FFF@FF=E=EEDD=?DD:EEE888EEF.DFD
> XD:Z:AA1A2GA1A13G1CA2G2A1G2T6AA11T1G32A6G3 SM:i:91 AS:i:0
> RG:Z:test_eland_101PE_1
>
>
> I hope this explains my problem
> Many thanks in advance for your help
> Verena
>
>


-- 
Predoctoral fellow
Korbel Group - Genome Biology Unit
EMBL (European Molecular Biology Laboratories)
Meyerhofstrasse 1
69117 Heidelberg
Germany

+49 (0) 6221 387-8479
ver...@em...


13th International EMBL PhD Symposium Heidelberg, 17th-19th November 2011
Find out more information by visiting:
http://phdsymposium.embl.org/

Re: [svtoolkit-help] Unrecognized sequence: 1:0-0 error with ELAND mapped bams

From: Bob H. <han...@br...> - 2011-09-29 15:20:40

This is likely related to how you are invoking the Q script.  Can you 
send the command line you are using?
Thanks,
-Bob
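
One thing that may be worth checking while gathering the command line, offered
as an assumption rather than a confirmed diagnosis: whether the contig name
passed to -L (here `1`) matches the sequence names used by the reference and
the BAM files (the sample read in the report uses `chr15.fa`). The sketch below
assumes a standard samtools-style .fai index, where the first tab-separated
column is the sequence name. The helper names and the example index text are
hypothetical and are not part of SVToolkit.

```python
def contig_of(interval):
    """Return the contig part of an interval such as '1' or '1:100-200'."""
    return interval.split(":", 1)[0]


def names_from_fai(fai_text):
    """Collect sequence names (first tab-separated column) from .fai text."""
    return [line.split("\t")[0] for line in fai_text.splitlines() if line.strip()]


def check_interval(interval, names):
    """Report whether the interval's contig is among the known sequence names."""
    contig = contig_of(interval)
    if contig in names:
        return "ok: " + contig
    return "missing: " + contig + " (known: " + ", ".join(names) + ")"


# Hypothetical index text: the names follow the chr15.fa style seen in the
# sample read, but the lengths and offsets are made-up placeholders.
example_fai = "chr14.fa\t1000\t10\t60\t61\nchr15.fa\t2000\t1040\t60\t61\n"
names = names_from_fai(example_fai)
print(check_interval("1", names))
print(check_interval("chr15.fa:1-100", names))
```

With real data, the names would come from your own .fai file or from the @SQ
lines of the BAM header instead of the example string.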

On 9/27/11 8:41 AM, Verena Tischler wrote:
> Dear all,
>
> I am using GenomeSTRiP on eland mapped bam files and get the following 
> error message:
>
> INFO 04:20:04,705 HelpFormatter - Date/Time: 2011/09/27 04:20:04
> INFO 04:20:04,705 HelpFormatter - 
> ---------------------------------------------------------
> INFO 04:20:04,705 HelpFormatter - 
> ---------------------------------------------------------
> INFO 04:20:04,706 QCommandLine - Scripting SVDiscovery
> ##### ERROR 
> ------------------------------------------------------------------------------------------ 
>
> ##### ERROR stack trace
> java.lang.IllegalArgumentException: Unrecognized sequence: 1:0-0
> at 
> org.broadinstitute.sv.queue.ComputeDiscoveryPartitions.computePartitions(ComputeDiscoveryPartitions.java:96)
> at 
> org.broadinstitute.sv.qscript.SVQScript.computeDiscoveryPartitions(SVQScript.q:132)
> at SVDiscovery.script(SVDiscovery.q:19)
> at 
> org.broadinstitute.sting.queue.QCommandLine$$anonfun$execute$1.apply(QCommandLine.scala:46)
> at 
> org.broadinstitute.sting.queue.QCommandLine$$anonfun$execute$1.apply(QCommandLine.scala:43)
> at scala.collection.Iterator$class.foreach(Iterator.scala:631)
> at 
> scala.collection.JavaConversions$JIteratorWrapper.foreach(JavaConversions.scala:549)
> at scala.collection.IterableLike$class.foreach(IterableLike.scala:79)
> at 
> scala.collection.JavaConversions$JListWrapper.foreach(JavaConversions.scala:596)
> at 
> org.broadinstitute.sting.queue.QCommandLine.execute(QCommandLine.scala:43)
> at 
> org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:239) 
> at 
> org.broadinstitute.sting.queue.QCommandLine$.main(QCommandLine.scala:117)
> at org.broadinstitute.sting.queue.QCommandLine.main(QCommandLine.scala)
> ##### ERROR ----------------------
>
> When I search the input BAM files for this sequence, I only get a hit in the
> read quality string. Here is what a typical line in my BAMs looks like:
> HWI-ST169_185:3:41:17915:79533 89 chr15.fa 98634059 91 103M * 0 0 
> CTACAAAGATAAAAAATTAGCTGAGTCTGTTGTCACAGGCCTGTCGTCCCAGCTACTGGAGAGGCTGAGGCATGAGAATCGCTTGAACCCGGGGGCAGACGTT 
> BAD*96)4(17-@<=8+9+8D;>1:0-0@1+)'7(<><38=1).089FFD=FFF@FF=E=EEDD=?DD:EEE888EEF.DFD 
> XD:Z:AA1A2GA1A13G1CA2G2A1G2T6AA11T1G32A6G3 SM:i:91 AS:i:0 
> RG:Z:test_eland_101PE_1
>
>
> I hope this explains my problem
> Many thanks in advance for your help
> Verena
>
>
>
> _______________________________________________
> svtoolkit-help mailing list
> svt...@li...
> https://lists.sourceforge.net/lists/listinfo/svtoolkit-help

[svtoolkit-help] Unrecognized sequence: 1:0-0 error with ELAND mapped bams

From: Verena T. <ver...@em...> - 2011-09-27 13:17:04

Dear all,

I am using GenomeSTRiP on eland mapped bam files and get the following error
message:

INFO 04:20:04,705 HelpFormatter - Date/Time: 2011/09/27 04:20:04
INFO 04:20:04,705 HelpFormatter -
---------------------------------------------------------
INFO 04:20:04,705 HelpFormatter -
---------------------------------------------------------
INFO 04:20:04,706 QCommandLine - Scripting SVDiscovery
##### ERROR
------------------------------------------------------------------------------------------

##### ERROR stack trace
java.lang.IllegalArgumentException: Unrecognized sequence: 1:0-0
at
org.broadinstitute.sv.queue.ComputeDiscoveryPartitions.computePartitions(ComputeDiscoveryPartitions.java:96)

at
org.broadinstitute.sv.qscript.SVQScript.computeDiscoveryPartitions(SVQScript.q:132)

at SVDiscovery.script(SVDiscovery.q:19)
at
org.broadinstitute.sting.queue.QCommandLine$$anonfun$execute$1.apply(QCommandLine.scala:46)

at
org.broadinstitute.sting.queue.QCommandLine$$anonfun$execute$1.apply(QCommandLine.scala:43)

at scala.collection.Iterator$class.foreach(Iterator.scala:631)
at
scala.collection.JavaConversions$JIteratorWrapper.foreach(JavaConversions.scala:549)

at scala.collection.IterableLike$class.foreach(IterableLike.scala:79)
at
scala.collection.JavaConversions$JListWrapper.foreach(JavaConversions.scala:596)

at
org.broadinstitute.sting.queue.QCommandLine.execute(QCommandLine.scala:43)
at
org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:239)
at org.broadinstitute.sting.queue.QCommandLine$.main(QCommandLine.scala:117)

at org.broadinstitute.sting.queue.QCommandLine.main(QCommandLine.scala)
##### ERROR ----------------------

When I search the input BAM files for this sequence, I only get a hit in the
read quality string. Here is what a typical line in my BAMs looks like:
HWI-ST169_185:3:41:17915:79533 89 chr15.fa 98634059 91 103M * 0 0
CTACAAAGATAAAAAATTAGCTGAGTCTGTTGTCACAGGCCTGTCGTCCCAGCTACTGGAGAGGCTGAGGCATGAGAATCGCTTGAACCCGGGGGCAGACGTT
BAD*96)4(17-@<=8+9+8D;>1:0-0@1+)'7(<><38=1).089FFD=FFF@FF=E=EEDD=?DD:EEE888EEF.DFD
XD:Z:AA1A2GA1A13G1CA2G2A1G2T6AA11T1G32A6G3 SM:i:91 AS:i:0
RG:Z:test_eland_101PE_1


I hope this explains my problem
Many thanks in advance for your help
Verena

Re: [svtoolkit-help] SVDiscovery error for installation test

From: Bob H. <han...@br...> - 2011-09-13 17:12:48

Glad you solved the R problem.
For the benefit of others, can you post what was wrong with the 
environment to cause these symptoms?

Regarding the fasta indexing, I suspect you have a blank line in your 
fasta file.  Could you check that?
I'll fix the code to generate a better error message.

-Bob
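
A quick way to look for blank lines in a FASTA file is sketched below in
Python. This is only an illustration of the check suggested above; the function
names are hypothetical and the example records are made up.

```python
def find_blank_lines(lines):
    """Return 1-based line numbers of empty or whitespace-only lines."""
    return [n for n, line in enumerate(lines, start=1) if line.strip() == ""]


def find_blank_lines_in_file(path):
    """Scan a FASTA file on disk and report its blank lines."""
    with open(path) as handle:
        return find_blank_lines(handle)


# Made-up FASTA content with one blank line between the two records.
example = [">chr1\n", "ACGT\n", "\n", ">chr2\n", "GGCC\n"]
print(find_blank_lines(example))  # [3]
```

Calling find_blank_lines_in_file on the reference path would list the line
numbers to inspect before re-running the indexing step.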

On 9/13/11 12:36 PM, Ashish Kumar wrote:
>
> Hi Bob,
>
> I could resolve the problem. It was R environment related.
>
> As a next step, I am now creating a Genome mask file for my reference 
> genome. For this, when I try to index my fasta file, I keep getting 
> the following error:
>
> Exception in thread "main" java.lang.RuntimeException: String index 
> out of range: 0
>
>         at 
> org.broadinstitute.sv.commandline.CommandLineProgram.execute(CommandLineProgram.java:36)
>
>         at 
> org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:239)
>
>         at 
> org.broadinstitute.sv.commandline.CommandLineProgram.run(CommandLineProgram.java:23)
>
>         at 
> org.broadinstitute.sv.apps.IndexFastaFile.main(IndexFastaFile.java:39)
>
> Caused by: java.lang.StringIndexOutOfBoundsException: String index out 
> of range: 0
>
>         at 
> java.lang.AbstractStringBuilder.charAt(AbstractStringBuilder.java:191)
>
>         at java.lang.StringBuilder.charAt(StringBuilder.java:72)
>
>         at 
> org.broadinstitute.sv.util.fasta.FastaIndexer.indexFastaFile(FastaIndexer.java:48)
>
>         at 
> org.broadinstitute.sv.apps.IndexFastaFile.run(IndexFastaFile.java:49)
>
>         at 
> org.broadinstitute.sv.commandline.CommandLineProgram.execute(CommandLineProgram.java:33)
>
>         ... 3 more
>
> Would you be able to provide some clues please? Perhaps if you could 
> explain - what type of values/how many characters does the program 
> expect in the first column of the "<ref.fasta>.fai" file?
>
> Thanks,
>
> Ashish
>
> *From:*Bob Handsaker [mailto:han...@br...]
> *Sent:* 08 September 2011 23:45
> *To:* svt...@li...
> *Subject:* Re: [svtoolkit-help] SVDiscovery error for installation test
>
> Hi, Ashish,
> I don't know the exact problem, but it seems like something with your 
> R environment.
> You can try running the following short R script:
>
> require("coin")
> df = data.frame(COVERAGE=1:10)
> df$DELETED = as.factor(df$COVERAGE > 5)
> wilcox_test(COVERAGE ~ DELETED, df, alternative="l")
>
> which will hopefully reproduce the problem so that you can diagnose it.
> Something approximating this isn't running when R is invoked from java.
>
> -Bob
>
> On 9/8/11 3:05 PM, Ashish Kumar wrote:
>
> Hi Bob,
>
> While running the installation test, it seems that the SVDiscovery 
> step complains about an R function. The "coin" package is
> installed on my server.
>
> I am pasting the error below; please advise.
>
> Best,
>
> Ashish
>
> ##### ERROR 
> ------------------------------------------------------------------------------------------
>
> ##### ERROR stack trace
>
> java.lang.RuntimeException: Error running script 
> /ib/users/ashish/genostrip/svtoolkit/R/discovery/compute_ranksum_pvalue.R: 
> Error in function (classes, fdef, mtable)  :
>
>   unable to find an inherited method for function "pvalue", for 
> signature "htest"
>
> Calls: main ... compute.ranksum.pvalue.coin -> pvalue -> <Anonymous>
>
> Execution halted
>
>         at 
> org.broadinstitute.sv.discovery.ClusterDepthModule.computeRankSumPValue(ClusterDepthModule.java:284)
>
>         at 
> org.broadinstitute.sv.discovery.ClusterDepthModule.computeDepth(ClusterDepthModule.java:195)
>
>         at 
> org.broadinstitute.sv.discovery.DeletionDiscoveryAlgorithm.processCluster(DeletionDiscoveryAlgorithm.java:395)
>
>         at 
> org.broadinstitute.sv.discovery.DeletionDiscoveryAlgorithm.processClusters(DeletionDiscoveryAlgorithm.java:303)
>
>         at 
> org.broadinstitute.sv.discovery.DeletionDiscoveryAlgorithm.runDiscovery(DeletionDiscoveryAlgorithm.java:163)
>
>         at 
> org.broadinstitute.sv.discovery.SVDiscoveryWalker.onTraversalDone(SVDiscoveryWalker.java:150)
>
>         at 
> org.broadinstitute.sv.discovery.SVDiscoveryWalker.onTraversalDone(SVDiscoveryWalker.java:43)
>
>         at 
> org.broadinstitute.sting.gatk.executive.Accumulator$StandardAccumulator.finishTraversal(Accumulator.java:129)
>
>         at 
> org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:75)
>
>         at 
> org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:217)
>
>         at 
> org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:111)
>
>         at 
> org.broadinstitute.sv.main.SVCommandLine.execute(SVCommandLine.java:110)
>
>         at 
> org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:239)
>
>         at 
> org.broadinstitute.sv.main.SVCommandLine.main(SVCommandLine.java:72)
>
>         at 
> org.broadinstitute.sv.main.SVDiscovery.main(SVDiscovery.java:21)
>
> ##### ERROR 
> ------------------------------------------------------------------------------------------
>
> ##### ERROR A GATK RUNTIME ERROR has occurred (version 1.0.5039M):
>
>
>
>
>

Re: [svtoolkit-help] SVDiscovery error for installation test

From: Bob H. <han...@br...> - 2011-09-08 22:45:05

Hi, Ashish,
I don't know the exact problem, but it seems like something with your R 
environment.
You can try running the following short R script:

require("coin")
df = data.frame(COVERAGE=1:10)
df$DELETED = as.factor(df$COVERAGE > 5)
wilcox_test(COVERAGE ~ DELETED, df, alternative="l")

which will hopefully reproduce the problem so that you can diagnose it.
Something approximating this isn't running when R is invoked from java.

-Bob

On 9/8/11 3:05 PM, Ashish Kumar wrote:
>
> Hi Bob,
>
> While running the installation test, it seems that the SVDiscovery 
> step complains about an R function. The "coin" package is
> installed on my server.
>
> I am pasting the error below; please advise.
>
> Best,
>
> Ashish
>
> ##### ERROR 
> ------------------------------------------------------------------------------------------
>
> ##### ERROR stack trace
>
> java.lang.RuntimeException: Error running script 
> /ib/users/ashish/genostrip/svtoolkit/R/discovery/compute_ranksum_pvalue.R: 
> Error in function (classes, fdef, mtable)  :
>
>   unable to find an inherited method for function "pvalue", for 
> signature "htest"
>
> Calls: main ... compute.ranksum.pvalue.coin -> pvalue -> <Anonymous>
>
> Execution halted
>
>         at 
> org.broadinstitute.sv.discovery.ClusterDepthModule.computeRankSumPValue(ClusterDepthModule.java:284)
>
>         at 
> org.broadinstitute.sv.discovery.ClusterDepthModule.computeDepth(ClusterDepthModule.java:195)
>
>         at 
> org.broadinstitute.sv.discovery.DeletionDiscoveryAlgorithm.processCluster(DeletionDiscoveryAlgorithm.java:395)
>
>         at 
> org.broadinstitute.sv.discovery.DeletionDiscoveryAlgorithm.processClusters(DeletionDiscoveryAlgorithm.java:303)
>
>         at 
> org.broadinstitute.sv.discovery.DeletionDiscoveryAlgorithm.runDiscovery(DeletionDiscoveryAlgorithm.java:163)
>
>         at 
> org.broadinstitute.sv.discovery.SVDiscoveryWalker.onTraversalDone(SVDiscoveryWalker.java:150)
>
>         at 
> org.broadinstitute.sv.discovery.SVDiscoveryWalker.onTraversalDone(SVDiscoveryWalker.java:43)
>
>         at 
> org.broadinstitute.sting.gatk.executive.Accumulator$StandardAccumulator.finishTraversal(Accumulator.java:129)
>
>         at 
> org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:75)
>
>         at 
> org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:217)
>
>         at 
> org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:111)
>
>         at 
> org.broadinstitute.sv.main.SVCommandLine.execute(SVCommandLine.java:110)
>
>         at 
> org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:239)
>
>         at 
> org.broadinstitute.sv.main.SVCommandLine.main(SVCommandLine.java:72)
>
>         at 
> org.broadinstitute.sv.main.SVDiscovery.main(SVDiscovery.java:21)
>
> ##### ERROR 
> ------------------------------------------------------------------------------------------
>
> ##### ERROR A GATK RUNTIME ERROR has occurred (version 1.0.5039M):
>
>
>

[svtoolkit-help] SVDiscovery error for installation test

From: Ashish K. <as...@we...> - 2011-09-08 19:06:07

Hi Bob,

While running the installation test, it seems that the SVDiscovery step complains about an R function. The "coin" package is installed on my server.
I am pasting the error below; please advise.
Best,
Ashish


##### ERROR ------------------------------------------------------------------------------------------
##### ERROR stack trace
java.lang.RuntimeException: Error running script /ib/users/ashish/genostrip/svtoolkit/R/discovery/compute_ranksum_pvalue.R: Error in function (classes, fdef, mtable)  :
  unable to find an inherited method for function "pvalue", for signature "htest"
Calls: main ... compute.ranksum.pvalue.coin -> pvalue -> <Anonymous>
Execution halted

        at org.broadinstitute.sv.discovery.ClusterDepthModule.computeRankSumPValue(ClusterDepthModule.java:284)
        at org.broadinstitute.sv.discovery.ClusterDepthModule.computeDepth(ClusterDepthModule.java:195)
        at org.broadinstitute.sv.discovery.DeletionDiscoveryAlgorithm.processCluster(DeletionDiscoveryAlgorithm.java:395)
        at org.broadinstitute.sv.discovery.DeletionDiscoveryAlgorithm.processClusters(DeletionDiscoveryAlgorithm.java:303)
        at org.broadinstitute.sv.discovery.DeletionDiscoveryAlgorithm.runDiscovery(DeletionDiscoveryAlgorithm.java:163)
        at org.broadinstitute.sv.discovery.SVDiscoveryWalker.onTraversalDone(SVDiscoveryWalker.java:150)
        at org.broadinstitute.sv.discovery.SVDiscoveryWalker.onTraversalDone(SVDiscoveryWalker.java:43)
        at org.broadinstitute.sting.gatk.executive.Accumulator$StandardAccumulator.finishTraversal(Accumulator.java:129)
        at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:75)
        at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:217)
        at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:111)
        at org.broadinstitute.sv.main.SVCommandLine.execute(SVCommandLine.java:110)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:239)
        at org.broadinstitute.sv.main.SVCommandLine.main(SVCommandLine.java:72)
        at org.broadinstitute.sv.main.SVDiscovery.main(SVDiscovery.java:21)
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A GATK RUNTIME ERROR has occurred (version 1.0.5039M):

[svtoolkit-help] SVDiscovery error for installation test

From: Ashish K. <as...@we...> - 2011-09-08 19:01:23

Hi Bob,

While running the installation test, it seems that the SVDiscovery step complains about an R function. The "coin" package is installed on my server.
I am pasting the error below; please advise.
Best,
Ashish


##### ERROR ------------------------------------------------------------------------------------------
##### ERROR stack trace
java.lang.RuntimeException: Error running script /ib/users/ashish/genostrip/svtoolkit/R/discovery/compute_ranksum_pvalue.R: Error in function (classes, fdef, mtable)  :
  unable to find an inherited method for function "pvalue", for signature "htest"
Calls: main ... compute.ranksum.pvalue.coin -> pvalue -> <Anonymous>
Execution halted

        at org.broadinstitute.sv.discovery.ClusterDepthModule.computeRankSumPValue(ClusterDepthModule.java:284)
        at org.broadinstitute.sv.discovery.ClusterDepthModule.computeDepth(ClusterDepthModule.java:195)
        at org.broadinstitute.sv.discovery.DeletionDiscoveryAlgorithm.processCluster(DeletionDiscoveryAlgorithm.java:395)
        at org.broadinstitute.sv.discovery.DeletionDiscoveryAlgorithm.processClusters(DeletionDiscoveryAlgorithm.java:303)
        at org.broadinstitute.sv.discovery.DeletionDiscoveryAlgorithm.runDiscovery(DeletionDiscoveryAlgorithm.java:163)
        at org.broadinstitute.sv.discovery.SVDiscoveryWalker.onTraversalDone(SVDiscoveryWalker.java:150)
        at org.broadinstitute.sv.discovery.SVDiscoveryWalker.onTraversalDone(SVDiscoveryWalker.java:43)
        at org.broadinstitute.sting.gatk.executive.Accumulator$StandardAccumulator.finishTraversal(Accumulator.java:129)
        at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:75)
        at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:217)
        at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:111)
        at org.broadinstitute.sv.main.SVCommandLine.execute(SVCommandLine.java:110)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:239)
        at org.broadinstitute.sv.main.SVCommandLine.main(SVCommandLine.java:72)
        at org.broadinstitute.sv.main.SVDiscovery.main(SVDiscovery.java:21)
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A GATK RUNTIME ERROR has occurred (version 1.0.5039M):

Re: [svtoolkit-help] problem running genome strip

From: Bob H. <han...@br...> - 2011-09-08 14:39:08

Can you send me more of the output, including the stack trace and 
anything coming before this?
Also, what version are you using (e.g. what's the output of "java -jar 
SVToolkit.jar") ?
Thanks,
-Bob

On 9/6/11 10:10 AM, Linda Hughes wrote:
> Hi,
>
> I'm trying to run Genome STRiP SVDiscovery on 9 BAM files to look for 
> deletions in a particular region of chromosome 14. However when I run 
> the program I get the following error:
>
> INFO 11:45:57,665 SVDiscovery - Processing cluster 
> chr14:106783666-106786180 chr14:106810874-106812753 LR 35
> #DBG: RC Cache fill chr14:106776181-106876180 100000 9 0.349954 sec
> Error: Exception processing cluster: Permitted to write any record 
> upstream of position 106788832, but a record at chr14:106786183 was 
> just added.
> Cluster: chr14:106783666-106786180 chr14:106810874-106812753 LR 35
> INFO 11:45:59,342 GATKRunReport - Aggregating data for run report
> [GC 309491K->25036K(819712K), 0.0030080 secs]
> [Full GC 25036K->24617K(819712K), 0.0708550 secs]
> ##### ERROR 
> ------------------------------------------------------------------------------------------
> ##### ERROR stack trace
>
> Do you have any idea what could be causing the error?
>
> many thanks,
>
> Linda
>
>

[svtoolkit-help] problem running genome strip

From: Linda H. <li...@we...> - 2011-09-06 14:10:42

Hi,

I'm trying to run Genome STRiP SVDiscovery on 9 BAM files to look for 
deletions in a particular region of chromosome 14. However when I run 
the program I get the following error:

INFO 11:45:57,665 SVDiscovery - Processing cluster 
chr14:106783666-106786180 chr14:106810874-106812753 LR 35
#DBG: RC Cache fill chr14:106776181-106876180 100000 9 0.349954 sec
Error: Exception processing cluster: Permitted to write any record 
upstream of position 106788832, but a record at chr14:106786183 was just 
added.
Cluster: chr14:106783666-106786180 chr14:106810874-106812753 LR 35
INFO 11:45:59,342 GATKRunReport - Aggregating data for run report
[GC 309491K->25036K(819712K), 0.0030080 secs]
[Full GC 25036K->24617K(819712K), 0.0708550 secs]
##### ERROR 
------------------------------------------------------------------------------------------
##### ERROR stack trace

Do you have any idea what could be causing the error?

many thanks,

Linda

Re: [svtoolkit-help] Filter criteria for SVs

From: Bob H. <han...@br...> - 2011-09-05 21:54:10

First, I should warn you that I haven't done too much work with 
discovery on targeted sequencing.
Your results may vary depending on how even your coverage is across the 
target region.

Filtering good calls from the candidates is still somewhat of an art, 
and it depends on whether you want more sensitivity or more specificity.
The default set of filters in the Queue scripts is based on the 
1000 Genomes pilot.

The metrics most often used for filtering are the following:

DEPTHRATIO / DEPTHPVALUE
The depth ratio is the mean read depth for samples with observed 
aberrant read pairs divided by the mean read depth for the other samples.
It should ideally be 0.5 or below for real deletions.  If you plot this 
metric it should be bi-modal and you can select a threshold from the data.
The depth p-value indicates whether there was sufficient depth 
information for depth ratio to be reliable.
Default 1kg pilot filters: DEPTHPVALUE < 0.01 and DEPTHRATIO < 0.63 (or 
DEPTHRATIO < 0.8 if MEMBPVALUE < 0.01)
The cutoff of 0.63 was chosen as the approximate midpoint of the bimodal 
distribution for depth ratio in the 1kg pilot data set.
See below for MEMBPVALUE.

DEPTHCALLTHRESHOLD < 1
The not-very-well-named depth call threshold is the median normalized 
sequencing depth of samples with observed aberrant read pairs.
A normalized depth of 1 in this case should be approximately copy number 
2.  Ideally this number would be 0.5 or below and this filter
excludes regions of the genome with excessive coverage.

COHPVALUE > 0.01
The coherence metric (not a true p-value, despite the name) indicates 
whether the spacing of the aberrant read pairs is consistent with a
single deletion breakpoint.  Read pairs generated by mismapping, for 
example, tend to be more uniformly spaced.

MEMBPVALUE
This metric tests whether the deletion seems to be appearing more in 
some samples than others, taking into account uneven sequencing.
Lower values are better, but unless you have a lot of samples it can be 
hard to find a good absolute cutoff for this metric.
For the 1kg pilot, what we did was to use this to "boost" some samples 
with a marginal depth ratio between 0.63 and 0.8.
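
As a worked illustration of the depth ratio definition given above (mean read
depth of samples with observed aberrant read pairs divided by the mean read
depth of the other samples), here is a small Python sketch. The sample names
and depth values are invented, and the function is hypothetical; it is not how
Genome STRiP computes the metric internally.

```python
from statistics import mean


def depth_ratio(depth_by_sample, carriers):
    """Mean depth of carrier samples divided by mean depth of the others.

    depth_by_sample maps sample name to normalized read depth over the site.
    carriers is the set of samples with observed aberrant read pairs.
    Raises statistics.StatisticsError if either group is empty.
    """
    carrier_depths = [d for s, d in depth_by_sample.items() if s in carriers]
    other_depths = [d for s, d in depth_by_sample.items() if s not in carriers]
    return mean(carrier_depths) / mean(other_depths)


# Invented values: two carriers at about half depth, two non-carriers at 1.0.
depths = {"s1": 0.5, "s2": 0.5, "s3": 1.0, "s4": 1.0}
print(depth_ratio(depths, {"s1", "s2"}))  # 0.5
```

A value near 0.5, as in this toy case, is what the description above expects
for a real heterozygous deletion.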

If you are trying to identify high confidence calls, in general longer 
calls tend to have better depth signal and thus be of higher confidence
(all other metrics being equal).  In the 1kg phase 1 data set, we also 
used a filter where we required at least one sample to have two aberrant
read pairs.  This may be important if your sequencing is low coverage 
(e.g. 4x) but for higher sequencing depth I think this is not necessary.
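
The "at least one sample with two aberrant read pairs" rule mentioned above
can be expressed as a one-line check. This is a sketch only: the per-sample
counts are invented, and how you obtain them from your own output is left
open.

```python
def has_sample_with_min_pairs(pairs_by_sample, min_pairs=2):
    """True if any single sample has at least min_pairs aberrant read pairs."""
    return any(count >= min_pairs for count in pairs_by_sample.values())


# Invented counts of aberrant read pairs per sample for two candidate sites.
print(has_sample_with_min_pairs({"s1": 1, "s2": 2}))  # True
print(has_sample_with_min_pairs({"s1": 1, "s2": 1}))  # False
```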

I would start with the 1kg pilot filters as a guide, as they proved to 
be reasonable on the 1kg phase 1 data as well
(see SVDiscoveryDefaultFilter in SVQScript.q).
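
The thresholds listed above can be combined into a single pass/fail check, as
in the Python sketch below. Two things here are assumptions: that all of the
listed conditions must hold together, and that the metrics have already been
read into a plain dictionary of numbers. The authoritative definition remains
SVDiscoveryDefaultFilter in SVQScript.q; the function name and the example
values are hypothetical.

```python
def passes_pilot_filters(info):
    """Apply the 1kg pilot thresholds described above to one candidate site.

    info maps metric names to numeric values for that site.
    """
    ratio_ok = info["DEPTHRATIO"] < 0.63 or (
        info["DEPTHRATIO"] < 0.8 and info["MEMBPVALUE"] < 0.01
    )
    depth_ok = info["DEPTHPVALUE"] < 0.01 and ratio_ok
    # Assumption: the depth, coverage and coherence conditions are ANDed.
    return (
        depth_ok
        and info["DEPTHCALLTHRESHOLD"] < 1
        and info["COHPVALUE"] > 0.01
    )


# Invented example sites.
clear_call = {"DEPTHRATIO": 0.45, "DEPTHPVALUE": 0.001, "MEMBPVALUE": 0.5,
              "DEPTHCALLTHRESHOLD": 0.5, "COHPVALUE": 0.3}
boosted = dict(clear_call, DEPTHRATIO=0.7, MEMBPVALUE=0.001)
marginal = dict(clear_call, DEPTHRATIO=0.7)
print(passes_pilot_filters(clear_call))  # True
print(passes_pilot_filters(boosted))     # True
print(passes_pilot_filters(marginal))    # False
```

Loosening or tightening the individual cutoffs is then a matter of changing
the constants, which fits the sensitivity versus specificity trade-off
discussed above.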

In theory, it should be possible to calibrate your filters based on 
existing gold standard data sets, if you have any.
Another useful thing to do is to prospectively genotype some of the 
sites (using Genome STRiP).
Sometimes the genotyping results and metrics can be used to help 
determine whether marginal calls are good or not and this can help
influence your discovery thresholds, although I would not recommend 
trying to do large scale filtering via genotyping.

-Bob

On 9/5/11 4:56 PM, Hyun Ji Noh wrote:
> Hi,
>
> I now have VCF files generated by a modified discovery.sh script. There is a huge amount of information in the VCF files, and there are even deletion calls that are not in my target region. What are your recommended criteria for filtering high-quality deletion calls?
>
> BW,
> Hyun Ji
>
> P.S. Thank you for your help with the mismatched read pair records error. Your advice fixed the problem!
>
>

[svtoolkit-help] Filter criteria for SVs

From: Hyun Ji N. <no...@br...> - 2011-09-05 20:56:52

Hi,

I now have VCF files generated by a modified discovery.sh script. There is a huge amount of information in the VCF files, and there are even deletion calls that are not in my target region. What are your recommended criteria for filtering high-quality deletion calls?

BW,
Hyun Ji

P.S. Thank you for your help with the mismatched read pair records error. Your advice fixed the problem!

Re: [svtoolkit-help] Error message in SVDiscovery step

From: Bob H. <han...@br...> - 2011-09-01 17:53:31

Hi, Kim,
I think I found and fixed the bug.
Can you try build 785 and let me know if that solves the problem?
-Bob

On 9/1/11 11:53 AM, Kim Wong wrote:
> SVToolkit version 1.04 (build 683). (This isn't one of your main
> releases but this version allows me to format the lsf memory
> requirements a little differently for our system)
>
>
> Kim

Re: [svtoolkit-help] Error message in SVDiscovery step

From: Kim W. <kw...@sa...> - 2011-09-01 15:53:53

SVToolkit version 1.04 (build 683). (This isn't one of your main 
releases but this version allows me to format the lsf memory 
requirements a little differently for our system)


Kim



On 01/09/11 16:47, Bob Handsaker wrote:
> What version?
> -Bob


-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE.

Re: [svtoolkit-help] Error message in SVDiscovery step

From: Bob H. <han...@br...> - 2011-09-01 15:47:56

What version?
-Bob


[svtoolkit-help] Error message in SVDiscovery step

From: Kim W. <kw...@sa...> - 2011-09-01 15:44:22

Hi Bob

I'm running the Discovery phase using 10Mb windows and getting this error:

INFO  16:04:17,165 SVDiscovery - Processing cluster 1:207788701-207788845 1:207889639-207889798 LR 6
[GC 1278669K->267636K(1586432K), 1.2806820 secs]
[GC 1375348K->267488K(1594112K), 1.2806950 secs]
[Full GC 267488K->160084K(1605440K), 0.3875220 secs]
#DBG: RC Cache fill    1:207778846-207888845    110000    100    9.229631 sec
Error: Exception processing cluster: null
Cluster: 1:207788701-207788845 1:207889639-207889798 LR 6
[GC 591509K->161512K(1456512K), 1.2747640 secs]
[Full GC 161512K->39951K(1456512K), 0.1467510 secs]
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR stack trace
java.lang.IndexOutOfBoundsException
     at java.nio.Buffer.checkIndex(Buffer.java:514)
     at java.nio.DirectByteBuffer.get(DirectByteBuffer.java:209)
     at org.broadinstitute.sv.discovery.ReadCountCache2.getReadCounts(ReadCountCache2.java:154)
     at org.broadinstitute.sv.discovery.ClusterDepthModule.getReadCounts(ClusterDepthModule.java:215)
     at org.broadinstitute.sv.discovery.ClusterDepthModule.computeDepth(ClusterDepthModule.java:137)
     at org.broadinstitute.sv.discovery.DeletionDiscoveryAlgorithm.processCluster(DeletionDiscoveryAlgorithm.java:415)
     at org.broadinstitute.sv.discovery.DeletionDiscoveryAlgorithm.processClusters(DeletionDiscoveryAlgorithm.java:323)
     at org.broadinstitute.sv.discovery.DeletionDiscoveryAlgorithm.runDiscovery(DeletionDiscoveryAlgorithm.java:173)
     at org.broadinstitute.sv.discovery.SVDiscoveryWalker.onTraversalDone(SVDiscoveryWalker.java:165)
     at org.broadinstitute.sv.discovery.SVDiscoveryWalker.onTraversalDone(SVDiscoveryWalker.java:44)
     at org.broadinstitute.sting.gatk.executive.Accumulator$StandardAccumulator.finishTraversal(Accumulator.java:129)
     at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:85)
     at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:236)
     at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:116)
     at org.broadinstitute.sv.main.SVCommandLine.execute(SVCommandLine.java:110)
     at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:221)
     at org.broadinstitute.sv.main.SVCommandLine.main(SVCommandLine.java:72)
     at org.broadinstitute.sv.main.SVDiscovery.main(SVDiscovery.java:21)


The parameters were set to:
     -minimumSize 100
     -maximumSize 1000000
     -windowSize 10000000
     -windowPadding 10000

which gives for this window:

-partitionName P0021 -filePrefix P0021 -L 1:199990000-211010000 
-searchLocus 1:200000000-209999999 -searchWindow 1:199990000-211010000 
-searchMinimumSize 100 -searchMaximumSize 1000000
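
For reference, the window coordinates above are consistent with simple arithmetic on the four parameters. The Python sketch below is an assumption inferred only from the numbers in this message, not taken from the Genome STRiP source: the function name, the zero-based partition index and the formulas are all hypothetical, and they have been checked against this one partition only.

```python
def partition_coordinates(index, window_size, window_padding, maximum_size):
    """Return (search_locus, search_window), each as a (start, end) tuple.

    ASSUMPTION: this arithmetic is reverse-engineered from the values
    quoted in the message; the real partitioning code may differ.
    """
    locus_start = index * window_size
    locus_end = (index + 1) * window_size - 1
    # The window extends the locus by the padding on the left and by the
    # maximum event size plus the padding on the right.
    window_start = max(1, locus_start - window_padding)
    window_end = (index + 1) * window_size + maximum_size + window_padding
    return (locus_start, locus_end), (window_start, window_end)


locus, window = partition_coordinates(
    index=20,  # hypothetical zero-based index for partition P0021
    window_size=10_000_000,
    window_padding=10_000,
    maximum_size=1_000_000,
)
print("searchLocus  1:%d-%d" % locus)    # 1:200000000-209999999
print("searchWindow 1:%d-%d" % window)   # 1:199990000-211010000
```

Both printed intervals match the -searchLocus and -searchWindow values shown above. The failing cluster (1:207788701-207788845 to 1:207889639-207889798) lies inside the search locus.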

Any idea what is wrong?

Thanks

Kim







Re: [svtoolkit-help] GenomeSTRiP error (Mismatched read pair records)

From: Bob H. <han...@br...> - 2011-08-31 16:17:03

Hi, Hyun,

Genome STRiP is seeing 4 reads with the same ID. On cursory inspection they
look like they might be identical (i.e. the same read pair is seen twice).
Did you perhaps include the same input bam file twice on the command line?

If that's not it, I would try running Picard's ValidateSamFile on the
problem bam file and see what it says.

-Bob
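
A quick way to check for the situation described above is to count how often each (read name, flag) combination occurs in the SAM text of the merged input. The Python sketch below is only an illustration under stated assumptions: the function name is hypothetical, it expects tab-separated SAM lines (for example the output of `samtools view`), and the example records are shortened stand-ins for the four records in the error message, with sequence and quality replaced by `*`.

```python
from collections import Counter


def find_repeated_records(sam_lines):
    """Return {(read_name, flag): count} for records seen more than once."""
    counts = Counter()
    for line in sam_lines:
        # Skip blank lines and SAM header lines (which start with "@").
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        read_name, flag = fields[0], int(fields[1])
        counts[(read_name, flag)] += 1
    return {key: n for key, n in counts.items() if n > 1}


NAME = "C0196ACXX110720:7:1101:10217:84864"
# Stand-ins for the records in the error message (SEQ and QUAL shortened to "*").
first = "\t".join([NAME, "83", "chr9", "15258469", "37", "15S86M",
                   "chr9", "15258470", "-84", "*", "*"])
second = "\t".join([NAME, "163", "chr9", "15258470", "37", "86M15S",
                    "chr9", "15258469", "84", "*", "*"])
example = [first, first, second, second]

for (name, flag), n in sorted(find_repeated_records(example).items()):
    print(name, flag, n)
```

On the example this reports both the flag 83 record and the flag 163 record twice, which is the pattern visible in the error message. Real input with secondary or supplementary alignments may legitimately repeat a read name, so a hit is a prompt to look closer rather than proof of a duplicated file.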


[svtoolkit-help] GenomeSTRiP error (Mismatched read pair records)

From: Hyun Ji N. <no...@br...> - 2011-08-31 15:45:19

Hi, 

I'm trying to use GenomeSTRiP to call CNVs from targeted sequencing data in dogs. I created a mask FASTA file using ComputeGenomeMask, and I have been trying to call CNVs using the discovery.sh script provided in the GenomeSTRiP package.

To describe what I've tried so far: I changed the config file as instructed on the wiki page to adjust for targeted sequencing, and I created a gender map file as well. I have multiple input BAM files, so I added several -I options, too.

When I ran the modified discovery.sh script, the first error message I got was:

##### ERROR MESSAGE: Fasta file is not indexed: canFam2/work/Canis_lupus_familiaris_assembly2.mask.fasta

so I created a .fai index for the mask FASTA file using the following script:

#!/bin/bash

outdir=canFam2_1_index
readLength=101
reference=/path/Canis_lupus_familiaris_assembly2.mask.fasta
export SV_DIR=/humgen/cnp04/bobh/svtoolkit/stable

# These executables must be on your path.
which java > /dev/null || exit 1
which bwa > /dev/null || exit 1

# The directory containing libbwa.so must be on your LD_LIBRARY_PATH
export LD_LIBRARY_PATH=${SV_DIR}/bwa:${LD_LIBRARY_PATH}

classpath="${SV_DIR}/lib/SVToolkit.jar:${SV_DIR}/lib/gatk/GenomeAnalysisTK.jar"

mkdir -p "${outdir}/work"

# Link the reference into the work directory under its own file name.
localReference="${outdir}/work/$(basename "${reference}")"
if [ ! -e "${localReference}" ]; then
    ln "${reference}" "${localReference}" || exit 1
fi

# Write the .fai index next to the linked FASTA file.
java -cp "${classpath}" -Xmx4g \
    org.broadinstitute.sv.apps.IndexFastaFile \
    -I "${localReference}" \
    -O "${localReference}.fai" \
    || exit 1

# Build the BWA index for the same file.
bwa index -a bwtsw "${localReference}" || exit 1


Then I ran the modified discovery.sh script again and got the following error message:

##### ERROR MESSAGE: Mismatched read pair records: [ {C0196ACXX110720:7:1101:10217:84864	83	chr9	15258469	37	15S86M	chr9	15258470	-84	TGTGCTCTGCCGATCTGAGGGAGAAGCAGGCTCCATACAGGGAGCCTGACGCAGGACTCGATCCCAGGTCTCCAGGATCAGGCCCTGGGCTGAAGGTGGCG	#####?55&52AABA9<3?5(:@A>5;(;=,3@>66;B@BB==))-4A5@C@DB=8F??8?0B)3GBC;?@GCB??:<;;C::;CEAA:4FBFDDD;D@@@	MD:Z:0A83T1	PG:Z:bwa	RG:Z:C0196.7	AM:i:37	NM:i:2	SM:i:37	MQ:i:37	UQ:i:55	XT:i:86}, {C0196ACXX110720:7:1101:10217:84864	83	chr9	15258469	37	15S86M	chr9	15258470	-84	TGTGCTCTGCCGATCTGAGGGAGAAGCAGGCTCCATACAGGGAGCCTGACGCAGGACTCGATCCCAGGTCTCCAGGATCAGGCCCTGGGCTGAAGGTGGCG	#####?55&52AABA9<3?5(:@A>5;(;=,3@>66;B@BB==))-4A5@C@DB=8F??8?0B)3GBC;?@GCB??:<;;C::;CEAA:4FBFDDD;D@@@	MD:Z:0A83T1	PG:Z:bwa	RG:Z:C0196.7	AM:i:37	NM:i:2	SM:i:37	MQ:i:37	UQ:i:55	XT:i:86}, {C0196ACXX110720:7:1101:10217:84864	163	chr9	15258470	37	86M15S	chr9	15258469	84	GAGGGAGAAGCAGGCTCCATACAGGGAGCCTGACGCAGGACTCGATCCCAGGTCTCCAGGATCAGGCCCTGGGCTGAAGGTGGCGAGATCGGAAGAGCGTC	;1=BADDDFD:?C;@GHGHAEDHGGIAEDEDDHHDHDF@FCFBAFHIJE7;@C=C?ACD>6;;AEC@;=;AA?<<=B@@:,488<5<BDC298?@?A9952	MD:Z:83T1T0	PG:Z:bwa	RG:Z:C0196.7	AM:i:37	NM:i:2	SM:i:37	MQ:i:37	UQ:i:43	XT:i:86}, {C0196ACXX110720:7:1101:10217:84864	163	chr9	15258470	37	86M15S	chr9	15258469	84	GAGGGAGAAGCAGGCTCCATACAGGGAGCCTGACGCAGGACTCGATCCCAGGTCTCCAGGATCAGGCCCTGGGCTGAAGGTGGCGAGATCGGAAGAGCGTC	;1=BADDDFD:?C;@GHGHAEDHGGIAEDEDDHHDHDF@FCFBAFHIJE7;@C=C?ACD>6;;AEC@;=;AA?<<=B@@:,488<5<BDC298?@?A9952	MD:Z:83T1T0	PG:Z:bwa	RG:Z:C0196.7	AM:i:37	NM:i:2	SM:i:37	MQ:i:37	UQ:i:43	XT:i:86} ]

I then suspected that the indexing was the problem, so I used the original FASTA file's .fai file for the mask FASTA file. But I still got:

##### ERROR MESSAGE: Mismatched read pair records: [ {C0196ACXX110720:7:1101:10217:84864	83	chr9	15258469	37	15S86M	chr9	15258470	-84	TGTGCTCTGCCGATCTGAGGGAGAAGCAGGCTCCATACAGGGAGCCTGACGCAGGACTCGATCCCAGGTCTCCAGGATCAGGCCCTGGGCTGAAGGTGGCG	#####?55&52AABA9<3?5(:@A>5;(;=,3@>66;B@BB==))-4A5@C@DB=8F??8?0B)3GBC;?@GCB??:<;;C::;CEAA:4FBFDDD;D@@@	MD:Z:0A83T1	PG:Z:bwa	RG:Z:C0196.7	AM:i:37	NM:i:2	SM:i:37	MQ:i:37	UQ:i:55	XT:i:86}, {C0196ACXX110720:7:1101:10217:84864	83	chr9	15258469	37	15S86M	chr9	15258470	-84	TGTGCTCTGCCGATCTGAGGGAGAAGCAGGCTCCATACAGGGAGCCTGACGCAGGACTCGATCCCAGGTCTCCAGGATCAGGCCCTGGGCTGAAGGTGGCG	#####?55&52AABA9<3?5(:@A>5;(;=,3@>66;B@BB==))-4A5@C@DB=8F??8?0B)3GBC;?@GCB??:<;;C::;CEAA:4FBFDDD;D@@@	MD:Z:0A83T1	PG:Z:bwa	RG:Z:C0196.7	AM:i:37	NM:i:2	SM:i:37	MQ:i:37	UQ:i:55	XT:i:86}, {C0196ACXX110720:7:1101:10217:84864	163	chr9	15258470	37	86M15S	chr9	15258469	84	GAGGGAGAAGCAGGCTCCATACAGGGAGCCTGACGCAGGACTCGATCCCAGGTCTCCAGGATCAGGCCCTGGGCTGAAGGTGGCGAGATCGGAAGAGCGTC	;1=BADDDFD:?C;@GHGHAEDHGGIAEDEDDHHDHDF@FCFBAFHIJE7;@C=C?ACD>6;;AEC@;=;AA?<<=B@@:,488<5<BDC298?@?A9952	MD:Z:83T1T0	PG:Z:bwa	RG:Z:C0196.7	AM:i:37	NM:i:2	SM:i:37	MQ:i:37	UQ:i:43	XT:i:86}, {C0196ACXX110720:7:1101:10217:84864	163	chr9	15258470	37	86M15S	chr9	15258469	84	GAGGGAGAAGCAGGCTCCATACAGGGAGCCTGACGCAGGACTCGATCCCAGGTCTCCAGGATCAGGCCCTGGGCTGAAGGTGGCGAGATCGGAAGAGCGTC	;1=BADDDFD:?C;@GHGHAEDHGGIAEDEDDHHDHDF@FCFBAFHIJE7;@C=C?ACD>6;;AEC@;=;AA?<<=B@@:,488<5<BDC298?@?A9952	MD:Z:83T1T0	PG:Z:bwa	RG:Z:C0196.7	AM:i:37	NM:i:2	SM:i:37	MQ:i:37	UQ:i:43	XT:i:86} ]

Now I'm not sure what else I can try to make the discovery module work. Do you have any idea why this is happening? If you need more detailed information, please just let me know.

Thanks for your help.

Bests,
Hyun Ji

Re: [svtoolkit-help] Left read of read pair fails left read test

From: Bob H. <han...@br...> - 2011-07-15 18:54:22

There isn't anything obviously wrong with that read. There are a few
things to try:
a) run Picard ValidateSamFile on the bam file to make sure it's ok
b) extract both reads of the read pair from the bam file so we can see
them, for example:
samtools view in.bam | grep -F 'HWI-ST143_0294:7:1:19426:37893#0'
c) create a small bam file containing just these two reads and see if
the problem persists (and if it does, send me the small bam file so I
can debug).

-Bob
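
Step (b) above can also be done on SAM text in Python, which makes it easy to decode the flag field at the same time. This is a hedged sketch, not part of Genome STRiP or samtools: the function names are hypothetical, the bit meanings come from the SAM format specification, and the second example record (`other_read`) is an invented stand-in used only to show that non-matching reads are skipped. The mate of the failing read is not shown in the thread, so it is not reproduced here.

```python
# Bit meanings from the SAM format specification (subset).
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
}


def decode_flag(flag):
    """Return the names of the flag bits that are set, lowest bit first."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


def records_for_read(sam_lines, read_name):
    """Yield (flag, chrom, pos, mate_pos) for records whose QNAME matches exactly."""
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 11 and fields[0] == read_name:
            yield int(fields[1]), fields[2], int(fields[3]), int(fields[7])


READ_NAME = "HWI-ST143_0294:7:1:19426:37893#0"
example = [
    # Record from the error message, with SEQ and QUAL shortened to "*".
    "\t".join([READ_NAME, "97", "groupXXI", "4523762", "37", "46M",
               "groupXXI", "4529101", "5385", "*", "*"]),
    # Hypothetical unrelated record, included only to show the filtering.
    "\t".join(["other_read", "99", "groupXXI", "100", "37", "46M",
               "groupXXI", "300", "246", "*", "*"]),
]

for flag, chrom, pos, mate_pos in records_for_read(example, READ_NAME):
    print(flag, decode_flag(flag), chrom, pos, mate_pos)
```

For the read in the error message, flag 97 decodes to paired, mate on the reverse strand, and first in pair; the proper-pair bit (0x2) is not set. Matching on the exact read name avoids the partial matches that a plain substring search can produce.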

On 7/15/11 11:47 AM, Philine Feulner wrote:
> Hi Bob,
>
> sorry about my slow reply, we had some issues with our computing cluster.
>
> I have now managed to try the latest interim release you suggested; unfortunately, the error message stays the same (as you can see below).
> I also double-checked that my input is a sorted BAM. It ran successfully through realignment and recalibration with GATK, and SNPs and haplotypes can be called on this file using GATK.
> In addition, I split the combined BAM (paired-end and mate-pair libraries) into separate BAMs, which still gives the same error.
>
> Thanks for your help again,
> Philine
>
>
>
>
> ##### ERROR ------------------------------------------------------------------------------------------
> ##### ERROR stack trace
> java.lang.IllegalArgumentException: Left read of read pair fails left read test: HWI-ST143_0294:7:1:19426:37893#0      97      groupXXI        4523762 37      46M     groupXXI        4529101 5385    CACTAAGTGCTTCCTCGATTTCGCCAAGATTTGTTCAGCATGGAAC  7767687877776676387776376787877768767777776886  X0:i:1  X1:i:0  MD:Z:46 RG:Z:BS25pair   XG:i:0  AM:i:37 NM:i:0  SM:i:37 XM:i:0  XO:i:0  OQ:Z:IIIIIHIIIIEIIIIIHIIIIIIIIIGIHIIIIHIIGIGIIIBHII     XT:A:U
>          at org.broadinstitute.sv.util.ReadPair.create(ReadPair.java:135)
>          at org.broadinstitute.sv.discovery.ReadPairRecordFilter.createReadPair(ReadPairRecordFilter.java:300)
>          at org.broadinstitute.sv.discovery.ReadPairRecordFilter.generateReadPairs(ReadPairRecordFilter.java:221)
>          at org.broadinstitute.sv.discovery.ReadPairRecordFilter.filterReadPairs(ReadPairRecordFilter.java:97)
>          at org.broadinstitute.sv.discovery.DeletionDiscoveryAlgorithm.finishReadPairSelection(DeletionDiscoveryAlgorithm.java:216)
>          at org.broadinstitute.sv.discovery.DeletionDiscoveryAlgorithm.runDiscovery(DeletionDiscoveryAlgorithm.java:166)
>          at org.broadinstitute.sv.discovery.SVDiscoveryWalker.onTraversalDone(SVDiscoveryWalker.java:165)
>          at org.broadinstitute.sv.discovery.SVDiscoveryWalker.onTraversalDone(SVDiscoveryWalker.java:44)
>          at org.broadinstitute.sting.gatk.executive.Accumulator$StandardAccumulator.finishTraversal(Accumulator.java:129)
>          at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:85)
>          at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:236)
>          at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:116)
>          at org.broadinstitute.sv.main.SVCommandLine.execute(SVCommandLine.java:110)
>          at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:221)
>          at org.broadinstitute.sv.main.SVCommandLine.main(SVCommandLine.java:72)
>          at org.broadinstitute.sv.main.SVDiscovery.main(SVDiscovery.java:21)
> ##### ERROR ------------------------------------------------------------------------------------------
> ##### ERROR A GATK RUNTIME ERROR has occurred (version 1.0.5718M):
> ##### ERROR
> ##### ERROR Please visit the wiki to see if this is a known problem
> ##### ERROR If not, please post the error, with stack trace, to the GATK forum
> ##### ERROR Visit our wiki for extensive documentation http://www.broadinstitute.org/gsa/wiki
> ##### ERROR Visit our forum to view answers to commonly asked questions http://getsatisfaction.com/gsa
> ##### ERROR
> ##### ERROR MESSAGE: Left read of read pair fails left read test: HWI-ST143_0294:7:1:19426:37893#0      97      groupXXI        4523762 37      46M     groupXXI        4529101 5385    CACTAAGTGCTTCCTCGATTTCGCCAAGATTTGTTCAGCATGGAAC  7767687877776676387776376787877768767777776886  X0:i:1  X1:i:0  MD:Z:46 RG:Z:BS25pair   XG:i:0  AM:i:37 NM:i:0  SM:i:37 XM:i:0  XO:i:0  OQ:Z:IIIIIHIIIIEIIIIIHIIIIIIIIIGIHIIIIHIIGIGIIIBHII     XT:A:U
> ##### ERROR ------------------------------------------------------------------------------------------
>
>
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Dr Philine Feulner
> Westfälische Wilhelms University
> Institute for Evolution and Biodiversity
> Evolutionary Bioinformatics Group
> Hüfferstrasse 1
> 48149 Münster
> Germany
> Tel: +49 (0) 251 83 21636
> Fax: +49 (0) 251 83 24668
> Email: p.f...@un...
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

3 messages have been excluded from this view by a project administrator.
