Re: [svtoolkit-help] svtoolkit-help Digest, Vol 23, Issue 1
From: John B. <jo...@we...> - 2013-05-28 20:36:51
Sorry, I missed the details.

Initial command line:

java -cp /usr/local/genetics/svtoolkit_1.04.1068/lib/SVToolkit.jar:/usr/local/genetics/svtoolkit_1.04.1068/lib/gatk/GenomeAnalysisTK.jar:/usr/local/genetics/svtoolkit_1.04.1068/lib/gatk/Queue.jar -Xmx4g \
    org.broadinstitute.sting.queue.QCommandLine \
    -S /usr/local/genetics/svtoolkit_1.04.1068/qscript/SVPreprocess.q \
    -S /usr/local/genetics/svtoolkit_1.04.1068/qscript/SVQScript.q \
    -gatk /usr/local/genetics/svtoolkit_1.04.1068/lib/gatk/GenomeAnalysisTK.jar \
    -cp /usr/local/genetics/svtoolkit_1.04.1068/lib/SVToolkit.jar:/usr/local/genetics/svtoolkit_1.04.1068/lib/gatk/GenomeAnalysisTK.jar:/usr/local/genetics/svtoolkit_1.04.1068/lib/gatk/Queue.jar \
    -configFile conf/hs37d5.conf \
    -tempDir /well/htseq/ILLUMINA-WGS/SV-Freeze5/OUTPUT3/tmpdir \
    -R /users/johnb/SV/Genomes/hs37d5.fa \
    -reduceInsertSizeDistributions \
    -genomeMaskFile /users/johnb/SV/Genomes/hs37d5.mask.fa \
    -genderMapFile OUTPUT3/gender.map \
    -runDirectory OUTPUT3/AS_CLL_156GL \
    -md OUTPUT3/AS_CLL_156GL/metadata \
    -computeGCProfiles \
    -copyNumberMaskFile /users/johnb/SV/Genomes/hs37d5.cn2mask.fa \
    -jobLogDir OUTPUT3/AS_CLL_156GL/logs \
    -I AS_CLL_156GL.bam

Contents of OUTPUT3/AS_CLL_156GL/logs/SVPreprocess-11.out:

INFO 12:54:38,270 HelpFormatter - ----------------------------------------------------------
INFO 12:54:38,272 HelpFormatter - Program Name: org.broadinstitute.sv.apps.ComputeGCProfiles
INFO 12:54:38,275 HelpFormatter - Program Args: -I /gpfs1/well/htseq/ILLUMINA-WGS/SV-Freeze5/AS_CLL_156GL.bam -O /gpfs1/well/htseq/ILLUMINA-WGS/SV-Freeze5/OUTPUT3/AS_CLL_156GL/metadata/gcprofile/AS_CLL_156GL.bam.gcprof.zip -R /users/johnb/SV/Genomes/hs37d5.fa -md OUTPUT3/AS_CLL_156GL/metadata -referenceProfile OUTPUT3/AS_CLL_156GL/metadata/gcprofile/reference.gcprof.zip -genomeMaskFile /users/johnb/SV/Genomes/hs37d5.mask.fa -copyNumberMaskFile /users/johnb/SV/Genomes/hs37d5.cn2mask.fa -configFile conf/hs37d5.conf
INFO 12:54:38,276 HelpFormatter - Date/Time: 2013/05/23 12:54:38
INFO 12:54:38,276 HelpFormatter - ----------------------------------------------------------
INFO 12:54:38,276 HelpFormatter - ----------------------------------------------------------
INFO 12:54:38,298 ComputeGCProfiles - Opening reference sequence ...
INFO 12:54:38,299 ComputeGCProfiles - Opened reference sequence.
INFO 12:54:38,299 ComputeGCProfiles - Opening genome mask ...
INFO 12:54:38,299 ComputeGCProfiles - Opened genome mask.
INFO 12:54:38,300 ComputeGCProfiles - Opening copy number mask ...
INFO 12:54:38,300 ComputeGCProfiles - Opened copy number mask.
INFO 12:54:38,300 ComputeGCProfiles - Initializing algorithm ...
#INFO: ReadCountAlgorithm: detected metadata version 1, forcing legacy behavior
INFO 12:54:38,338 ComputeGCProfiles - Algorithm initialized.
INFO 12:54:38,338 ComputeGCProfiles - Opening reference GC profile ...
INFO 12:54:38,365 ComputeGCProfiles - Opened reference GC profile.
INFO 12:54:38,366 ComputeGCProfiles - Processing input file org.broadinstitute.sv.dataset.SAMFileLocation@986cff74 ...
Exception in thread "main" java.lang.RuntimeException: Invalid sequence position: 17:81195230
        at org.broadinstitute.sv.commandline.CommandLineProgram.execute(CommandLineProgram.java:46)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:237)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:147)
        at org.broadinstitute.sv.commandline.CommandLineProgram.run(CommandLineProgram.java:24)
        at org.broadinstitute.sv.apps.ComputeGCProfiles.main(ComputeGCProfiles.java:120)
Caused by: java.lang.IllegalArgumentException: Invalid sequence position: 17:81195230
        at org.broadinstitute.sv.mask.GenomeMaskFastaFile.getMaskBit(GenomeMaskFastaFile.java:80)
        at org.broadinstitute.sv.metadata.gc.GCProfileAlgorithm.processRecord(GCProfileAlgorithm.java:107)
        at org.broadinstitute.sv.metadata.gc.GCProfileAlgorithm.processSAMFile(GCProfileAlgorithm.java:124)
        at org.broadinstitute.sv.apps.ComputeGCProfiles.run(ComputeGCProfiles.java:181)
        at org.broadinstitute.sv.commandline.CommandLineProgram.execute(CommandLineProgram.java:38)
        ... 4 more

On Tue, May 28, 2013 at 9:24 PM, <svt...@li...> wrote:

> Today's Topics:
>
>    1. "Invalid sequence position" (John Broxholme)
>    2. Re: "Invalid sequence position" (Bob Handsaker)
>    3. Re: "Invalid sequence position" (John Broxholme)
>    4. Re: "Invalid sequence position" (Bob Handsaker)
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Thu, 23 May 2013 15:21:57 +0100
> From: John Broxholme <jo...@we...>
> Subject: [svtoolkit-help] "Invalid sequence position"
> To: svt...@li...
> Cc: Linda Hughes <li...@we...>
> Message-ID:
>         <CADGtfyC7wyVAqJfRO3SAwEPVuhGQtjrGJMrZxu3B=
>         Xjv...@ma...>
> Content-Type: text/plain; charset="utf-8"
>
> Pre-processing of one (of 270+) deep BAM files has failed with:
>
> ...
> INFO 12:54:38,366 ComputeGCProfiles - Processing input file
> org.broadinstitute.sv.dataset.SAMFileLocation@986cff74 ...
> Exception in thread "main" java.lang.RuntimeException: Invalid sequence
> position: 17:81195230
> ...
>
> Where would this have come from? The pipeline has been the same to prepare
> all 270+ of the (25x deep) BAMs, and this is the only failure. Any
> suggestions on what might be wrong and how to fix it will be most welcome!
>
> Thanks
> John
>
> --
> John Broxholme
> Wellcome Trust Centre for Human Genetics
> Roosevelt Drive, Oxford, OX3 7BN, UK
>
> ------------------------------
>
> Message: 2
> Date: Thu, 23 May 2013 10:35:31 -0400
> From: Bob Handsaker <han...@br...>
> Subject: Re: [svtoolkit-help] "Invalid sequence position"
> To: svt...@li...
> Message-ID: <519...@br...>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Including the stack trace would be most helpful.
> If you are using an hg19-based reference, then chr17 is only 81195210 long.
> Does ValidateSamFile like this bam?
> This may be related to the bwa idiosyncrasy of occasionally leaving
> nominally-invalid POS fields in unmapped records.
> If so, I will try to fix it if you can send me the command line and
> stack trace.
> -Bob
>
> ------------------------------
>
> Message: 3
> Date: Tue, 28 May 2013 15:37:18 +0100
> From: John Broxholme <jo...@we...>
> Subject: Re: [svtoolkit-help] "Invalid sequence position"
> To: svt...@li...
> Cc: Linda Hughes <li...@we...>
> Message-ID: <CAD...@ma...>
> Content-Type: text/plain; charset="utf-8"
>
> Hi Bob,
>
> Thanks for the quick response. Yes it is NCBI build37 (actually the 1000G
> reference with decoy). I can fill in a bit more detail on this. The error
> I sent earlier was using svtoolkit build 1.04.1068 (which I couldn't use at
> first since dependency on some new(?) R package required an upgrade of R).
> I first saw this using our production version, build 1.04.857:
>
> ...
> INFO 11:41:09,579 ComputeGCProfiles - Opened reference GC profile.
> INFO 11:41:09,579 ComputeGCProfiles - Processing input file
> /gpfs1/well/htseq/ILLUMINA-WGS/SV-Freeze5/AS_CLL_156GL.bam ...
> Exception in thread "main" java.lang.RuntimeException: Invalid sequence
> position: 17:81195216
>         at org.broadinstitute.sv.commandline.CommandLineProgram.execute(CommandLineProgram.java:40)
>         at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:221)
>         at org.broadinstitute.sv.commandline.CommandLineProgram.run(CommandLineProgram.java:23)
>         at org.broadinstitute.sv.apps.ComputeGCProfiles.main(ComputeGCProfiles.java:104)
> Caused by: java.lang.IllegalArgumentException: Invalid sequence position:
> 17:81195216
> ...
> I note that the invalid coordinate reported differs - 81195216 vs 81195230,
> although this is the same BAM.
>
> Anyhow, I have since updated picard-tools to 1.92 and run picard
> ValidateSamFile on the offending BAM.
> The only error (in 10k errors) I see is "Mate not found for paired read" (I
> see many of these), which I assume has been caused by deduping (I used
> picard for this).
> Meanwhile I have removed reads mapping to the last 100bp of chr17 from that
> BAM, which is an ugly fix but it allows me to progress a bit.
> And I have to resolve something else causing crashes with the current
> (1.04.1068) version, which would be the subject of another thread...
>
> John
>
> --
> John Broxholme
> Wellcome Trust Centre for Human Genetics
> Roosevelt Drive, Oxford, OX3 7BN, UK
> Tel: (+44 1865) 287611 FAX: 287697
>
> ------------------------------
>
> Message: 4
> Date: Tue, 28 May 2013 16:23:48 -0400
> From: Bob Handsaker <han...@br...>
> Subject: Re: [svtoolkit-help] "Invalid sequence position"
> To: svt...@li...
> Message-ID: <51A...@br...>
> Content-Type: text/plain; charset="iso-8859-1"
>
> This doesn't look like the entire stack trace. Did you truncate it?
> I'm happy to try to help, but you need to send me the full stack trace
> and you should send the full command line as well.
> -Bob
>
> ------------------------------
>
> End of svtoolkit-help Digest, Vol 23, Issue 1
> *********************************************

--
John Broxholme
Wellcome Trust Centre for Human Genetics
Roosevelt Drive, Oxford, OX3 7BN, UK
Tel: (+44 1865) 287611 FAX: 287697
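
For anyone hitting the same error: the thread suggests the failing coordinate belongs to a record whose POS lies past the end of chr17 (81195210 bp, per Bob's reply), possibly an unmapped record. One way to check this is to scan the SAM text of the BAM for such records. The following is only a minimal sketch and is not part of SVToolkit; the function name find_out_of_range is made up for illustration, and it assumes the input is SAM text with @SQ header lines (for example the output of samtools view -h).

```python
def find_out_of_range(sam_lines):
    """Yield (qname, rname, pos, ref_len, is_unmapped) for every record whose
    POS is greater than the length of its reference sequence.

    Hypothetical helper for illustration only. Reference lengths are taken
    from the @SQ header lines (SN = name, LN = length)."""
    ref_len = {}
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            # Header line; only @SQ lines carry sequence lengths.
            if line.startswith("@SQ"):
                tags = dict(
                    field.split(":", 1)
                    for field in line.split("\t")[1:]
                    if ":" in field
                )
                ref_len[tags["SN"]] = int(tags["LN"])
            continue
        fields = line.split("\t")
        qname = fields[0]
        flag = int(fields[1])
        rname = fields[2]
        pos = int(fields[3])  # 1-based leftmost position
        if rname in ref_len and pos > ref_len[rname]:
            # Bit 0x4 of FLAG marks the read itself as unmapped.
            yield qname, rname, pos, ref_len[rname], bool(flag & 0x4)
```

To use it, pipe the SAM text into a small script that passes sys.stdin to find_out_of_range and prints each tuple. If every hit has is_unmapped set to True, that would be consistent with the bwa behaviour Bob describes; it does not by itself confirm the cause.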