From: Petr D. <pd...@sa...> - 2012-03-08 07:22:22
gunzip -c valid-3.3.vcf.gz | vcf-convert -v 4.1 | bgzip -c > valid-4.1.vcf.gz
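
A minimal sketch of why `gunzip -c` is the safer first stage of this pipeline: it reads exactly the file it is given, whereas zcat on Mac OS X looks for the name with `.Z` appended. The helper name `decompress` and the throwaway sample file are assumptions made for illustration only; neither is part of VCFtools.

```shell
# decompress is a hypothetical helper, not part of VCFtools.
# gunzip -c writes the decompressed data to stdout and leaves the
# input file untouched, so it can feed a pipeline directly.
decompress() {
    gunzip -c "$1"
}

# Self-contained demonstration on a throwaway file.
demo_dir=$(mktemp -d)
printf '##fileformat=VCFv3.3\n' | gzip -c > "$demo_dir/sample.vcf.gz"
decompress "$demo_dir/sample.vcf.gz"
```

With VCFtools installed, the same helper would slot into the conversion as `decompress valid-3.3.vcf.gz | vcf-convert -v 4.1 | bgzip -c > valid-4.1.vcf.gz`.
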
On Wed, 2012-03-07 at 21:31 +0000, Paul Denny wrote:
> Dear Petr
>
> I've installed vcftools_0.1.8a and done some tests with one of the files supplied - using the file called
>
> valid-3.3.vcf
>
> I compressed with bgzip, then indexed with tabix. Then I tried to convert it into v4.0 using the command:
>
> zcat /Applications/vcftools_0.1.8a/examples/valid-3.3.vcf.gz | vcf-convert > valid-3.3to4.vcf.gz
>
> and got these errors:
>
> zcat: /Applications/vcftools_0.1.8a/examples/valid-3.3.vcf.gz.Z: No such file or directory
> Downgrading of VCF versions is experimental: expect troubles!
> Broken VCF header, no column names?
> at Vcf.pm line 171
> Vcf::throw('Vcf4_1=HASH(0x7feb22830bf0)', 'Broken VCF header, no column names?') called at Vcf.pm line 860
> VcfReader::_read_column_names('Vcf4_1=HASH(0x7feb22830bf0)') called at Vcf.pm line 602
> VcfReader::parse_header('Vcf4_1=HASH(0x7feb22830bf0)') called at /usr/bin/vcf-convert line 63
> main::convert_file('HASH(0x7feb22803ed0)') called at /usr/bin/vcf-convert line 12
>
> I'm using Mac OS X (Lion version 10.7.3).
>
> suggestions?
>
> regards,
> Paul
>
>
> Mobile: +44-07922078038
> E Mail: pau...@gm...
> Web: http://www.thesmartcoachingcompany.com/smartblog/
> Twitter: http://twitter.com/pauldennyuk
> LinkedIn: http://linkd.in/giXwQj
>
>
>
>
> On 7 Mar 2012, at 07:40, Petr Danecek wrote:
>
> > Hi Paul,
> >
> > what is the command line you are running and have you tried with the
> > latest version of VCFtools? Before reporting errors it is advisable to
> > check the latest SVN revision first.
> >
> > What platform are you running this on? Zcat on Mac OS likes to
> > append a .Z suffix to the argument, which seems to be happening in your
> > case. The error you are observing is most likely because vcf-convert is
> > unable to read the VCF as a result. Older versions of VCFtools used
> > zcat, but this was later changed to `gunzip -c` for compatibility
> > reasons. Upgrading to a newer version should help.
> >
> > Best,
> > Petr
> >
> >
> > On Tue, 2012-03-06 at 16:31 +0000, Paul Denny wrote:
> >> Dear Laura
> >>
> >> thx for the advice, I've now done that, but am now getting these errors:
> >>
> >> zcat: /Volumes/MacExternal/WTSI-17Genomes/20111102-snps-all.annotated.vcf.gz.Z: No such file or directory
> >> Broken VCF header, no column names?
> >> at /Applications/vcftools_0.1.7/perl//Vcf.pm line 171
> >> Vcf::throw('Vcf4_1=HASH(0x7faa6b930b38)', 'Broken VCF header, no column names?') called at /Applications/vcftools_0.1.7/perl//Vcf.pm line 845
> >> VcfReader::_read_column_names('Vcf4_1=HASH(0x7faa6b930b38)') called at /Applications/vcftools_0.1.7/perl//Vcf.pm line 589
> >> VcfReader::parse_header('Vcf4_1=HASH(0x7faa6b930b38)') called at /usr/bin/vcf-convert line 63
> >> main::convert_file('HASH(0x7faa6b803ed0)') called at /usr/bin/vcf-convert line 12
> >> <rnal/WTSI-17Genomes/20111102-indels-all.annotated.vcf.gz chr7:24890278-25390278 > chr7-17genomes-25Mbpeak-indels.vcf
> >>
> >> Which suggests that it doesn't like the header, but I don't know what to try next. The original files can be viewed here:
> >>
> >> ftp://ftp-mouse.sanger.ac.uk/current_snps/20111102-snps-all.annotated.vcf.gz
> >> ftp://ftp-mouse.sanger.ac.uk/current_snps/20111102-snps-all.annotated.vcf.gz.tbi
> >>
> >> Best regards,
> >>
> >> Paul Denny
> >>
> >>
> >>
> >>
> >>
> >> On 5 Mar 2012, at 14:55, Laura Clarke wrote:
> >>
> >>> Hello Paul
> >>>
> >>> You need to add the directory which contains the vcftools perl code to
> >>> your PERL5LIB env variable
> >>>
> >>> thanks
> >>>
> >>> Laura
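
A minimal sketch of the PERL5LIB advice above. The directory in `VCFTOOLS_PERL` is an assumption taken from the paths in the error messages in this thread; substitute whichever directory actually contains Vcf.pm on your machine.

```shell
# VCFTOOLS_PERL is an assumed location; use the directory that holds Vcf.pm.
VCFTOOLS_PERL="/Applications/vcftools_0.1.7/perl"

# Prepend it to PERL5LIB, keeping any value that is already set.
export PERL5LIB="$VCFTOOLS_PERL${PERL5LIB:+:$PERL5LIB}"

# Show the resulting search path. Perl adds these entries to @INC,
# which is the list printed in the "Can't locate Vcf.pm" error.
echo "PERL5LIB=$PERL5LIB"
```

Once the path is right, `perl -MVcf -e 1` should exit silently instead of reporting that Vcf.pm cannot be located. The setting only lasts for the current shell session unless the export line is also placed in a shell startup file.
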
> >>>
> >>> On Mon, Mar 5, 2012 at 8:16 AM, Paul Denny <pau...@gm...> wrote:
> >>>> Dear VCF_tools folks
> >>>>
> >>>> thanks for help in the past. I have now got tabix, bgzip, etc., working, but want to convert a VCF file from version 3.3 into v4.0 to extract some data. I've tried using this command:
> >>>>
> >>>> zcat file.vcf.gz | vcf-convert -r reference.fa > out.vcf
> >>>>
> >>>> my starting file is called:
> >>>>
> >>>> 20111102-snps-all.annotated.vcf.gz
> >>>>
> >>>> and I've installed the various VCF tools in my
> >>>>
> >>>> /usr/bin
> >>>>
> >>>> directory, where tabix has worked for indexing VCF files and selecting subsets of data from within a compressed, indexed file,
> >>>>
> >>>> but I get this error message:
> >>>>
> >>>> zcat: /Volumes/MacExternal/WTSI-17Genomes/20111102-snps-all.annotated.vcf.gz.Z: No such file or directory
> >>>> Can't locate Vcf.pm in @INC (@INC contains: /Users/p.denny/Applications/vcftools_0.1.7/bin/ /Library/Perl/5.12/darwin-thread-multi-2level /Library/Perl/5.12 /Network/Library/Perl/5.12/darwin-thread-multi-2level /Network/Library/Perl/5.12 /Library/Perl/Updates/5.12.3 /System/Library/Perl/5.12/darwin-thread-multi-2level /System/Library/Perl/5.12 /System/Library/Perl/Extras/5.12/darwin-thread-multi-2level /System/Library/Perl/Extras/5.12 .) at /usr/bin/vcf-convert line 6.
> >>>> BEGIN failed--compilation aborted at /usr/bin/vcf-convert line 6.
> >>>>
> >>>> Because I'm working only with SNPs, I have not included a FASTA-format indexed reference sequence file in the command: I believed it was not obligatory, and I don't know where to get a mouse FASTA-format indexed reference sequence.
> >>>>
> >>>> Suggestions, please!
> >>>>
> >>>> thanks,
> >>>>
> >>>> Paul Denny
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On 7 Feb 2012, at 09:36, Petr Danecek wrote:
> >>>>
> >>>>> Hi Paul,
> >>>>>
> >>>>> first of all, as described in the online documentation, bgzip (not gzip)
> >>>>> must be used to compress your VCFs. You should be able to run the
> >>>>> command shown in the error message below from the command line without
> >>>>> errors (tabix -l your.vcf.gz). If that does not work, then your
> >>>>> installation is messed up. The exact location of tabix binary does not
> >>>>> matter, but it must be in one of the search directories defined by the
> >>>>> PATH environment variable. This is a standard shell variable, please
> >>>>> google it up to learn more about this.
> >>>>> http://vcftools.sourceforge.net/docs.html
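
A rough way to tell whether a .vcf.gz file was written by bgzip or by plain gzip, sketched under stated assumptions: BGZF blocks begin with the gzip magic number followed by a header flag byte of 04 (extra field present), while output from `gzip -c` normally does not set that flag. The helper name `is_bgzf` is hypothetical, and the check is only a heuristic; `tabix -l your.vcf.gz` remains the real test.

```shell
# is_bgzf is a hypothetical helper, not part of tabix or VCFtools.
# It reads the first four bytes of a file and compares them with the
# gzip magic number (1f 8b), the deflate method (08) and the
# "extra field" flag (04) that BGZF blocks set.
is_bgzf() {
    header=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    [ "$header" = "1f8b0804" ]
}

# Self-contained demonstration with ordinary gzip output.
demo_dir=$(mktemp -d)
printf '##fileformat=VCFv4.0\n' | gzip -c > "$demo_dir/plain.vcf.gz"
if is_bgzf "$demo_dir/plain.vcf.gz"; then
    echo "looks like bgzip output"
else
    echo "plain gzip: recompress with bgzip before running tabix"
fi
```
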
> >>>>>
> >>>>> By the way, you should address questions about VCFtools to the
> >>>>> vcftools-help mailing list only, samtools-help list is dedicated to,
> >>>>> well, samtools.
> >>>>> ;-)
> >>>>>
> >>>>> Best,
> >>>>> Petr
> >>>>>
> >>>>>
> >>>>> On Mon, 2012-02-06 at 17:03 +0000, Paul Denny wrote:
> >>>>>> Dear VCFtools folks,
> >>>>>>
> >>>>>> I'm trying to use the vcf-isec command to find differences between two vcf.gz files, but get this error message:
> >>>>>>
> >>>>>> pauls-MacBook-Pro:P110202 p.denny$ ./vcf-isec -c chr7-BALBOla-B6-variants-25MbGWASpeak.vcf.gz chr7-CBAOla-B6-variants-25MbGWASpeak.vcf.gz | ./bgzip -c > BALB-CBA-chr7-25Mb.vcf.gz
> >>>>>> Can't exec "tabix": No such file or directory at Vcf.pm line 2155.
> >>>>>> at Vcf.pm line 2155
> >>>>>> The command "tabix -l chr7-BALBOla-B6-variants-25MbGWASpeak.vcf.gz" exited with an error. Is the file tabix indexed?
> >>>>>>
> >>>>>> at Vcf.pm line 171
> >>>>>> Vcf::throw('Vcf4_0=HASH(0x7f7f8182d568)', 'The command "tabix -l chr7-BALBOla-B6-variants-25MbGWASpeak.v...') called at Vcf.pm line 2156
> >>>>>> VcfReader::get_chromosomes('Vcf4_0=HASH(0x7f7f8182d568)') called at ./vcf-isec line 228
> >>>>>> main::vcf_isec('HASH(0x7f7f81828908)') called at ./vcf-isec line 12
> >>>>>>
> >>>>>> I have attached the vcf.gz files and a screenshot of the directory in which they are kept. As you can see, tabix, gzip and vcf.pm are all in the same directory, so any help would be appreciated.
> >>>>>>
> >>>>>> Best regards,
> >>>>>>
> >>>>>> Paul Denny
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 2 Feb 2012, at 09:11, Sendu Bala wrote:
> >>>>>>
> >>>>>>> Either include the directory that contains tabix in your PATH environment variable, or move the tabix executable to a directory that is already in your PATH.
> >>>>>>>
> >>>>>>> http://kb.iu.edu/data/acar.html
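
A minimal sketch of the first option above, adding the tabix directory to PATH. `TABIX_DIR` is an assumed location (the directory created by unpacking and building tabix-0.2.5 in your home directory); adjust it to wherever the tabix and bgzip binaries actually live.

```shell
# TABIX_DIR is an assumed location of the compiled tabix and bgzip binaries.
TABIX_DIR="$HOME/tabix-0.2.5"

# Append the directory to PATH for the current shell session.
export PATH="$PATH:$TABIX_DIR"

# Report whether the shell can now find tabix.
if command -v tabix >/dev/null 2>&1; then
    echo "tabix found at $(command -v tabix)"
else
    echo "tabix not found; check TABIX_DIR"
fi
```

The change lasts only for the current session. To keep it, the export line can go in a shell startup file such as ~/.bash_profile.
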
> >>>>>>>
> >>>>>>>
> >>>>>>> On 1 Feb 2012, at 20:06, Paul Denny wrote:
> >>>>>>>> Dear Quang & VCF Tools folks
> >>>>>>>>
> >>>>>>>> I'm now using the vcf-tools and would appreciate help with directory structure when using the perl script
> >>>>>>>>
> >>>>>>>> vcf-query
> >>>>>>>>
> >>>>>>>> to select a subset of data rows from a file.vcf.gz file. When I use it, I get these errors:
> >>>>>>>>
> >>>>>>>> pauls-MacBook-Pro:perl p.denny$ ./vcf-query /Volumes/MacExternal/P110202/WTCHG_26484.bam.vcf.gz -r chr4:66480073-66595329
> >>>>>>>> Can't exec "tabix": No such file or directory at Vcf.pm line 217.
> >>>>>>>> tabix -h /Volumes/MacExternal/P110202/WTCHG_26484.bam.vcf.gz chr4:66480073-66595329 |: No such file or directory
> >>>>>>>> at Vcf.pm line 171
> >>>>>>>> Vcf::throw('Vcf=HASH(0x7fe4fa828ff8)', 'tabix -h /Volumes/MacExternal/P110202/WTCHG_26484.bam.vcf.gz...') called at Vcf.pm line 217
> >>>>>>>> Vcf::_open('Vcf=HASH(0x7fe4fa828ff8)', 'region', 'chr4:66480073-66595329', 'print_header', 1) called at Vcf.pm line 165
> >>>>>>>> Vcf::new('Vcf', 'file', '/Volumes/MacExternal/P110202/WTCHG_26484.bam.vcf.gz', 'region', 'chr4:66480073-66595329', 'print_header', 1) called at ./vcf-query line 156
> >>>>>>>> main::read_data('HASH(0x7fe4fa803ed0)') called at ./vcf-query line 12
> >>>>>>>> pauls-MacBook-Pro:perl p.denny$
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> It looks like the vcf-query script is trying to find the tabix executable, but is not looking in the correct directory. How should I either structure the directories, or edit the script(s) so that they 'know' where to look?
> >>>>>>>>
> >>>>>>>> NB I'm a complete newcomer to perl scripting...
> >>>>>>>>
> >>>>>>>> many thanks,
> >>>>>>>>
> >>>>>>>> Paul
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 18 Jan 2012, at 04:25, Quang Trinh wrote:
> >>>>>>>>
> >>>>>>>>> - Download tabix
> >>>>>>>>>> tar xvjf tabix-0.2.5.tar.bz2
> >>>>>>>>>> cd tabix-0.2.5
> >>>>>>>>>> make
> >>>>>>>>>> ./tabix
> >>>>>>>>>
> >>>>>>>>> see http://samtools.sourceforge.net/tabix.shtml on how to use tabix.
> >>>>>>>>>
> >>>>>>>>> Q
> >>>>>>>>>
> >>>>>>>>> On Fri, Jan 6, 2012 at 8:05 AM, Paul Denny <pau...@gm...> wrote:
> >>>>>>>>>> Dear SAM tools guys / gals
> >>>>>>>>>>
> >>>>>>>>>> I am a Unix newbie, wanting to use tabix and bgzip from SAM tools to
> >>>>>>>>>> compress and index some .VCF files. I've downloaded tabix-0.2.5 from here:
> >>>>>>>>>>
> >>>>>>>>>> http://sourceforge.net/projects/samtools/files/tabix/
> >>>>>>>>>>
> >>>>>>>>>> and also tried reading this:
> >>>>>>>>>>
> >>>>>>>>>> http://samtools.sourceforge.net/tabix.shtml
> >>>>>>>>>>
> >>>>>>>>>> about tabix, but do not know what to do with the downloaded files. What do
> >>>>>>>>>> they need in order to be run on a Mac OS X (10.5.8) version of Unix? Please
> >>>>>>>>>> point me at any advice.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Best regards,
> >>>>>>>>>>
> >>>>>>>>>> Paul Denny
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> _______________________________________________
> >>>>>>>>>> Samtools-help mailing list
> >>>>>>>>>> Sam...@li...
> >>>>>>>>>> https://lists.sourceforge.net/lists/listinfo/samtools-help
> >>>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>> _______________________________________________ Vcftools-help mailing list Vcf...@li... https://lists.sourceforge.net/lists/listinfo/vcftools-help
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> The Wellcome Trust Sanger Institute is operated by Genome Research
> >>>>> Limited, a charity registered in England with number 1021457 and a
> >>>>> company registered in England with number 2742969, whose registered
> >>>>> office is 215 Euston Road, London, NW1 2BE.
> >>>>
> >>>>
> >>
> >>
> >
> >
> >
> >
>