From: Eric G. <shi...@gm...> - 2016-10-13 21:17:23
Thank you very much.

Maybe it is because of my VCF versions: one of my files is VCFv4.0 and the
other is VCFv4.2. Can I convert these files from one version to the other?
If this is not the problem, I guess I am going to remove the tags as you
suggest.

Best Wishes,
Eric

2016-10-13 14:25 GMT-05:00 Petr Danecek <pd...@sa...>:

> Both programs, bcftools merge and vcf-merge, expect valid VCFs on input.
> If you don't care about PL and GQ tags, you can remove them using
> `bcftools annotate -x` and merge afterwards.
>
> On Thu, 2016-10-13 at 13:14 -0500, Eric González wrote:
> > Hi
> >
> > Using bcftools I get the following error:
> >
> > Could not parse the header line:
> > "##SAMPLE=<ID=LANMLR17B002:250550514,Barcode=CTCGA,BarcodeWell=A04,Col=4,DNASample=LANMLR17B002,DNA_Plate=PTXB73_RIL,Enzyme=ApeKI,Flowcell=C81KJANXX,Flowcell_Lane=C81KJANXX_6,FullSampleName=LANMLR17B002:C81KJANXX:6:250550514,Genus=Zea,Lane=6,LibraryPlate=PTXB73_RIL,LibraryPlateID=450024469,LibraryPrepID=250550514,Population= ,Row=A,SampleDNA_Well=A04,Species=mays,Status=private>"
> >
> > This is because of the bcftools version, right? I am using bcftools/1.2
> >
> > I also get the following warnings:
> >
> > [W::bcf_hdr_check_sanity] PL should be declared as Number=G
> > Warning: trying to combine "GQ" tag definitions of different types
> > Warning: trying to combine "PL" tag definitions of different lengths
> > Warning: trying to combine "PL" tag definitions of different types
> >
> > I called the SNPs of my two files using different programs, one using
> > TASSEL and the other using GATK. Can I convert the INFO fields of
> > these files? Or what do you think could be the solution to this? Call
> > the SNPs using the same software?
> >
> > 2016-10-13 2:41 GMT-05:00 Petr Danecek <pd...@sa...>:
> > Hi Eric,
> >
> > the error messages indicate that the VCF files on input have an
> > incorrect number of values for the PL field at position 1:201777216.
> > Please check vcf-validator and the VCF specification
> > http://samtools.github.io/hts-specs/VCFv4.3.pdf
> >
> > By the way, the perl side of vcf-tools has been made obsolete by the
> > new (and much faster) bcftools
> > http://samtools.github.io/bcftools/
> >
> > Best wishes,
> > Petr
> >
> > On Wed, 2016-10-12 at 16:59 -0500, Eric González wrote:
> > > Hi
> > >
> > > I am using vcf-merge to join two VCF files; one contains the data of
> > > 56 individuals and the other of only one. One of my files has 524754
> > > SNPs and the other 105878; however, when I join them I obtain just a
> > > few SNPs (~50000) and the following error message:
> > >
> > > Wrong number of values in LANMLR17B064:250550525/PL at 1:201777216 .. nAlleles=4, nValues=3.
> > > Expected 10 values for diploid genotypes or 4 for haploid genotypes.
> > >
> > >  at /LUSTRE/storage/data/software/vcftools/lib/perl5/site_perl/Vcf.pm line 172, <$__ANONIO__> line 9640.
> > >     Vcf::throw(Vcf4_0=HASH(0x7b11a8), "Wrong number of values in LANMLR17B064:250550525/PL at 1:2017"...) called at /LUSTRE/storage/data/software/vcftools/lib/perl5/site_perl/Vcf.pm line 1767
> > >     VcfReader::parse_AGtags(Vcf4_0=HASH(0x7b11a8), HASH(0xb37ed0)) called at /LUSTRE/storage/data/software/vcftools/bin/vcf-merge line 464
> > >     main::merge_vcf_files(HASH(0xa4e960)) called at /LUSTRE/storage/data/software/vcftools/bin/vcf-merge line 12
> > >
> > > When I run vcf-validator on one of my files I get the following:
> > >
> > > vcf-validator -u PTxB73BC1S5RIL_0.1_0.9_56ind_V3_sorted.vcf.gz
> > > Leading or trailing space in attr_key-attr_value pairs is discouraged:
> > >     [Population] [ ]
> > >     SAMPLE=<ID=LANMLR17B002:250550514,Barcode=CTCGA,BarcodeWell=A04,Col=4,DNASample=LANMLR17B002,DNA_Plate=PTXB73_RIL,Enzyme=ApeKI,Flowcell=C81KJANXX,Flowcell_Lane=C81KJANXX_6,FullSampleName=LANMLR17B002:C81KJANXX:6:250550514,Genus=Zea,Lane=6,LibraryPlate=PTXB73_RIL,LibraryPlateID=450024469,LibraryPrepID=250550514,Population= ,Row=A,SampleDNA_Well=A04,Species=mays,Status=private>
> > > 1:1785738 .. Could not parse the allele(s) [N], first base does not match the reference.
> > > 2:17275732 .. REF allele listed in the ALT field??
> > >
> > > ------------------------
> > > Summary:
> > >     3337 errors total
> > >
> > >     3168 .. 1:1785738 .. Could not parse the allele(s) [N], first base does not match the reference.
> > >       55 .. Leading or trailing space in attr_key-attr_value pairs is discouraged:
> > >       55 ..     [Population] [ ]
> > >       55 ..     SAMPLE=<ID=LANMLR17B002:250550514,Barcode=CTCGA,BarcodeWell=A04,Col=4,DNASample=LANMLR17B002,DNA_Plate=PTXB73_RIL,Enzyme=ApeKI,Flowcell=C81KJANXX,Flowcell_Lane=C81KJANXX_6,FullSampleName=LANMLR17B002:C81KJANXX:6:250550514,Genus=Zea,Lane=6,LibraryPlate=PTXB73_RIL,LibraryPlateID=450024469,LibraryPrepID=250550514,Population= ,Row=A,SampleDNA_Well=A04,Species=mays,Status=private>
> > >        4 .. 2:17275732 .. REF allele listed in the ALT field??
> > >
> > > and on the other:
> > >
> > > vcf-validator -u PTxB73F1_filtrado_sorted.vcf_bgzip.gz
> > > INFO field at 1:57749972 .. Could not validate the float [NaN]
> > >
> > > ------------------------
> > > Summary:
> > >     29 errors total
> > >
> > >     29 .. INFO field at 1:57749972 .. Could not validate the float [NaN]
> > >
> > > Could anyone tell me what is happening?
> > >
> > > Best Wishes,
> > >
> > > --
> > > Eric
> > >
> > > ------------------------------------------------------------------------------
> > > Check out the vibrant tech community on one of the world's most
> > > engaging tech sites, SlashDot.org! http://sdm.link/slashdot
> > > _______________________________________________
> > > Vcftools-help mailing list Vcf...@li...
> > > https://lists.sourceforge.net/lists/listinfo/vcftools-help
> >
> > --
> > The Wellcome Trust Sanger Institute is operated by Genome Research
> > Limited, a charity registered in England with number 1021457 and a
> > company registered in England with number 2742969, whose registered
> > office is 215 Euston Road, London, NW1 2BE.
> >
> > --
> > Eric

--
Eric
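
For readers following this thread: the "Expected 10 values for diploid genotypes or 4 for haploid genotypes" line in the vcf-merge error quoted above matches the usual counting rule for per-genotype fields such as PL (declared as Number=G). With N alleles (REF included) and ploidy P there are C(N + P - 1, P) possible genotypes, so a record with nAlleles=4 should carry 10 PL values for a diploid call and 4 for a haploid one, not 3. The sketch below only illustrates that arithmetic; the function names `expected_genotype_values` and `pl_count_is_valid` are made up for this example and are not part of vcftools or bcftools.

```python
from math import comb  # available in Python 3.8+


def expected_genotype_values(n_alleles: int, ploidy: int = 2) -> int:
    """Number of values a per-genotype (Number=G) field should carry.

    n_alleles counts the REF allele plus all ALT alleles.
    """
    if n_alleles < 1 or ploidy < 1:
        raise ValueError("n_alleles and ploidy must both be at least 1")
    # Genotypes are unordered selections of `ploidy` alleles with repetition.
    return comb(n_alleles + ploidy - 1, ploidy)


def pl_count_is_valid(pl_field: str, n_alleles: int) -> bool:
    """Check a comma-separated PL string against haploid and diploid counts.

    Hypothetical helper: a missing value "." is accepted as-is.
    """
    if pl_field == ".":
        return True
    n_values = len(pl_field.split(","))
    allowed = {
        expected_genotype_values(n_alleles, ploidy=1),
        expected_genotype_values(n_alleles, ploidy=2),
    }
    return n_values in allowed


if __name__ == "__main__":
    # The case from the error message: nAlleles=4, nValues=3.
    print(expected_genotype_values(4, ploidy=2))  # 10
    print(expected_genotype_values(4, ploidy=1))  # 4
    print(pl_count_is_valid("0,30,255", n_alleles=4))  # False
```

A record that fails this kind of check is what both vcf-merge and bcftools merge refuse to combine, which is why removing PL (and GQ) with `bcftools annotate -x` before merging sidesteps the problem.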