From: Eric G. <shi...@gm...> - 2016-10-13 18:14:20
Hi,

Using bcftools I get the following error:

Could not parse the header line: "##SAMPLE=<ID=LANMLR17B002:250550514,Barcode=CTCGA,BarcodeWell=A04,Col=4,DNASample=LANMLR17B002,DNA_Plate=PTXB73_RIL,Enzyme=ApeKI,Flowcell=C81KJANXX,Flowcell_Lane=C81KJANXX_6,FullSampleName=LANMLR17B002:C81KJANXX:6:250550514,Genus=Zea,Lane=6,LibraryPlate=PTXB73_RIL,LibraryPlateID=450024469,LibraryPrepID=250550514,Population= ,Row=A,SampleDNA_Well=A04,Species=mays,Status=private>"

Is this because of the bcftools version? I am using bcftools/1.2.

I also get the following warnings:

[W::bcf_hdr_check_sanity] PL should be declared as Number=G
Warning: trying to combine "GQ" tag definitions of different types
Warning: trying to combine "PL" tag definitions of different lengths
Warning: trying to combine "PL" tag definitions of different types

I called the SNPs in my two files with different programs, one with TASSEL and the other with GATK. Can I convert the INFO fields of these files? Or what do you think the solution could be? Should I call the SNPs with the same software?

2016-10-13 2:41 GMT-05:00 Petr Danecek <pd...@sa...>:

> Hi Eric,
>
> the error messages indicate that the VCF files on input have incorrect
> number of values for the PL field at the position 1:201777216. Please
> check vcf-validator and VCF specification
> http://samtools.github.io/hts-specs/VCFv4.3.pdf
>
> By the way, the perl side vcf-tools has been made obsolete by the new
> (and much faster) bcftools
> http://samtools.github.io/bcftools/
>
> Best wishes,
> Petr
>
> On Wed, 2016-10-12 at 16:59 -0500, Eric González wrote:
> > Hi
> >
> > I am using vcf-merge to join two VCF files; one contains the data of 56
> > individuals and the other of only one. One of my files has 524754
> > SNPs and the other 105878; however, when I join them I obtain only
> > ~50000 SNPs and the following error message:
> >
> > Wrong number of values in LANMLR17B064:250550525/PL at 1:201777216 .. nAlleles=4, nValues=3.
> > Expected 10 values for diploid genotypes or 4 for haploid genotypes.
> >
> > at /LUSTRE/storage/data/software/vcftools/lib/perl5/site_perl/Vcf.pm line 172, <$__ANONIO__> line 9640.
> > Vcf::throw(Vcf4_0=HASH(0x7b11a8), "Wrong number of values in LANMLR17B064:250550525/PL at 1:2017"...) called at /LUSTRE/storage/data/software/vcftools/lib/perl5/site_perl/Vcf.pm line 1767
> > VcfReader::parse_AGtags(Vcf4_0=HASH(0x7b11a8), HASH(0xb37ed0)) called at /LUSTRE/storage/data/software/vcftools/bin/vcf-merge line 464
> > main::merge_vcf_files(HASH(0xa4e960)) called at /LUSTRE/storage/data/software/vcftools/bin/vcf-merge line 12
> >
> > When I run vcf-validator on one of my files I get the following:
> >
> > vcf-validator -u PTxB73BC1S5RIL_0.1_0.9_56ind_V3_sorted.vcf.gz
> > Leading or trailing space in attr_key-attr_value pairs is discouraged:
> >     [Population] [ ]
> >     SAMPLE=<ID=LANMLR17B002:250550514,Barcode=CTCGA,BarcodeWell=A04,Col=4,DNASample=LANMLR17B002,DNA_Plate=PTXB73_RIL,Enzyme=ApeKI,Flowcell=C81KJANXX,Flowcell_Lane=C81KJANXX_6,FullSampleName=LANMLR17B002:C81KJANXX:6:250550514,Genus=Zea,Lane=6,LibraryPlate=PTXB73_RIL,LibraryPlateID=450024469,LibraryPrepID=250550514,Population= ,Row=A,SampleDNA_Well=A04,Species=mays,Status=private>
> > 1:1785738 .. Could not parse the allele(s) [N], first base does not match the reference.
> > 2:17275732 .. REF allele listed in the ALT field??
> >
> > ------------------------
> > Summary:
> >     3337 errors total
> >
> >     3168 .. 1:1785738 .. Could not parse the allele(s) [N], first base does not match the reference.
> >     55 .. Leading or trailing space in attr_key-attr_value pairs is discouraged:
> >     55 .. [Population] [ ]
> >     55 .. SAMPLE=<ID=LANMLR17B002:250550514,Barcode=CTCGA,BarcodeWell=A04,Col=4,DNASample=LANMLR17B002,DNA_Plate=PTXB73_RIL,Enzyme=ApeKI,Flowcell=C81KJANXX,Flowcell_Lane=C81KJANXX_6,FullSampleName=LANMLR17B002:C81KJANXX:6:250550514,Genus=Zea,Lane=6,LibraryPlate=PTXB73_RIL,LibraryPlateID=450024469,LibraryPrepID=250550514,Population= ,Row=A,SampleDNA_Well=A04,Species=mays,Status=private>
> >     4 .. 2:17275732 .. REF allele listed in the ALT field??
> >
> > and on the other:
> >
> > vcf-validator -u PTxB73F1_filtrado_sorted.vcf_bgzip.gz
> > INFO field at 1:57749972 .. Could not validate the float [NaN]
> >
> > ------------------------
> > Summary:
> >     29 errors total
> >
> >     29 .. INFO field at 1:57749972 .. Could not validate the float [NaN]
> >
> > Could anyone tell me what is happening?
> >
> > Best Wishes,
> >
> > --
> > Eric
> > _______________________________________________
> > Vcftools-help mailing list
> > Vcf...@li...
> > https://lists.sourceforge.net/lists/listinfo/vcftools-help

--
Eric
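For readers following the thread: both the vcf-merge error ("Expected 10 values for diploid genotypes or 4 for haploid genotypes") and the bcftools warning ("PL should be declared as Number=G") refer to the same rule, namely that a Number=G field carries one value per possible genotype. The Python sketch below only illustrates that arithmetic. The function names are hypothetical and are not part of vcftools or bcftools, and it assumes the allele count includes REF, as the nAlleles=4 in the error message appears to.

```python
from math import comb


def expected_genotype_values(n_alleles: int, ploidy: int = 2) -> int:
    """Number of values a Number=G field should carry for one sample.

    n_alleles counts REF plus all ALT alleles (assumption, see above).
    """
    # Unordered genotypes are multisets of size `ploidy` drawn from
    # `n_alleles` alleles, so the count is C(n_alleles + ploidy - 1, ploidy).
    return comb(n_alleles + ploidy - 1, ploidy)


def pl_count_ok(pl_field: str, n_alleles: int, ploidy: int = 2) -> bool:
    """Check a comma-separated PL string against the expected count."""
    if pl_field == ".":  # a missing value is allowed
        return True
    n_values = len(pl_field.split(","))
    return n_values == expected_genotype_values(n_alleles, ploidy)


# The site from the error message: 4 alleles, but only 3 PL values.
print(expected_genotype_values(4, ploidy=2))  # diploid
print(expected_genotype_values(4, ploidy=1))  # haploid
print(pl_count_ok("0,30,255", n_alleles=4))
```

With three PL values, a record is only consistent with a biallelic diploid site, which suggests the PL values at 1:201777216 were written for fewer alleles than the record lists.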