|
From: Rebecca W. <We...@ca...> - 2018-04-23 12:37:56
|
Hi,

I would like to identify all of the intersecting SNPs in .vcf files generated by snippy (https://github.com/tseemann/snippy). There are 71 .vcf files, generated by mapping reads onto a P. aeruginosa PA14 reference genome. I am trying to use vcftools vcf-isec to identify the intersecting SNPs. The .vcf files have been compressed with bgzip and indexed with tabix to give .vcf.gz and .vcf.gz.tbi files.

When I run vcf-isec on just two of the files as a test, I get the following error message:

$ vcf-isec -n +2 1_snps.vcf.gz 2_snps.vcf.gz | bgzip -c > isec.vcf.gz
Leading or trailing space in attr_key-attr_value pairs is discouraged:
[Description] [Functional annotations: 'Allele | Annotation | Annotation_Impact | Gene_Name | Gene_ID | Feature_Type | Feature_ID | Transcript_BioType | Rank | HGVS.c | HGVS.p | cDNA.pos / cDNA.length | CDS.pos / CDS.length | AA.pos / AA.length | Distance | ERRORS / WARNINGS / INFO' ]
INFO=<ID=ANN,Number=.,Type=String,Description="Functional annotations: 'Allele | Annotation | Annotation_Impact | Gene_Name | Gene_ID | Feature_Type | Feature_ID | Transcript_BioType | Rank | HGVS.c | HGVS.p | cDNA.pos / cDNA.length | CDS.pos / CDS.length | AA.pos / AA.length | Distance | ERRORS / WARNINGS / INFO' ">
at /usr/share/perl5/Vcf.pm line 180.
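One reading of this message is that it points at the space between the closing single quote and the closing double quote of the ANN Description value. A minimal Python sketch of removing that padding from header lines follows. It assumes the trailing space is the only thing being objected to, which has not been confirmed here, and the function name strip_description_padding is hypothetical, not part of vcftools or snippy.

```python
import re

# Matches Description="<padding>text<padding>" and captures the text alone.
# Assumption: the Description value contains no embedded double quotes.
_DESC = re.compile(r'Description="\s*(.*?)\s*"')


def strip_description_padding(line: str) -> str:
    """Return a VCF meta line with spaces next to the Description quotes removed.

    Lines that do not start with '##' (column header and data lines) are
    returned unchanged.
    """
    if not line.startswith("##"):
        return line
    return _DESC.sub(lambda m: 'Description="' + m.group(1) + '"', line)


if __name__ == "__main__":
    # Shortened version of the ANN header line from the message above.
    header = (
        "##INFO=<ID=ANN,Number=.,Type=String,"
        "Description=\"Functional annotations: 'Allele | Annotation' \">"
    )
    print(strip_description_padding(header))
```

This would be applied to the uncompressed .vcf, line by line, before running bgzip and tabix again.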
If I run vcf-validator on one of the files, I get the following error message:

$ vcf-validator 1_snps.vcf.gz
Leading or trailing space in attr_key-attr_value pairs is discouraged:
[Description] [Functional annotations: 'Allele | Annotation | Annotation_Impact | Gene_Name | Gene_ID | Feature_Type | Feature_ID | Transcript_BioType | Rank | HGVS.c | HGVS.p | cDNA.pos / cDNA.length | CDS.pos / CDS.length | AA.pos / AA.length | Distance | ERRORS / WARNINGS / INFO' ]
INFO=<ID=ANN,Number=.,Type=String,Description="Functional annotations: 'Allele | Annotation | Annotation_Impact | Gene_Name | Gene_ID | Feature_Type | Feature_ID | Transcript_BioType | Rank | HGVS.c | HGVS.p | cDNA.pos / cDNA.length | CDS.pos / CDS.length | AA.pos / AA.length | Distance | ERRORS / WARNINGS / INFO' ">

However, I can run a check on the file before it is compressed with bgzip, and it works, with the following output:

$ vcftools --vcf 1_snps.vcf

VCFtools - v0.1.13
(C) Adam Auton and Anthony Marcketta 2009

Parameters as interpreted:
	--vcf Tsb_1_snps.vcf

After filtering, kept 1 out of 1 Individuals
After filtering, kept 40 out of a possible 40 Sites
Run Time = 0.00 seconds

Do you have any suggestions about the error messages I am getting with vcf-isec or vcf-validator? Any help would be appreciated.

Thanks.
Beky |