From: Adam A. <ada...@gm...> - 2012-07-13 11:53:29
This is expected behavior. You need to add --recode-INFO-all to retain the INFO field.

Adam

Sent from my iPhone

On Jul 13, 2012, at 6:13 AM, David Jones <dr...@sa...> wrote:

> Morning all,
>
> I just want to check before I submit an official bug report that this is actually a bug not an expected behaviour and I've missed something:
>
> I have a vcf file with 2 samples, where the INFO, ID, and FILTER fields are populated with two samples as shown below.
>
> <snip>
>
> ##FILTER=<ID=UM,Description="description">
> #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT NORMAL TUMOUR
> 1 16257 ID_1 G C . UM;MQ;HSD DP=67;GP=7.0e-03;MP=9.9e-01;SG=CG/GG;SP=7.0e-03;TG=GG/CG;TP=9.9e-01 GT:AA:CA:GA:TA:PM 0|0:0:1:28:0:3.4e-02 0|1:0:7:31:0:1.8e-01
> 1 20136 ID_2 T C . UM;MQ;HSD DP=92;GP=1.6e-01;MP=8.4e-01;SG=CT/TT;SP=1.6e-01;TG=TT/CT;TP=8.4e-01 GT:AA:CA:GA:TA:PM 0|0:0:1:0:27:3.6e-02 0|1:0:8:0:56:1.2e-01
> 1 57999 ID_3 G T . UM;MN DP=25;GP=4.0e-03;MP=9.9e-01;SG=GG/TT;SP=7.3e-02;TG=GG/GT;TP=9.1e-01 GT:AA:CA:GA:TA:PM 0|0:0:0:15:0:0.0e+00 0|1:0:0:7:3:3.0e-01
> 1 61219 ID_4 T C . UM;MQ DP=18;GP=1.7e-01;MP=8.3e-01;SG=TT/CC;SP=3.2e-01;TG=TT/CT;TP=5.1e-01 GT:AA:CA:GA:TA:PM 0|0:0:0:0:9:0.0e+00 0|1:0:4:0:5:4.4e-01
> 1 62578 ID_5 G A . UM;MN;MQ DP=56;GP=2.4e-03;MP=1.0e+00;SG=GG/AAG;SP=3.1e-01;TG=GG/AGG;TP=6.8e-01 GT:AA:CA:GA:TA:PM 0|0:2:0:35:0:5.4e-02 0|1:6:0:13:0:3.2e-01
> 1 73841 ID_6 C T . UM DP=28;GP=2.6e-04;MP=8.1e-01;SG=CC/CTT;SP=3.6e-01;TG=CC/CCT;TP=4.3e-01 GT:AA:CA:GA:TA:PM 0|0:0:19:0:0:0.0e+00 0|1:0:6:0:3:3.3e-01
> 1 84020 ID_7 A G . SR;RP DP=27;GP=3.2e-04;MP=1.0e+00;SG=AA/GGG;SP=2.2e-01;TG=AA/AGG;TP=6.1e-01 GT:AA:CA:GA:TA:PM 0|0:19:0:0:0:0.0e+00 0|1:4:0:4:0:5.0e-01
> 1 84022 ID_8 G A . SR;RP DP=29;GP=8.5e-05;MP=1.0e+00;SG=GG/AAA;SP=2.2e-01;TG=GG/AAG;TP=6.1e-01 GT:AA:CA:GA:TA:PM 0|0:0:0:21:0:0.0e+00 0|1:4:0:4:0:5.0e-01
> 1 84024 ID_9 A G . SR;UM;RP DP=30;GP=4.3e-05;MP=1.0e+00;SG=AA/GGG;SP=2.2e-01;TG=AA/AGG;TP=6.1e-01 GT:AA:CA:GA:TA:PM 0|0:22:0:0:0:0.0e+00 0|1:4:0:4:0:5.0e-01
> 1 84026 ID_10 G A . SR;UM;MN;RP DP=32;GP=1.8e-02;MP=9.8e-01;SG=GG/AAA;SP=3.2e-01;TG=GG/AAG;TP=5.8e-01 GT:AA:CA:GA:TA:PM 0|0:1:0:22:0:4.3e-02 0|1:5:0:4:0:5.6e-01
>
> </snip>
>
> when I run the command:
>
> vcftools --vcf input.vcf --out testOut --recode --remove-filtered-all
>
> I seem to get, as expected, a vcf file named testOut.recode.vcf with only those positions that have PASS in the filter field. HOWEVER the data seems to have lost the INFO field and it has been replaced with a '.' (see below).
>
> <snip>
>
> ##FILTER=<ID=UM,Description="description">
> #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT NORMAL TUMOUR
> 1 129151 ID_14 T G . PASS . GT:AA:CA:GA:TA:PM 0|0:0:0:0:13:0.0e+00 0|1:0:0:3:9:2.5e-01
> 1 1666175 ID_119 C T . PASS . GT:AA:CA:GA:TA:PM 0|0:0:15:0:0:0.0e+00 0|1:0:20:0:4:1.7e-01
> 1 1857336 ID_124 G A . PASS . GT:AA:CA:GA:TA:PM 0|0:0:0:20:0:0.0e+00 0|1:6:0:15:0:2.9e-01
> 1 2329409 ID_130 A G . PASS . GT:AA:CA:GA:TA:PM 0|0:41:0:0:0:0.0e+00 0|1:40:0:16:0:2.9e-01
> 1 2391122 ID_131 G T . PASS . GT:AA:CA:GA:TA:PM 0|0:0:0:10:0:0.0e+00 0|1:0:0:12:3:2.0e-01
> 1 2620133 ID_204 C G . PASS . GT:AA:CA:GA:TA:PM 0|0:0:26:0:0:0.0e+00 0|1:0:20:6:0:2.3e-01
> 1 2628561 ID_244 C T . PASS . GT:AA:CA:GA:TA:PM 0|0:0:116:0:7:5.7e-02 0|1:0:146:0:15:9.3e-02
> 1 2628576 ID_245 G T . PASS . GT:AA:CA:GA:TA:PM 0|0:3:1:77:4:4.7e-02 0|1:4:1:104:9:7.6e-02
> 1 3093904 ID_264 T A . PASS . GT:AA:CA:GA:TA:PM 0|0:0:0:0:32:0.0e+00 0|1:16:0:0:29:3.6e-01
> 1 3802432 ID_271 T G . PASS . GT:AA:CA:GA:TA:PM 0|0:0:0:0:22:0.0e+00 0|1:0:0:8:26:2.4e-01
>
> </snip>
>
> Is this an expected behaviour or a bug?
>
> Dave
>
> --
> The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.
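
For reference, a minimal sketch of the command from the question with the suggested --recode-INFO-all flag added. The names input.vcf and testOut are taken from the original post. The guard that only prints the command when vcftools or input.vcf is not available is an assumption added for illustration, not part of the suggested fix.

```shell
#!/bin/sh
# Arguments from the original post, with --recode-INFO-all added so that
# the INFO column is kept in testOut.recode.vcf rather than replaced by '.'.
set -- --vcf input.vcf --out testOut \
       --recode --recode-INFO-all --remove-filtered-all

# Illustrative guard (assumption): run vcftools only if it is installed and
# input.vcf exists; otherwise just print the command that would be run.
if command -v vcftools >/dev/null 2>&1 && [ -f input.vcf ]; then
    vcftools "$@"
else
    echo "would run: vcftools $*"
fi
```

With --recode-INFO-all present, the PASS-only records in testOut.recode.vcf should keep their original INFO values.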