From: Billie G. <bil...@ut...> - 2014-03-18 16:24:21
Hello all,

I have a hopefully not too naive question about missing data and depth filtering with vcftools.

When I filter my VCF file so that it has no missing data for any individual (using the option --geno 1), my understanding is that this should eliminate any marker with low or no read coverage in any one individual.

Command:

    vcftools --vcf good_variants_final_autosomes.recode.vcf --maf 0.0625 --geno 1 --out Variants_screened --recode

But when I then calculate the per-site total depth for each marker in the resulting file (using the --site-depth output option):

    vcftools --vcf Variants_screened.recode.vcf --site-depth

I find that the minimum site coverage value (total across all individuals) is 1. This means there are one or more variant sites in the VCF file that are covered by a total of just 1 read across all individuals (n=32).

How is this possible? Shouldn't the first missing-data filter have eliminated these variants from the file?

Any advice much appreciated!
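
To make the comparison concrete, here is a minimal sketch of how the two quantities in question could be checked side by side for a single site. It rests on assumptions that are not confirmed anywhere in this thread: that a genotype counts as missing only when every allele in the GT field is ".", and that total site depth is the sum of the per-sample DP values in the FORMAT column. The function name `site_summary` and the example VCF lines are hypothetical and are not part of vcftools.

```python
def site_summary(vcf_line):
    """Return (n_missing_genotypes, total_depth) for one VCF data line.

    Assumptions (hypothetical, for illustration only):
      - a genotype is 'missing' when every allele in GT is '.'
      - total depth is the sum of the per-sample FORMAT/DP values
    """
    fields = vcf_line.rstrip("\n").split("\t")
    format_keys = fields[8].split(":")          # e.g. ["GT", "DP"]
    gt_idx = format_keys.index("GT")
    dp_idx = format_keys.index("DP") if "DP" in format_keys else None

    n_missing = 0
    total_depth = 0
    for sample in fields[9:]:                   # per-sample columns
        values = sample.split(":")
        alleles = values[gt_idx].replace("|", "/").split("/")
        if all(allele == "." for allele in alleles):
            n_missing += 1
        # Only add DP when the field is present and numeric.
        if dp_idx is not None and dp_idx < len(values) and values[dp_idx].isdigit():
            total_depth += int(values[dp_idx])
    return n_missing, total_depth


# Hypothetical site: every sample has a called genotype, yet only one read in total.
called_low_depth = "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:DP\t0/0:0\t0/1:1\t0/0:0"
# Hypothetical site: one sample has a missing genotype.
one_missing = "chr1\t200\t.\tC\tT\t50\tPASS\t.\tGT:DP\t./.:0\t0/1:7\t1/1:5"

print(site_summary(called_low_depth))
print(site_summary(one_missing))
```

Under those assumptions, a site can have zero missing genotypes while its summed depth is very low, so running a check like this on the low-depth sites in the filtered file would show whether the genotypes there are actually recorded as called.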