From: DAVID V. <dav...@st...> - 2019-09-16 09:57:41
Good morning vcftools users,
I am currently using vcftools to filter my vcf file and I have a question
regarding the --max-meanDP filtering option, as I think I am
misunderstanding something about it. Here is my problem:
I want to filter out SNPs with too high coverage, specifically I want to
exclude SNPs with mean depth values (over all included individuals) greater
than 1.5X the mean depth of my entire dataset. I think the --max-meanDP
option should do the job.
So, first of all I generate a report ("my_report.ldepth.mean") containing
the mean depth per site averaged across all individuals, using the option
--site-mean-depth. Using the information in the column "MEAN_DEPTH" I can
see that the mean depth of my SNPs is 36.06636. Therefore I will try to
exclude all SNPs with mean depth greater than 54.09954, by doing:
vcftools --vcf my_file.vcf --max-meanDP 54.1 --recode --out my_filtered_vcf
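
For reference, the threshold arithmetic described above can be sketched in Python. This is only a minimal sketch: it assumes the report is tab-separated with a header line that includes a "MEAN_DEPTH" column, and the function name and the sample rows are made up for illustration.

```python
import csv
import io

def depth_threshold(report_lines, factor=1.5):
    """Return (overall mean of the MEAN_DEPTH column, factor * that mean)."""
    reader = csv.DictReader(report_lines, delimiter="\t")
    depths = [float(row["MEAN_DEPTH"]) for row in reader]
    overall = sum(depths) / len(depths)
    return overall, factor * overall

# Tiny made-up report standing in for "my_report.ldepth.mean" (assumed layout).
sample = io.StringIO(
    "CHROM\tPOS\tMEAN_DEPTH\tVAR_DEPTH\n"
    "chr1\t100\t30.0\t4.0\n"
    "chr1\t200\t40.0\t9.0\n"
    "chr1\t300\t50.0\t1.0\n"
)
overall_mean, cutoff = depth_threshold(sample)
print(overall_mean, cutoff)
```

With the real report, the same function could be called on an open file handle, e.g. `with open("my_report.ldepth.mean") as fh: depth_threshold(fh)`.
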
Unfortunately it does not work, as my filtered vcf file will still contain
SNPs with average depth greater than 54.1.
Specifically, according to the report "my_report.ldepth.mean", there are
11,737 SNPs with mean depth greater than 54.1, but vcftools removes only 879
SNPs when I apply the code above. Additionally, I attach to this email two
histograms: one containing the mean depth distribution for all SNPs
("OV_All_SNPs_meanDepth_distribution") and one containing the same
information after I've applied the --max-meanDP filter
("OV_Filtered_SNPs_meanDepth_distribution"). As you can see there are still
some SNPs with depth greater than 54.1.
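
The count of SNPs above the cutoff can be obtained from the report along the following lines. Again this is just a sketch under assumptions: a tab-separated report with a "MEAN_DEPTH" column, a strict "greater than" comparison, and a hypothetical function name and made-up sample rows.

```python
import csv
import io

def count_above(report_lines, cutoff):
    """Count sites whose MEAN_DEPTH is strictly greater than cutoff."""
    reader = csv.DictReader(report_lines, delimiter="\t")
    return sum(1 for row in reader if float(row["MEAN_DEPTH"]) > cutoff)

# Made-up rows; only the last two exceed the 54.1 cutoff.
sample_report = io.StringIO(
    "CHROM\tPOS\tMEAN_DEPTH\tVAR_DEPTH\n"
    "chr1\t100\t30.0\t4.0\n"
    "chr1\t200\t54.1\t9.0\n"
    "chr1\t300\t60.5\t1.0\n"
    "chr1\t400\t70.0\t2.0\n"
)
n_above = count_above(sample_report, 54.1)
print(n_above)
```

Running the same count on a report generated from the filtered vcf file would show how many sites above the cutoff remain after filtering.
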
Why is this happening? What is the difference between the "mean depth per
site averaged across all individuals" calculated by the option
--site-mean-depth (that generates the report) and the option --max-meanDP
(that filters the vcf file)?
Many thanks for your support,
All the best,
David
--
David Vendrami
CACHE Marie Curie ITN fellow
Department of Animal Behaviour
Molecular Ecology Group
University of Bielefeld
Postfach 100131
33501 Bielefeld
Germany
skype: david.vendrami1
http://www.thehoffmanlab.com
<http://www.uni-bielefeld.de/biologie/animalbehaviour/hoffman/publications.html>