
#62 --window-pi Issues

v1.0_(example)
open
nobody
None
1
2017-04-03
2017-04-03
No

Hi there,

I am having a similar issue to the above with the --window-pi setting in VCFtools, where bins are being skipped. In this case, however, I am sure that there are relevant SNPs in the skipped windows.

My data set consists of individuals that have been sorted into different groups. In the first case, all individuals were merged into a file called All.merged.vcf. In the second case, VCF files from only the individuals at one specific geographic site were merged into a file called AM.merged.vcf. Note that AM.merged.vcf is a subset of All.merged.vcf.

My commands are as follows:

vcftools --vcf All.merged.vcf --window-pi 50000 --out Run1

vcftools --vcf AM.merged.vcf --window-pi 50000 --out Run2

However, when I look at the results, the All.merged.vcf result skips over the first million bases in the chromosome:

cat Run1.windowed.pi | grep "ChrX" | head
Bm_v4_ChrX_scaffold_001 1050001 1100000 3 3.06977e-05
Bm_v4_ChrX_scaffold_001 1700001 1750000 1 8.11839e-06
Bm_v4_ChrX_scaffold_001 1950001 2000000 1 9.19662e-06
Bm_v4_ChrX_scaffold_001 7800001 7850000 1 4.8203e-06
Bm_v4_ChrX_scaffold_001 10550001 10600000 2 2.04651e-05
Bm_v4_ChrX_scaffold_001 10800001 10850000 2 2.04651e-05

These bases are not, however, skipped in the AM.merged.vcf result:

cat Run2.windowed.pi | grep "ChrX" | head
Bm_v4_ChrX_scaffold_001 1 50000 1 5e-06
Bm_v4_ChrX_scaffold_001 50001 100000 1 1.07143e-05
Bm_v4_ChrX_scaffold_001 150001 200000 1 1.14286e-05
Bm_v4_ChrX_scaffold_001 200001 250000 1 1.14286e-05
Bm_v4_ChrX_scaffold_001 400001 450000 1 1.07143e-05
Bm_v4_ChrX_scaffold_001 450001 500000 1 5e-06

I can confirm that All.merged.vcf has valid SNPs in the first million bases (in fact, through other means I can confirm that the SNP density there is quite high). So I am wondering whether I am entering the command incorrectly, and why the tool might skip those windows in one case but not the other when the second file is a subset of the first.
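For reference, one way to double-check the per-window record counts independently of vcftools is sketched below. It is only a rough illustration: the function name count_sites_per_window is made up for this example, it assumes an uncompressed, tab-delimited VCF, and it counts every record on the chosen chromosome without reproducing whatever site filtering or the pi calculation vcftools applies internally. Windows are aligned the same way as in the .windowed.pi output above (1-50000, 50001-100000, and so on).

```shell
# Hypothetical helper (not part of vcftools): count VCF records per
# fixed-size window on one chromosome.
# Usage: count_sites_per_window FILE.vcf CHROM WINDOW_SIZE
count_sites_per_window() {
    awk -F'\t' -v chrom="$2" -v size="$3" '
        /^#/ { next }                                   # skip header lines
        $1 == chrom {
            # 1-based window start, e.g. 1, 50001, 100001 for size 50000
            start = int(($2 - 1) / size) * size + 1
            count[start]++
        }
        END {
            # print: chromosome, window start, window end, record count
            for (s in count)
                printf "%s\t%d\t%d\t%d\n", chrom, s, s + size - 1, count[s]
        }
    ' "$1" | sort -k2,2n                                # order windows by start position
}

# Example call (file and chromosome names taken from the post above):
#   count_sites_per_window All.merged.vcf Bm_v4_ChrX_scaffold_001 50000 | head
```

If windows below position 1050001 show non-zero counts here for All.merged.vcf but are absent from Run1.windowed.pi, that would suggest the sites are present in the file and are being dropped somewhere inside the --window-pi calculation rather than being missing from the input.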

I can provide the vcf-merge commands used to create those files, in case they are a cause of the issue, but other tools seem to work fine with the VCF files (specifically ldepth.mean).

Thank you for your help!

John

Discussion

