From: Jing Z. <jin...@gm...> - 2018-03-30 16:39:56
Hi,

I am trying to extract SNPs in LD (r2 > 0.8) with a list of input SNPs, based on the EUR population from the 1kg phase 3 VCF files. I have tried several approaches and run into several problems. Could you please help? Thank you very much.

1. I extracted the EUR samples and ran the following command:

   vcftools --vcf f.vcf --ld-window-bp 200000 --hap-r2-positions positions.list --min-r2 0.8 --out prefix

   The positions list file looks like this:

   1 1287055
   1 7917076
   1 8095500

   However, for most of the input SNPs no LD SNPs are found. I ran this chromosome by chromosome; for the chromosomes that did return results, the output contains LD SNPs for only one of the original SNPs. In contrast, other web-based tools return quite a lot of LD SNPs.

2. I also tried PLINK. However, the 1kg phase 3 data contain multiallelic positions, which occupy more than one line in the VCF files. I tried "--min-alleles 2 --max-alleles 2" to remove them, but it does not work, and PLINK always gives me an error because of these duplicates. What would you suggest in this case?

Thank you very much!

--
Jing Zhang
---------------------------------------------------------------
Associate Research Scientist,
Molecular Biophysics & Biochemistry,
Computational Biology and Bioinformatics,
Yale University
___________________________________
Email: jin...@gm..., j....@ya...