From: Petr D. <pd...@sa...> - 2011-04-07 11:21:07
Hi Mao,

if you want to use existing tools, the script fill-an-ac can calculate the
value of AN and AC tags. Then you could get the stats using
vcf-stats -f INFO/AN,INFO/AC.

For the second task I'd create one VCF file per population using vcf-subset
and then vcf-compare to get the numbers you want.

Hope this helps,
Petr

On Thu, 2011-04-07 at 11:42 +0200, Mao Jianfeng wrote:
> Dear vcftools-listers,
>
> VCF format and VCFTOOLS eased my work of NGS genomics. Thanks a lot
> for its generation.
>
> I am facing problem, when I attempt to calculate the frequency
> distribution of different types of variations in accessions and
> populations. I am new to genomics and not good at computer. I would
> like to hear advice/directions from you.
>
> I have NGS genomic data from 100 individuals of 10 geographic regions.
> The data (snp, indels) have been in VCF format. Now, I asked two
> problems.
>
> (1) what is the frequency distribution of genomic variants (snp,
> indel) in the different numbers of individuals? For example, in my
> case, I want to count how many snp/indel occurred in 10 individuals,
> 20 individuals, 30, 40 ,50 ,60, 70, 80, 90, 100 (the whole data).
>
> (2) what is the frequency distribution of genomic variants (snp,
> indel) in the different numbers of populations? For example, in my
> case, I want to count how many snp/indel occurred in 2 populations, 4
> populations, 6, 8, 10 (the whole data).
>
> I check the manu for VCFTOOLS, but I have not find functionality for
> these two jobs. Have you, any listers, faced the same problems? What
> is your strategy for that? Could you please share any hints with me?
> And, I think you advice should be valuable for all the VCF users.
>
> Thanks a lot in advance.
-- 
The Wellcome Trust Sanger Institute is operated by Genome Research
Limited, a charity registered in England with number 1021457 and a
company registered in England with number 2742969, whose registered
office is 215 Euston Road, London, NW1 2BE.
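
For readers who want to see what the two counts in the question amount to, here is a minimal Python sketch. It is not how fill-an-ac, vcf-stats, vcf-subset or vcf-compare work internally; it only illustrates the same idea on a tiny inline VCF. The sample names, the sample-to-population map, the toy records and the function names are all made up for illustration. Note also that it counts carrier individuals per site, whereas the AC tag counts alternate allele copies (a homozygous individual contributes 2 to AC), and that the SNP/indel split is a simple length-based assumption.

```python
from collections import Counter

# Hypothetical toy VCF: four samples, tab-separated, GT as the only FORMAT field.
VCF_TEXT = """\
##fileformat=VCFv4.0
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3\tS4
1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t0/1\t0/0\t1/1\t0/0
1\t200\t.\tC\tT\t50\tPASS\t.\tGT\t0/0\t0/1\t0/0\t0/0
1\t300\t.\tGA\tG\t50\tPASS\t.\tGT\t0/1\t0/1\t0/1\t1/1
1\t400\t.\tT\tC\t50\tPASS\t.\tGT\t./.\t0/0\t0/0\t0/0
"""

# Hypothetical assignment of samples to populations (geographic regions).
POPULATIONS = {"S1": "popA", "S2": "popA", "S3": "popB", "S4": "popB"}


def carries_alt(sample_field):
    """True if the genotype holds at least one non-reference, non-missing allele."""
    genotype = sample_field.split(":")[0]  # GT is the first FORMAT subfield
    alleles = genotype.replace("|", "/").split("/")
    return any(allele not in ("0", ".") for allele in alleles)


def variant_type(ref, alt):
    """Length-based simplification: every allele one base long means SNP."""
    if len(ref) == 1 and all(len(a) == 1 for a in alt.split(",")):
        return "snp"
    return "indel"


def sharing_distributions(vcf_text, sample_to_pop):
    """Count sites by (type, number of carrier individuals) and
    by (type, number of populations with at least one carrier)."""
    by_individuals = Counter()
    by_populations = Counter()
    samples = []
    for line in vcf_text.splitlines():
        if not line.strip() or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        kind = variant_type(fields[3], fields[4])
        carriers = [s for s, f in zip(samples, fields[9:]) if carries_alt(f)]
        pops = {sample_to_pop[s] for s in carriers}
        by_individuals[(kind, len(carriers))] += 1
        by_populations[(kind, len(pops))] += 1
    return by_individuals, by_populations


by_ind, by_pop = sharing_distributions(VCF_TEXT, POPULATIONS)
for (kind, n), sites in sorted(by_ind.items()):
    print(f"{kind}: {sites} site(s) carried by {n} individual(s)")
for (kind, n), sites in sorted(by_pop.items()):
    print(f"{kind}: {sites} site(s) present in {n} population(s)")
```

To get the bins asked about in the question (10, 20, ... individuals), the carrier counts could be grouped into ranges before tallying; the sketch keeps exact counts so the logic stays easy to check.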