From: Petr D. <pd...@sa...> - 2011-04-08 07:24:03
Hi Mao,

please take a look at the VCF specification; both tags are described
there:
http://vcftools.sourceforge.net/specs.html

Best,
Petr

On Thu, 2011-04-07 at 17:37 +0200, Mao Jianfeng wrote:
> Dear Petr,
>
> Thanks a lot for your kind reply.
>
> I will try them out. But please see the following.
>
> 2011/4/7 Petr Danecek <pd...@sa...>:
> > Hi Mao,
> >
> > if you want to use existing tools, the script fill-an-ac can calculate
> > the value of AN and AC tags. Then you could get the stats using
> > vcf-stats -f INFO/AN,INFO/AC.
>
> I have no idea what the AN and AC tags are, and I have not found enough
> information about them. I want to understand them better. Could you
> please explain them, or point out where I can find them described in
> detail?
>
> I am new to genomics, and I have no colleagues who use VCF and
> VCFtools, so my progress depends on the mailing list.
>
> Thanks in advance.
>
> > For the second task I'd create one VCF file per population using
> > vcf-subset and then vcf-compare to get the numbers you want.
> >
> > Hope this helps,
> > Petr
> >
> > On Thu, 2011-04-07 at 11:42 +0200, Mao Jianfeng wrote:
> >> Dear vcftools-listers,
> >>
> >> The VCF format and VCFtools have eased my NGS genomics work. Thanks
> >> a lot for creating them.
> >>
> >> I am facing a problem when I attempt to calculate the frequency
> >> distribution of different types of variations in accessions and
> >> populations. I am new to genomics and not good with computers. I
> >> would like to hear advice/directions from you.
> >>
> >> I have NGS genomic data from 100 individuals from 10 geographic
> >> regions. The data (SNPs, indels) are in VCF format. I have two
> >> questions.
> >>
> >> (1) What is the frequency distribution of genomic variants (SNPs,
> >> indels) over different numbers of individuals? For example, in my
> >> case, I want to count how many SNPs/indels occur in 10 individuals,
> >> 20 individuals, 30, 40, 50, 60, 70, 80, 90, 100 (the whole data).
> >>
> >> (2) What is the frequency distribution of genomic variants (SNPs,
> >> indels) over different numbers of populations? For example, in my
> >> case, I want to count how many SNPs/indels occur in 2 populations,
> >> 4 populations, 6, 8, 10 (the whole data).
> >>
> >> I checked the manual for VCFtools, but I did not find functionality
> >> for these two jobs. Have any of you faced the same problems? What
> >> is your strategy for that? Could you please share any hints with me?
> >> I think your advice would be valuable for all VCF users.
> >>
> >> Thanks a lot in advance.
> >>
> >> _______________________________________________
> >> Vcftools-help mailing list
> >> Vcf...@li...
> >> https://lists.sourceforge.net/lists/listinfo/vcftools-help
> >
> > --
> > The Wellcome Trust Sanger Institute is operated by Genome Research
> > Limited, a charity registered in England with number 1021457 and a
> > company registered in England with number 2742969, whose registered
> > office is 215 Euston Road, London, NW1 2BE.
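
For readers unfamiliar with the two tags discussed in this thread: in the VCF specification, AN is the total number of alleles in called genotypes at a site, and AC is the allele count in genotypes for each ALT allele, in the order the ALT alleles are listed. The following is only a minimal Python sketch of those definitions, applied to question (1) by also counting how many individuals carry at least one non-reference allele at each site. It is not how fill-an-ac or vcf-stats are implemented; the names allele_counts, carrier_histogram, and example are hypothetical, and the sketch assumes tab-separated, uncompressed VCF data lines with a GT entry in the FORMAT column.

```python
import re
from collections import Counter


def allele_counts(vcf_line):
    """Return (AN, AC per ALT allele, number of carriers) for one VCF data line.

    Hypothetical helper: assumes tab-separated columns and a GT key in FORMAT.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    alts = fields[4].split(",")            # ALT column, one entry per ALT allele
    gt_idx = fields[8].split(":").index("GT")
    ac = [0] * len(alts)                   # AC: one count per ALT allele
    an = 0                                 # AN: alleles in called genotypes
    carriers = 0                           # individuals with >= 1 non-ref allele
    for sample in fields[9:]:
        gt = sample.split(":")[gt_idx]
        # Split on phased (|) or unphased (/) separators; "." means not called.
        alleles = [a for a in re.split(r"[/|]", gt) if a != "."]
        an += len(alleles)
        for a in alleles:
            if a != "0":
                ac[int(a) - 1] += 1
        if any(a != "0" for a in alleles):
            carriers += 1
    return an, ac, carriers


def carrier_histogram(lines):
    """Count sites by the number of individuals carrying a non-ref allele."""
    hist = Counter()
    for line in lines:
        if line.startswith("#"):           # skip meta-information and header
            continue
        hist[allele_counts(line)[2]] += 1
    return hist


# Made-up three-sample example, for illustration only.
example = [
    "##fileformat=VCFv4.0",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3",
    "1\t100\t.\tA\tG\t.\t.\t.\tGT\t0/1\t1/1\t./.",
    "1\t200\t.\tC\tT,G\t.\t.\t.\tGT:DP\t0|2:5\t0/0:7\t0/1:3",
]

for line in example[2:]:
    print(allele_counts(line))
print(dict(carrier_histogram(example)))
```

The histogram keys are exact carrier counts per site; grouping them into bins such as 10, 20, ..., 100 individuals would be a further step on top of this. For real data, the tools named in the thread (fill-an-ac, vcf-stats, vcf-subset, vcf-compare) remain the route Petr suggests.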