Summarize and plot (auto-detecting variable type) all or specified variables in a VCF file (except the CHROM, POS and ID fields, which I consider unnecessary).
Extract arbitrary fixed fields or sample-field values to a TAB-delimited file.
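To make the TAB-delimited extraction of sample-field values concrete, here is a minimal sketch in POSIX shell and awk. It is not taken from vcfsummarize.sh: the helper name extract_format, the output layout (CHROM, POS, sample name, then the requested values) and the tiny example file are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper, not part of vcfsummarize.sh.
# $1 = VCF path, $2 = space-separated FORMAT keys, e.g. "GT DP"
extract_format() {
    awk -F '\t' -v keys="$2" '
        BEGIN { n = split(keys, want, " ") }
        /^##/ { next }                               # skip meta lines
        /^#CHROM/ {                                  # remember sample names
            for (s = 10; s <= NF; s++) name[s] = $s
            next
        }
        {
            split("", pos)                           # reset FORMAT key -> index map
            m = split($9, fmt, ":")                  # FORMAT is column 9
            for (i = 1; i <= m; i++) pos[fmt[i]] = i
            for (s = 10; s <= NF; s++) {             # one output row per sample
                split($s, v, ":")
                line = $1 "\t" $2 "\t" name[s]
                for (k = 1; k <= n; k++) {
                    p = (want[k] in pos) ? pos[want[k]] : 0
                    line = line "\t" ((p > 0 && (p in v)) ? v[p] : ".")
                }
                print line
            }
        }' "$1"
}

# Tiny made-up VCF with two samples, used only to exercise the helper.
tmp_vcf=$(mktemp)
printf '##fileformat=VCFv4.2\n' > "$tmp_vcf"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n' >> "$tmp_vcf"
printf 'chr1\t100\t.\tA\tG\t50\tPASS\tAN=4\tGT:AD:DP\t0/1:5,7:12\t1/1:0,9:9\n' >> "$tmp_vcf"
extract_format "$tmp_vcf" "GT DP"
rm -f "$tmp_vcf"
```

Each output row holds one record and one sample, and a missing value is written as a dot, which is one reasonable convention rather than necessarily the one the script uses.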
Simplest usage: vcfsummarize.sh -f filename.vcf -a  # extract and summarize all fields and subfields
For a large file: nohup vcfsummarize.sh -f filename.vcf -a &
Usage
-f [required] Takes 1 file: the target VCF file. Plain-text, gz and bz files are supported.
-a [optional] Takes no argument. If specified, extract and summarize all variables.
-q [optional] Takes no argument. If specified, skip the REF, ALT, QUAL and FILTER fields.
-c [optional] Takes 1 string. e.g. -c "chr1" limits the analysis to records whose CHROM field equals chr1.
-I [optional] Takes 1 string. e.g. -I "AN DB" extracts and summarizes the AN and DB subfields of the INFO field. Overrides option -a, which analyzes all INFO subfields.
-F [optional] Takes 1 string. e.g. -F "GT AD DP" extracts and summarizes the GT, AD and DP subfields of the sample columns. Overrides option -a, which analyzes all subfields of the sample columns.
-s [optional] Takes no argument. If specified, only extract the data; summarization and plotting are suppressed.
-o [optional] Takes 1 string: the output directory name. Default is vcfsummarize.
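As an illustration of what a -c "chr1" filter combined with an -I "AN DB" selection amounts to, the following sketch pulls INFO subfields into TAB-delimited columns. It is an assumption-laden example, not the script's own code: the helper name extract_info, the output columns, and the choice to write 1 for a present flag (such as DB) and a dot for a missing key are all hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not part of vcfsummarize.sh.
# $1 = VCF path, $2 = CHROM value to keep, $3 = space-separated INFO keys
extract_info() {
    awk -F '\t' -v chrom="$2" -v keys="$3" '
        BEGIN { n = split(keys, want, " ") }
        /^#/ { next }                          # skip meta and header lines
        $1 != chrom { next }                   # keep only the requested CHROM
        {
            split("", val)                     # clear values from the previous record
            m = split($8, items, ";")          # INFO is column 8
            for (i = 1; i <= m; i++) {
                eq = index(items[i], "=")
                if (eq > 0)
                    val[substr(items[i], 1, eq - 1)] = substr(items[i], eq + 1)
                else
                    val[items[i]] = 1          # flag without a value: record presence
            }
            line = $1 "\t" $2
            for (k = 1; k <= n; k++)
                line = line "\t" ((want[k] in val) ? val[want[k]] : ".")
            print line
        }' "$1"
}

# Tiny made-up VCF used only to exercise the helper.
tmp_vcf=$(mktemp)
printf '##fileformat=VCFv4.2\n' > "$tmp_vcf"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> "$tmp_vcf"
printf 'chr1\t100\t.\tA\tG\t50\tPASS\tAN=2;DB\n' >> "$tmp_vcf"
printf 'chr1\t200\t.\tC\tT\t40\tPASS\tAN=4\n' >> "$tmp_vcf"
printf 'chr2\t300\t.\tG\tA\t30\tPASS\tAN=2;DB\n' >> "$tmp_vcf"
extract_info "$tmp_vcf" chr1 "AN DB"
rm -f "$tmp_vcf"
```

Only the two chr1 records are printed, each with its AN value and a DB presence column. With the real tool, the corresponding call would presumably be vcfsummarize.sh -f filename.vcf -c "chr1" -I "AN DB" -s, using only the options listed above.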
This is a personal effort without any funding support.
Report suggestions and bugs to ruansun@163.com