VarScan - Variant detection in massively parallel sequencing data
VarScan is a platform-independent, technology-independent software tool for identifying SNPs and indels in massively parallel sequencing of individual and pooled samples. Given data for a single sample, VarScan identifies and filters germline variants based on read counts, base quality, and allele frequency. Given data for a tumor-normal pair, VarScan also determines the somatic status of each variant (Germline, Somatic, or LOH) by comparing read counts between samples.
VarScan is implemented in Java, so it runs on any machine with the Java Virtual Machine (JVM) installed. Most operating systems (Linux, Mac OS X, Windows) come with Java pre-installed. To install and run VarScan:
- Download the latest VarScan JAR file
- Run VarScan with the command
java -jar VarScan.v2.2.jar
Running the above command with no arguments will display VarScan usage.
Like SAMtools, all VarScan tools are accessed by subcommands:
USAGE: java -jar VarScan.v2.2.jar [COMMAND] [OPTIONS]

COMMANDS:
    pileup2snp [pileup_file] >outfile.snp
        Call SNPs from a pileup file
    pileup2indel [pileup_file] >outfile.indel
        Call indels from a pileup file
    pileup2cns [pileup_file] >outfile.cns
        Call consensus and variants from a pileup file
    somatic [normal_pileup] [tumor_pileup] varscan.out
        Call germline/somatic variants from tumor-normal data
    readcounts [pileup] --variants-file [variants]
        Compute read counts supporting each allele
    filter [outfile.snp] >outfile.snp.filter
        Filter SNPs/indels by coverage, frequency, p-value
    somaticFilter varscan.out.snp >varscan.out.snp.filter
        Filter somatic variants for clusters/indels
    compare [file1] [file2] [type] [output]
        Merge/intersect/subtract 2 sets of variants
    limit [infile] [--positions-file] [--regions-file]
        Restrict pileup/snps/indels to ROI positions
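
The subcommand pattern above can be wrapped in a small shell function so the JAR path is set in one place. This is only a sketch under stated assumptions: the function name varscan, the VARSCAN_JAR and DRY_RUN variables, and the file names (sample.pileup, sample.snp, normal.pileup, tumor.pileup) are hypothetical and not part of VarScan. With DRY_RUN=1 the function prints the command line it would run instead of invoking Java, which is handy for checking a pipeline before the JAR is downloaded.

```shell
#!/bin/sh
# Hypothetical wrapper around the VarScan JAR. The names varscan, VARSCAN_JAR
# and DRY_RUN are illustrative assumptions, not VarScan features.
VARSCAN_JAR="${VARSCAN_JAR:-VarScan.v2.2.jar}"

varscan() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Dry run: print the command line instead of executing it
        echo "java -jar $VARSCAN_JAR $*"
    else
        # Real run: pass the subcommand and its arguments through unchanged
        java -jar "$VARSCAN_JAR" "$@"
    fi
}

# Preview three calls taken from the usage listing (placeholder file names)
DRY_RUN=1
varscan pileup2snp sample.pileup
varscan filter sample.snp
varscan somatic normal.pileup tumor.pileup varscan.out
```

For a real run, set DRY_RUN=0 and redirect output as in the usage listing, for example: varscan pileup2snp sample.pileup >sample.snp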