Atlas-SNP2 is designed to evaluate candidate variants and distinguish true SNPs from sequencing and mapping errors in whole-exome capture sequencing (WECS) data.
ruby Atlas-SNP2.rb -i [in.sorted.bam] -r [reference.fa] -o [output file] -n [sample name] [choosing platform]
Atlas-SNP2 is coded in Ruby, and basic usage can be viewed by running the program without any arguments.
-i FILE BAM format alignment file (Required to be sorted by start position)
-r FILE FASTA format reference sequence file (Required)
-o STR name of output result file (Required)
-n Sample name used in VCF file (Required)
-t STR Only call SNPs on the given target region (Optional, please refer to "samtools view" for the target region format)
-a FILE file containing sites that will always be included (Optional)
-w only evaluate sites in the list (use with -a, optional)
-F Include filtered lines in the output that have a QUAL of at least 1
--454_FLX 454 FLX
--454_XLR 454 Titanium
Setting up prior probability
-e FLT Prior(error|c) when variant coverage number is above 2 for 454 and Illumina data (Default is 0.1)
-l FLT Prior(error|c) when variant coverage number is 1 or 2 for 454 data (Default is 0.9)
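The choice between the two priors can be sketched as a small lookup. This is only an illustration of the rule as documented above, not the code Atlas-SNP2 uses: the function name `prior_error` and the platform strings are hypothetical, and the page does not say which prior applies to Illumina sites with a variant coverage of 1 or 2, so the sketch returns `nil` for that case.

```ruby
# Hypothetical helper: pick Prior(error|c) from the variant coverage.
# e and l mirror the documented defaults of -e (0.1) and -l (0.9).
def prior_error(variant_coverage, platform, e: 0.1, l: 0.9)
  # -e: variant coverage above 2, for 454 and Illumina data
  return e if variant_coverage > 2
  # -l: variant coverage of 1 or 2, documented for 454 data only
  return l if platform == "454" && variant_coverage >= 1
  # Case not described on this page (assumption: left undefined here)
  nil
end

puts prior_error(3, "454")
puts prior_error(2, "454")
```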
Setting up filters
-c Posterior probability cutoff (Default is 0.95)
-y Minimal Coverage required for high confidence SNP calls (Default is 6)
-m FLT maximum percentage of substitution bases allowed in the alignment (Default is 5.0)
-g FLT maximum percentage of insertion and deletion bases allowed in the alignment (Default is 5.0)
-f INT maximum number of alignments allowed to be piled up on a site (Default is 1024)
-p INT insert size for paired-end resequencing data (Default is OFF)
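To make the default thresholds concrete, the sketch below applies them to made-up numbers. It is a minimal illustration under stated assumptions, not Atlas-SNP2's implementation: the method names are hypothetical, the percentages are assumed to be taken over the aligned read length, and the cutoffs are assumed to be inclusive.

```ruby
# Documented defaults for the filter options (values from the list above).
DEFAULTS = {
  posterior_cutoff: 0.95, # -c
  min_coverage: 6,        # -y
  max_sub_pct: 5.0,       # -m
  max_indel_pct: 5.0      # -g
}.freeze

# Read-level check (hypothetical): reject an alignment whose share of
# substitution or indel bases exceeds the allowed percentage.
# Assumption: the denominator is the aligned length of the read.
def alignment_passes?(aligned_len, substitutions, indel_bases, opts = DEFAULTS)
  return false if aligned_len <= 0
  sub_pct = 100.0 * substitutions / aligned_len
  indel_pct = 100.0 * indel_bases / aligned_len
  sub_pct <= opts[:max_sub_pct] && indel_pct <= opts[:max_indel_pct]
end

# Site-level check (hypothetical): a call is high confidence when both the
# posterior probability and the coverage reach their cutoffs.
def high_confidence_site?(posterior, coverage, opts = DEFAULTS)
  posterior >= opts[:posterior_cutoff] && coverage >= opts[:min_coverage]
end

puts alignment_passes?(100, 3, 1)
puts high_confidence_site?(0.97, 5)
```

Under these assumptions, a 100-base alignment with 3 substitutions and 1 indel base passes the read-level check, while a site with posterior 0.97 but coverage 5 is not reported as high confidence.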
# Call SNPs from one Illumina BAM file on chr1 only and write the results to the VCF file ~/NA12275.chr1.snp.vcf
ruby Atlas-SNP2.rb -i NA12275.bam -r ~/refs/human_g1k_v37.fasta -o NA12275.chr1.snp --Illumina -t chr1 -v -n NA12275
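When the same call has to be repeated for many samples, it can help to assemble the command from its parts. The helper below is a hypothetical sketch: `atlas_snp2_argv` is not part of Atlas-SNP2, and it only uses options that appear on this page (`-i`, `-r`, `-o`, `-n`, `-t` and a platform flag). It builds and prints the argument list without running anything.

```ruby
# Hypothetical helper: build the argument list for one Atlas-SNP2 run.
# The script path is an assumption; adjust it to where Atlas-SNP2.rb lives.
def atlas_snp2_argv(bam:, reference:, output:, sample:, platform_flag:,
                    region: nil, script: "Atlas-SNP2.rb")
  argv = ["ruby", script,
          "-i", bam,
          "-r", reference,
          "-o", output,
          "-n", sample,
          platform_flag]
  # -t is optional: restrict calling to one target region
  argv += ["-t", region] if region
  argv
end

argv = atlas_snp2_argv(bam: "NA12275.bam",
                       reference: "human_g1k_v37.fasta",
                       output: "NA12275.chr1.snp",
                       sample: "NA12275",
                       platform_flag: "--Illumina",
                       region: "chr1")
puts argv.join(" ")
```

Keeping the arguments as an array rather than one string avoids quoting problems if a file name contains spaces.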
SOLiD-SNP-caller \<in.bam> \<ref.fa> [.bed region] > [output.vcf]
SOLiD-SNP-caller is coded in C++, and basic usage can be viewed by running the program without any arguments.
\<in.bam> FILE BAM format alignment file (Required to be sorted by start position)
\<ref.fa> FILE FASTA format reference sequence file (Required)
\<.bed> FILE Only call SNPs on the regions defined in a BED format file (Optional)
# Call SNPs from one SOLiD BAM file on coding regions only and write the results to the VCF file NA20532.ontarget.vcf
SOLiD-SNP-caller NA20532.bam ~/refs/human_g1k_v37.fasta ccds.bed > NA20532.ontarget.vcf
The Atlas2 Ion software incorporates the well-established algorithms of current Atlas2 builds, but also targets the specific systematic error modes of the platform's base-caller in order to return results of the highest quality. Although existing analysis platforms (for example GATK) have been adapted to call Ion data to a good standard, we aim to surpass them by building an Ion-specific platform. This should give end-users increased confidence in their data and in the Ion Torrent platform itself. It will also expand the scope of the Atlas2 project, which aims to remain the leading platform of its type.
Analysis capabilities for the Ion Torrent PGM are in the late stages of development, and the features are being fine-tuned prior to the software's version 1 release. Several new algorithms have been developed, with a focus on Ion-specific error modes such as those related to read length, homopolymer length and GC content. Other, more generally applicable algorithms are also being calibrated to the Ion platform.
Atlas2 Ion will initially be made available as a stand-alone product, with integration into the main Atlas2 suite possible in the future. There will be an announcement once Atlas2 Ion Version 1 is released. Comprehensive usage documentation will also be added to this wiki page at that time.