Full Commit List:


  • VcfPrinter: New flag "--indel" added to handle indels when merging vcfs.
  • VcfPrinter: Added unit test cases for VcfLine class.


Version 1.4.3 (03-01-2013)

  • Add an option “-w” to only evaluate the sites in a given VIP list.
  • Atlas-SNP2 now evaluates VIP sites with extra-high coverage regardless of the “maximum coverage” setting. VIP sites whose coverage exceeds “maximum coverage” are marked as “high_coverage” in the filter column. (In previous versions, these sites were skipped.)
  • Users are now required to set the sequencing platform. The labels are “--Illumina”, “454_FLX”, and “454_XLR”. To stay compatible with users' existing submission scripts, “-s” for Illumina data will still work.
  • Fixed a bug in the -always-include option that caused included low-quality sites to be printed with a QUAL of 'false' and without the P value. See ticket.
  • Removed the ReqIncl filter; this is now indicated only in the INFO column.
  • New "--fast" option implemented. Because it stores variant information for all samples in memory, it should only be used to merge a small number of sites (~50000) across a small number of samples (~20).
  • New option "--cluster", designed for use in an HPC environment. Useful when merging millions of variants across thousands of samples.
  • Logging implemented.
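
The VIP-site coverage rule described in this release can be sketched as below. This is only an illustrative sketch, not Atlas-SNP2's actual code: the method name `coverage_filter`, its arguments, and the 'PASS' label for unflagged sites are assumptions. Only the “high_coverage” label and the skip-versus-mark behaviour come from the notes above.

```ruby
# Hypothetical helper; the name, arguments and the 'PASS' label are assumptions.
# Returns nil when the site would be skipped, otherwise a filter label.
def coverage_filter(coverage, max_coverage, vip_site)
  # Sites at or below the maximum coverage are evaluated normally.
  return 'PASS' if coverage <= max_coverage

  # Above the maximum coverage: VIP sites are still evaluated and flagged,
  # while all other sites are skipped.
  vip_site ? 'high_coverage' : nil
end
```

For example, with a maximum coverage of 1024, `coverage_filter(2000, 1024, true)` yields the “high_coverage” label, while the same coverage at a non-VIP site yields `nil` (skipped).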

Version 1.4.1 (09-06-2012)

  • Added always-include option
  • Added show-filtered option
  • Added version information and running command in the VCF header
  • Added always-include option
  • Added show-filtered option
  • Fixed bug caused by passing a non-fasta reference genome
  • Fixed bug occasionally returning infinite P value in INFO column
  • Fixed bug caused by reads mapping past the end of the reference genome
  • Made Atlas2-Indel2 more tolerant of malformed SAM lines

For Illumina/454 platforms

Version 1.3 (08-18-2011)

  • Add a new option to call SNP on given regions or by chromosomes
  • Change the default maximum coverage for SNP calling to 1024
  • For paired-end data, add an option to use insertion size for mapping quality control
  • Improve the performance of crossmatch2SAM

Version 1.2 (01-18-2011)

This is a major upgrade of Atlas2.


New features

  • one-stop running: takes sorted BAM files and a reference file as input and outputs SNP genotypes in VCF format
  • uses mapping quality score as alignment quality control
  • uses insertion size as mapping quality control for paired-end re-sequencing data
  • more filters are integrated for higher quality SNP calls
  • whole-genome SNP calling is now feasible on a typical PC with 4 GB of memory. In our test, whole-exome SNP calling processed 1 million reads per 5 minutes using only one CPU core of a Xeon 5520 and 4 GB of memory

Bugs fixed and compatibility

  • more robust to alignment errors
  • crossmatch2SAM tool is now compatible with Ruby 1.9.X
  • a few minor bugs

Version 1.1 (04-26-2010)

  • added a heuristics-based genotyping module
  • added a column of “numRefReads_afterFilter” in Atlas-SNP2 result file
  • revised the header line in Atlas-SNP2 output file to be more explicit
  • skipped duplicate reads masked in the BAM files when processing
  • added an option for the user to set the maximum number of alignments allowed to pile up at a particular site
  • printed more running information and more detailed alignment statistics
  • more robust to various alignment errors
  • fixed several bugs

Version 1.0 (01-20-2010)

  • added Illumina Platform support
  • all calculations are now based on required fields of SAM to get maximum compatibility
  • added CIGAR and reference sequence test code
  • used pileup number to calculate TotalCoverage
  • improved performance
  • migrated to Ruby 1.9
  • many minor improvements

Draft release version 0.1 (12-10-2009)

  • initial implementation
  • initial support of SAM files

For SOLiD platform (08-18-2011)

  • Major SNP calling model update
  • Support GATK base quality re-calibrated BAM by using OQ tags
  • Call SNPs only on regions defined in a BED format file
  • Output the SNP calls in VCF format directly
  • updated SOLiD model and adjusted P cutoffs
  • changed -P cutoff to apply to both 1bp insertions and deletions (rather than just 1bp deletions)
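
To illustrate the OQ-tag support mentioned above: in the SAM format, a recalibrated record can carry its original base qualities in the optional `OQ:Z:` tag, while the QUAL column (field 11) holds the recalibrated values. The sketch below is a minimal, hypothetical reader for a single tab-separated SAM line; the method name `base_qualities` and the `use_original` keyword are assumptions and do not reflect the tool's actual code or options.

```ruby
# Hypothetical helper: pick the base-quality string for one SAM record.
# Prefers the optional OQ:Z: tag (original qualities) when requested and
# present, otherwise falls back to the mandatory QUAL column (field 11).
def base_qualities(sam_line, use_original: true)
  fields = sam_line.chomp.split("\t")
  qual = fields[10]
  return qual unless use_original

  # Optional fields start at column 12 and have the form TAG:TYPE:VALUE.
  oq = fields[11..-1].to_a.find { |f| f.start_with?('OQ:Z:') }
  oq ? oq[5..-1] : qual
end
```

Falling back to QUAL keeps the helper usable on BAM/SAM input that was never recalibrated and therefore has no OQ tag.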

Version 0.3.1

  • added options to use original base quality
  • fixed bug that sometimes returned success exit code when there was a failure
  • fixed bug in simple_genotyper that caused samples with exactly 0.05 variant read ratio to be 0/0
  • fixed bug in simple_genotyper that caused genotypes to occasionally read ./.
  • fixed bug in bed_filter that was filtering some on-target reads in very small target regions
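
A boundary bug like the one at exactly 0.05 variant read ratio typically comes from a strict comparison where an inclusive one is intended. The sketch below is not the actual simple_genotyper code: the method name `sketch_genotype`, the single '0/1' label for any variant call, and the './.' result for zero coverage are assumptions made for illustration. Integer arithmetic is used so that the comparison at the boundary does not depend on floating-point rounding.

```ruby
# 0.05 variant read ratio from the notes, expressed as a percentage.
MIN_VARIANT_PERCENT = 5

# Hypothetical, simplified genotyper: distinguishes only reference (0/0)
# from variant calls, and reports ./. when there is no coverage at all.
def sketch_genotype(variant_reads, total_reads)
  return './.' if total_reads.zero?

  # Inclusive comparison: a ratio of exactly 0.05 counts as variant.
  if variant_reads * 100 >= MIN_VARIANT_PERCENT * total_reads
    '0/1'
  else
    '0/0'
  end
end
```

With this form, 1 variant read out of 20 (a ratio of exactly 0.05) is called as a variant, whereas 1 out of 21 is called 0/0.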

Version 0.3

  • updated SOLiD and Illumina models and recalibrated default settings
  • implemented the ability to input a bed file to call only on-target indels
  • switched from using z cutoffs to using p cutoffs
  • modified 1bp p cutoff to only filter 1bp deletions
  • fixed bug where the strand direction filter failed to be enabled
  • added check for proper Ruby version
  • fixed bug that occasionally allowed an indel quality of 110 (max should be 100)
  • minor code-structure changes
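
The switch from z cutoffs to p cutoffs can be pictured as follows, assuming that z is the linear predictor of a logistic regression model and p is the corresponding probability. The notes mention regression models but do not spell out this relation, so treat it as an assumption. The method names `z_to_p` and `passes_p_cutoff?`, and the inclusive comparison, are likewise hypothetical.

```ruby
# Assumption: z is the linear predictor of a logistic regression model,
# so the corresponding probability is the logistic function of z.
def z_to_p(z)
  1.0 / (1.0 + Math.exp(-z))
end

# Hypothetical filter: keep a call when its probability reaches the cutoff.
def passes_p_cutoff?(z, p_cutoff)
  z_to_p(z) >= p_cutoff
end
```

Under this assumption a z cutoff of 0.0 corresponds to a p cutoff of 0.5, and larger z values map monotonically to larger p values.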

Version 0.2.1

  • added read_level model and improved site level model for SOLiD data
  • adjusted default SOLiD z cutoff to 0.0 (to reflect new model)
  • added check for proper Ruby version
  • minor code-structure changes
  • added additional heuristic filter that allows for a stricter z cutoff for 1bp indels, very useful for SOLiD data
  • implemented integrated heuristic genotyping
  • fixed bug where Atlas-Indel2 crashes if a BAM chromosome is not in the reference
  • now will keep ‘chr’ in the chromosome label if it is in the BAM
  • the deprecated script "Atlas-Indel2-Illum-Exome.rb" has been removed. Please use Atlas-Indel2.rb with the -I flag instead

Version 0.2

  • Implemented regression model for SOLiD data. You must now specify a regression model with -S or -I.
  • Renamed main script to Atlas-Indel.rb.
  • Modified Reference sequence class to allow for unsorted reference genomes.
  • Added the indel z to the info column of the VCF output (not included after running VCF printer).
  • Now echoes all settings back onto the command line.
  • Fixed a bug that caused loss of precision in the normalized variant square variable of the Illumina site model.
  • Fixed a bug in the depth coverage algorithm that caused reads not to be counted in total depth at the deleted sites.
  • Fixed the sample column order to be compatible with vcfPrinter.
  • Removed "x flagged lines skipped" message at end of run.

