SAM (Sequence Alignment/Map) is a flexible generic format for storing nucleotide sequence alignments. SAMtools provides efficient utilities for manipulating alignments in the SAM format. The main samtools source code repository moved to GitHub in March 2012. For ongoing development since then, see http://github.com/samtools/samtools
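As a rough illustration of the SAM format described above: each alignment record is a single tab-separated line with eleven mandatory fields. The Python sketch below is not part of SAMtools; the `parse_sam_line` and `is_unmapped` helpers and the sample record are hypothetical names and data chosen for illustration.

```python
from typing import NamedTuple


class SamRecord(NamedTuple):
    """The eleven mandatory fields of a SAM alignment line."""
    qname: str   # query (read) name
    flag: int    # bitwise flag
    rname: str   # reference sequence name
    pos: int     # 1-based leftmost mapping position
    mapq: int    # mapping quality
    cigar: str   # CIGAR string
    rnext: str   # reference name of the mate/next read
    pnext: int   # position of the mate/next read
    tlen: int    # observed template length
    seq: str     # read sequence
    qual: str    # base qualities (ASCII-encoded)


def parse_sam_line(line: str) -> SamRecord:
    """Parse one alignment line (hypothetical helper; optional tags are ignored)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM alignment line needs at least 11 fields")
    return SamRecord(
        fields[0], int(fields[1]), fields[2], int(fields[3]), int(fields[4]),
        fields[5], fields[6], int(fields[7]), int(fields[8]),
        fields[9], fields[10],
    )


def is_unmapped(record: SamRecord) -> bool:
    """Flag bit 0x4 marks a read as unmapped."""
    return bool(record.flag & 0x4)


# Made-up record, for illustration only.
example = parse_sam_line("read1\t0\tchr1\t100\t60\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII")
print(example.rname, example.pos, is_unmapped(example))
```

For real data, an established parser is the safer choice; this sketch only shows the shape of a record.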
Genetic variants discovery tool
Bioinformatics pipeline for discovery of genetic variants from NGS reads.
Lipi Toolkit is a generic toolkit for online Handwriting Recognition (HWR), and contains tools and algorithms for HWR. The current versions focus on isolated shapes and characters. Details at http://lipitk.sourceforge.net.
Pretty Damn Quick (PDQ) analytically solves queueing network models of computer and manufacturing systems, data networks, etc. The models are written in conventional programming languages. Generic or customized reports of predicted performance measures are output.
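To give a flavor of what solving a queueing model analytically means, the simplest case is a single open queue with random arrivals and service times (M/M/1), where the performance measures follow from closed-form formulas rather than simulation. The sketch below does not use the PDQ library or its API; the function name `mm1_metrics` and the input numbers are assumptions made for illustration.

```python
def mm1_metrics(arrival_rate: float, service_time: float) -> dict:
    """Closed-form measures for a single M/M/1 queue (illustrative helper).

    arrival_rate: mean arrivals per unit time
    service_time: mean service time per request
    """
    utilization = arrival_rate * service_time
    if not 0 <= utilization < 1:
        raise ValueError("queue is unstable: utilization must be below 1")
    # Mean time in system (waiting plus service) for an M/M/1 queue.
    residence_time = service_time / (1 - utilization)
    # Little's law: mean number in system = arrival rate * residence time.
    queue_length = arrival_rate * residence_time
    return {
        "utilization": utilization,
        "residence_time": residence_time,
        "queue_length": queue_length,
    }


# Assumed inputs: 0.5 requests per second, 1.0 second mean service time.
report = mm1_metrics(arrival_rate=0.5, service_time=1.0)
for name, value in report.items():
    print(f"{name}: {value:.3f}")
```

A network of such queues is solved by applying the same kind of formulas per resource and combining the results, which is the class of problem the description above refers to.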
GGI stands for "General Graphics Interface", and it is a project that aims to develop a reliable, stable and fast graphics system that works everywhere. We want to allow any program using GGI to run on any platform requiring at most a recompile.
Bisulfite-seq/NOMe-seq SNPs & cytosine methylation caller
Now on GitHub: https://github.com/dnaase/Bis-tools/tree/master/Bis-SNP BisSNP is a package based on the Genome Analysis Toolkit (GATK) map-reduce framework for genotyping in bisulfite-treated massively parallel sequencing (Bisulfite-seq, NOMe-seq and RRBS) on the Illumina platform. It uses Bayesian inference with either manually specified or automatically estimated methylation probabilities of different cytosine contexts (not only CpG, CHH and CHG in Bisulfite-seq, but also GCH et al. in other bisulfite-treated sequencing) to determine genotypes and methylation levels simultaneously. It works for both single-end and paired-end reads. Specificity and sensitivity have been validated by Illumina IM SNP array. At the default threshold (Phred-scaled score > 20) on 30X data, it could detect 92.21% of heterozygous SNPs with a 0.14% false positive rate. Cytosine calling is not based only on the reference context, so it can detect non-reference cytosine contexts. Google group for help: http://goo.gl/zL7Nj
This project maintains a Linux system that concentrates on math-, logic-, and geometry-related software.
FLOSSmole (formerly OSSmole) is a set of tools for gathering metrics and publishing analyses about development of free/libre/open source projects. UPDATE: Our data is now released to Google Code. Please see our new home page at http://flossmole.org
Tools for genomic analysis
BamBam includes numerous tools for analyzing DNA next-generation sequencing data. Tools are provided for calling SNPs and indels, identifying large scale deletions, tabulating counts of mapped reads, methylation analysis, and more. Depends on SAMtools (http://samtools.sourceforge.net/) and BAMtools (https://github.com/pezmaster31/bamtools). Also uses BioPerl, which is included in the download tarball.
Evoker is a graphical tool for plotting genotype intensity data in order to assess the quality of genotype calls. It implements a compact binary format that allows rapid access to data, even with hundreds of thousands of observations. PLEASE NOTE: This source repository is no longer active. See the GitHub link above for the latest version.
Computational Suite For Bioinformaticians and Biologists
CSBB is a command-line bioinformatics suite for analyzing biological data acquired through a variety of biological experiments. CSBB is implemented in Perl and also uses R, Java and Ruby in the background for specific modules. Its major focus is to let users from the biology and bioinformatics communities perform downstream analysis tasks without needing to write code. CSBB is currently available on Linux, UNIX and Windows platforms. It currently provides 17 modules focused on analytical tasks such as normalization, visualization, statistics and RNA-seq.
If you manage phylogenetic data, Bio::NEXUS can make your life easier with a library and ready-made tools to manipulate and visualize NEXUS files (see http://www.molevol.org/nexplorer and http://search.cpan.org/dist/Bio-NEXUS/doc/Tutorial.pod).
Computational Suite for Bioinformaticians and Biologists
CSBB is a command-line bioinformatics suite for analyzing biological data acquired through a variety of biological experiments. CSBB is implemented in Perl and also uses R, Java and Ruby in the background for specific modules. Its major focus is to let users from the biology and bioinformatics communities perform downstream analysis tasks without needing to write code. CSBB is currently available on Linux, UNIX and Windows platforms. It currently provides 16 modules focused on analytical tasks such as normalization, visualization, statistics and RNA-seq.
A collection of utilities, developer tools and libraries developed by the Maize Functional Genomics of Chromatin Consortium
Cloud storage class, open source software.
DIASER, Geo-data duplication long-term archive system & WAN vault. Manage mixed data archives generated by existing backup software. Ensure availability using commodity hardware. Retain administrative and financial control.
Enrichment analysis for customized organisms
Fanimae is a desktop music information retrieval system.
Code repository for the Laboratory for Genome Bioinformatics at Texas A&M. The LGB project was initiated primarily to support biologists at Texas A&M needing help with bioinformatics in order to use new genomic technologies.
A FOSS operating system that aims to meet NSA's TPEP TCB A1 evaluation standard while retaining application compatibility at minimal performance overhead.
Maximal Clique Motif Reduction (MCMR) is a software program for running and then combining the output of multiple motif finder programs, such as MEME, AlignACE and Weeder, into a set of consensus predictions with associated confidence rankings.
This project collects the software developed as part of the OSG Scalability, Reliability and Usability Area.
Cancer driver prediction via integrative omics
OncoIMPACT is a model-driven approach that integrates omics profiles (genomics, transcriptomics, etc.) and provides patient-specific cancer driver gene predictions.
OpenCloudAV is the first open source multi-engine based malware analysis service from the network cloud. This project is in alpha release, runs only on GNU/Linux, and is mainly developed using the Perl SOAP::Lite module. Version 0.2 alpha is available now.
An open-source data management, analysis and visualization system that makes scientific data development clean, easily reproducible, and easy to share with the outside world.
The Scenario Data Selector is a Perl script that serves up scenario data on the web. Users can browse data and choose to show by value or as growth rates or indices. Data can be shown by region or indicator - the script takes care of regional aggrega