Platform for parallel computation in the Amazon cloud, including machine learning ensembles written in R for computational biology and other areas of scientific research. Home to MR-Tandem, a Hadoop-enabled fork of the X!Tandem peptide search engine.
UI for fscaret
User interface (UI) application that implements the automated feature selection provided by the 'fscaret' package for the R environment.
The TreeRank project is an R package implementing a machine learning algorithm that builds tree-based ranking rules from data with binary labels, based on ROC optimization.
Suite of community detection algorithms based on Modularity
- MixtureModel_v1r1: overlapping community algorithm [3], which includes novel partition density and fuzzy modularity metrics.
- OpenMP versions of the algorithms are available to download.
- Main suite containing three community detection algorithms based on the Modularity measure: Geodesic and Random Walk edge Betweenness [1] and Spectral Modularity [2].
Collaborator: Theologos Kotsos.
[1] M. Newman & M. Girvan, Physical Review E, 69 (026113), 2004.
[2] M. Newman, Physical Review E, 74(3):036104, 2006.
[3] B. Ball et al., An efficient and principled method for detecting communities in networks, 2011.
The suite is based upon the fast community algorithm implemented by Aaron Clauset <firstname.lastname@example.org>, Chris Moore, Mark Newman, and the R IGraph library Copyright (C) 2007 Gabor Csardi <email@example.com>. It also makes use of the classes available from Numerical Recipes 3rd Edition W. Press, S. Teukolsky, W. Vetterling, B. Flanne
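As an illustration of the Modularity measure on which the suite is based, the following minimal Python sketch computes the Newman-Girvan modularity of a given partition of an undirected, unweighted graph. It is not code from the suite itself; the function name `modularity` and the edge-list and dictionary inputs are assumptions chosen for the example.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman-Girvan modularity Q of a partition (illustrative sketch).

    edges:     list of (u, v) pairs describing an undirected, unweighted graph
    community: dict mapping each node to a community label

    Uses Q = sum over communities c of [ L_c / m - (d_c / (2m))^2 ],
    where m is the number of edges, L_c the number of edges inside c,
    and d_c the sum of the degrees of the nodes in c.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a graph without edges")

    internal_edges = defaultdict(int)  # L_c
    degree_sum = defaultdict(int)      # d_c
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal_edges[cu] += 1

    return sum(
        internal_edges[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Two triangles joined by a single bridge edge (hypothetical test graph).
example_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "a" for node in range(6)}

print(modularity(example_edges, two_groups))
print(modularity(example_edges, one_group))
```

Splitting the example graph into its two triangles gives a positive score, while placing every node in a single community gives zero, which is why modularity-based algorithms prefer the first partition.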
A machine learning system for supervised document classification
An open source system for supervised document classification based on statistical machine learning techniques. Unlike state-of-the-art classification techniques, MyNook requires only the title of the document, not its content.
This is a variant of the k-means algorithm that allows data points to belong to several clusters instead of just one.
This application allows users to predict the dissolution profile of solid dispersion systems using algorithms such as symbolic regression, deep neural networks, random forests, or generalized boosted models. These techniques can be combined to create an expert system. The application was created as part of project K/DSC/004290, a subsidy for young researchers from the Polish Ministry of Higher Education.
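The description does not say which multi-cluster variant is implemented. One common way to let a point belong to several clusters is the soft membership rule of fuzzy c-means, sketched below in Python as an assumption rather than as this project's method; the function name `fuzzy_memberships` and the `fuzziness` parameter are hypothetical.

```python
import math


def fuzzy_memberships(point, centers, fuzziness=2.0):
    """Degree of membership of `point` in each cluster (fuzzy c-means rule).

    Membership in cluster i is proportional to d_i ** (-2 / (fuzziness - 1)),
    where d_i is the Euclidean distance to center i. The memberships sum to 1.
    `fuzziness` must be greater than 1; values close to 1 approach the hard
    assignment of ordinary k-means.
    """
    if fuzziness <= 1.0:
        raise ValueError("fuzziness must be greater than 1")

    distances = [math.dist(point, center) for center in centers]

    # A point lying exactly on one or more centers belongs only to those.
    coincident = [i for i, d in enumerate(distances) if d == 0.0]
    if coincident:
        share = 1.0 / len(coincident)
        return [share if i in coincident else 0.0 for i in range(len(centers))]

    exponent = 2.0 / (fuzziness - 1.0)
    weights = [d ** -exponent for d in distances]
    total = sum(weights)
    return [w / total for w in weights]


# A point midway between two centers belongs equally to both clusters.
print(fuzzy_memberships((0.0, 0.0), [(-1.0, 0.0), (1.0, 0.0)]))
```

A full clustering loop would alternate this membership step with a weighted update of the centers; only the assignment step is shown here because it is the part that differs from ordinary k-means.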
Supervised Ranking of Contigs in de novo Assemblies
SuRankCo is machine-learning-based software that scores and ranks contigs from de novo assemblies of next-generation sequencing data. It trains on alignments of contigs with known reference genomes and predicts scores and rankings for contigs that have no related reference genome yet. For more details about SuRankCo and how it works, please see "SuRankCo: Supervised Ranking of Contigs in de novo Assemblies" by Mathias Kuhring, Piotr Wojtek Dabrowski, Andreas Nitsche and Bernhard Y. Renard (http://www.biomedcentral.com/1471-2105/16/240/abstract).
PLEASE NOTE: it is recommended to read the paper and the readme.txt file before using SuRankCo.
Update Jun2015:
* Minor changes to enable BAM support.
Update Feb2014:
* Added support for FASTA/SAM assemblies in addition to ACE/FASTQ(QUAL). NOTE: features of FASTA/SAM assemblies do not include BaseCount, BaseSeqmentCount and ContigQualities yet.