R package for the analysis of landmark-based geometric morphometrics. Installation instructions: https://github.com/zarquon42b/Morpho/blob/master/README.md
FDA's f2 computation with bootstrap technique
This program was developed to help establish pharmaceutical equivalence using the FDA f2 coefficient. It is designed for f2 computation in cases where intra- and inter-batch variability is large, namely RSD > 10%. The statistical bootstrap technique attaches a confidence interval (CI) to the f2 coefficient, overcoming a major drawback of the original metric. The algorithm provides a possible “worst case scenario” for the f2 value, thus supporting a claim of pharmaceutical equivalence. The target users are researchers in industry and academia who deal with pharmaceutical equivalence problems. The software is open source. It was developed in the Lazarus environment, so the source code is available in Object Pascal.
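As a rough illustration of the approach described above, the following Python sketch computes f2 from mean dissolution profiles and derives a percentile bootstrap interval by resampling dosage units within each batch. It is not taken from the program's Object Pascal source: the function names, the resampling scheme, the 90% interval and the sample numbers are all assumptions made for illustration.

```python
import math
import random


def f2(reference_means, test_means):
    """f2 from the mean percent dissolved at each common time point."""
    n = len(reference_means)
    mean_sq_diff = sum((r - t) ** 2 for r, t in zip(reference_means, test_means)) / n
    return 50.0 * math.log10(100.0 / math.sqrt(1.0 + mean_sq_diff))


def column_means(profiles):
    """Mean per time point; `profiles` holds one row of values per dosage unit."""
    return [sum(col) / len(col) for col in zip(*profiles)]


def bootstrap_f2(reference, test, n_boot=2000, alpha=0.10, seed=0):
    """Return (point estimate, lower bound, upper bound) for f2.

    Assumed scheme: units are resampled with replacement within each batch
    and the bounds are plain percentiles of the bootstrap f2 values.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        ref_sample = [rng.choice(reference) for _ in reference]
        test_sample = [rng.choice(test) for _ in test]
        estimates.append(f2(column_means(ref_sample), column_means(test_sample)))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    point = f2(column_means(reference), column_means(test))
    return point, lower, upper


if __name__ == "__main__":
    # Invented percent-dissolved values (4 units x 4 time points), not real data.
    reference = [[20, 45, 70, 88], [25, 50, 75, 92], [15, 40, 65, 85], [30, 55, 80, 95]]
    test = [[18, 40, 66, 84], [28, 52, 78, 93], [12, 35, 60, 80], [24, 47, 72, 90]]
    print(bootstrap_f2(reference, test))
```

In this sketch the lower bound plays the role of the “worst case scenario”: if even that value stays above the acceptance limit in use, the equivalence claim is better supported than by the point estimate alone.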
Visualization application for various TASEP, ASEP and SSEP models.
Visualization of TASEP, ASEP and SSEP models with various update rules, based on the Zarja simulation library and using the Qt GUI library.
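To make the notion of an update rule concrete, here is a minimal Python sketch of one such rule for TASEP: a random-sequential sweep on a ring, where a particle hops one site to the right with probability p if that site is empty. This is not the Zarja API; the function name, the periodic boundary and the choice of rule are assumptions for illustration only.

```python
import random


def tasep_sweep(lattice, p=1.0, rng=random):
    """Perform one random-sequential sweep in place and return the lattice.

    `lattice` is a list of 0 (empty) and 1 (occupied) on a ring. A sweep
    consists of len(lattice) hop attempts at randomly chosen sites.
    """
    size = len(lattice)
    for _ in range(size):
        i = rng.randrange(size)
        j = (i + 1) % size  # right-hand neighbour, with wrap-around
        if lattice[i] == 1 and lattice[j] == 0 and rng.random() < p:
            lattice[i], lattice[j] = 0, 1
    return lattice


if __name__ == "__main__":
    state = [1, 0, 1, 1, 0, 0, 0, 1, 0, 0]
    for _ in range(5):
        print("".join("#" if s else "." for s in tasep_sweep(state, p=0.8)))
```

Other update rules (parallel, ordered sequential and so on) differ only in the order in which sites are visited, and ASEP or SSEP variants would add hops to the left.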
Library for optimization using a genetic algorithm or particle swarms
libfgen is a library that implements an efficient and customizable genetic algorithm (GA). It also provides particle swarm optimization (PSO) functionality and an interface for real-valued function minimization or model fitting. It is written in C, but can also be compiled with a C++ compiler. Both Linux and Windows are supported.
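For readers unfamiliar with particle swarm optimization, the Python sketch below shows a basic global-best PSO minimizing a real-valued function. It does not use or mirror the libfgen C API; the function name, parameters and default coefficients are assumptions chosen for illustration.

```python
import random


def pso_minimize(func, bounds, n_particles=30, n_iter=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize func(list_of_floats) inside box `bounds`; return (position, value)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]          # personal best positions
    best_val = [func(p) for p in pos]       # personal best values
    g = min(range(n_particles), key=best_val.__getitem__)
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (gbest_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = func(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < gbest_val:
                    gbest_val, gbest_pos = val, pos[i][:]
    return gbest_pos, gbest_val


if __name__ == "__main__":
    # Shifted sphere function with its minimum at (1, -2).
    print(pso_minimize(lambda x: (x[0] - 1) ** 2 + (x[1] + 2) ** 2,
                       [(-5, 5), (-5, 5)]))
```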
psignifit is a toolbox to fit psychometric functions and test hypotheses on psychometric data. This is version 3, which will now predominantly support Python.
BAYESIAN INFERENCE OF METABOLIC DIVERGENCE AMONG MICROBIAL COMMUNITIES
Metagenomics yields enormous numbers of microbial sequences that can be assigned a metabolic function. Using such data to infer community-level metabolic divergence is hindered by the lack of a suitable statistical framework. Here, we describe a novel hierarchical Bayesian model, called BiomeNet (Bayesian inference of metabolic networks), for inferring differential prevalence of metabolic networks among microbial communities. To infer the structure of community-level metabolic interactions, BiomeNet applies a mixed-membership modelling framework to enzyme abundance information. The basic idea is that the mixture components of the model (metabolic reactions, subnetworks, and networks) are shared across all groups (microbiome samples), but the mixture proportions vary from group to group. Through this framework, the model can capture nested structures within the data. BiomeNet is unique in modeling each metagenome sample as a mixture of complex metabolic systems (metabosystems).
Sphere surface layers of visual cortex approach maximum info density
Near the surface (event horizon) of a black hole, there is maximum information density in units of squared Planck lengths (and some translation to qubits). Similarly, our imagination is the set of all possible things we can draw onto our densest layer of visual cortex in electricity patterns. Bigger layers have more neurons to handle those possibilities. A Black Hole Cortex is a kind of visual cortex that has a density of neuron layers similar to the density at various radii from a black hole. What we think our eyes see, the imagination, is the densest and smallest layer. SphereSurfaces outside it recursively have more neurons and more surface area, but less density, since they eventually have to dimension-reduce to high-level ideas, much as there are 10000 Wikipedia page names that cover most parts of the world. We can think of Wikipedia as a layer above our brains, a global SphereSurface of large surface area (a cortex layered on billions of minds) and small density (the 10000 most important pages).
Log-linear analysis (data modelling) for high-dimensional data
===== Project moved to https://github.com/fpetitjean/Chordalysis ===== Log-linear analysis is the statistical method used to capture multi-way relationships between variables. However, due to its exponential nature, previous approaches did not scale to more than a dozen variables. We present here Chordalysis, a log-linear analysis method for big data. Chordalysis exploits recent discoveries in graph theory by representing complex models as compositions of triangular structures, also known as chordal graphs. Chordalysis makes it possible to discover the structure of datasets with thousands of variables on a standard desktop computer. Associated papers at ICDM 2013, ICDM 2014 and SDM 2015 can be found at http://www.francois-petitjean.com/Research/ YourKit is supporting the Chordalysis open source project with its full-featured Java Profiler. YourKit is the creator of innovative and intelligent tools for profiling Java and .NET applications. http://www.yourkit.com
CORe microBiome Analysis Tools
Corbata is a set of statistical tools that can be used to analyze the core microbiome across a set of samples.
JStats is a Java application/applet for statistical testing.
JStats is a small but powerful Java application/applet for conducting statistical tests. The following tests are supported:
* Parametric tests: T-test, ANOVA, Repeated Measures ANOVA
* Non-parametric tests: Wilcoxon Rank-Sum, Wilcoxon Signed-Ranks, Kruskal-Wallis, Friedman
* Check if datasets are normally distributed: Jarque-Bera, Shapiro-Wilk
* Check if datasets have equal variances: F-test, Bartlett's test, John, Nagao and Sugiura's test
* Correlation: Correlation coefficient, Spearman Rank correlation, linear regression
* Confidence interval tests
* Outliers: Generalized Extreme Studentized Deviate (ESD) test, outliers in ANOVA
The latest version is available as an applet at http://aiguy.org/Statistics.html
Mark Six Analyst is a database of the Hong Kong lottery, known as Mark Six. It provides a set of statistical methods for showing relationships between data and forecasts.
ProbAbilistic CALculator - a package for computing with probability distributions
Predicting ribosome footprint profile shapes from transcript sequences
Riboshape is a suite of algorithms to predict ribosome footprint profile shapes from transcript sequences. It applies kernel smoothing to codon sequences to build predictive features, and uses these features to build a sparse regression model that predicts the ribosome footprint profile shapes. Reference: Liu, T.-Y. and Song, Y.S. Prediction of ribosome footprint profile shapes from transcript sequences. Proceedings of ISMB 2016, Bioinformatics, Vol. 32 No. 12 (2016) i183-i191.
The 'runjags' R package and standalone JAGS extension module
This package provides high-level interface utilities for MCMC models via Just Another Gibbs Sampler (JAGS), facilitating the use of parallel (or distributed) processors for multiple chains, automated control of convergence and sample length diagnostics, and evaluation of the performance of a model using drop-k validation or against simulated data. Template model specifications can be generated using a standard lme4-style formula interface to assist users less familiar with the BUGS syntax. A JAGS extension module provides additional distributions including the Pareto family of distributions, the DuMouchel prior and the half-Cauchy prior.
C++ Statistical ToolKit
STK++ (http://www.stkpp.org) is a versatile, fast, reliable and elegant collection of C++ classes for statistics, clustering, linear algebra, arrays (with an Eigen-like API), regression, dimension reduction, etc. Some of the library's functionality is available in the R environment as R functions (http://cran.at.r-project.org/web/packages/rtkore/index.html). For convenience, the source packages are also offered on SourceForge. The library offers a dense set of (mostly) template classes in C++ and is suitable for projects ranging from small one-off projects to complete data mining application suites.
M-S Lab is free software for statistical computing and graphics. It is available in two distributions: the "Own" version, which implements the built-in language, and the "Lua" version, for users who prefer to code in the Lua scripting language. Both versions are available for free download on the Downloads page.
MXLib is a C++ wrapper around the Intel® Integrated Performance Primitives (IPP) library and the NVIDIA NPP CUDA library. You can use either IPP code (or a subset of functions that do not require IPP) on the CPU side, or use NPP/CUDA on the GPU side, or use both together. The function syntax is similar to that found in MATLAB, and the library is designed to make it easy to port your code from MATLAB to C++. The idea is to provide scientists, engineers, researchers and others who are not full-time programmers with an easy-to-use, high-performance library of functions.
Scale Assistant is an OpenOffice.org extension designed to give OOo Calc additional capabilities that meet some needs of social science researchers.
An R package implementing a consensus clustering methodology. It lets users perform clustering based on resampling statistics, using multiple clustering algorithms, to assess the robustness of both clusters and cluster members.
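The resampling idea can be sketched in a few lines of Python. The example below is not the package's R interface and, unlike the package, uses a single clustering algorithm (a bare-bones k-means); the function names, the 80% subsampling fraction and the toy data are assumptions for illustration. It repeatedly clusters random subsets and records, for each pair of items, how often they land in the same cluster when both are sampled.

```python
import random


def kmeans(points, k, rng, n_iter=20):
    """Very small k-means on tuples of floats; returns one label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [min(range(k),
                      key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def consensus_matrix(points, k=2, n_resamples=100, fraction=0.8, seed=0):
    """Fraction of resamples in which items i and j were clustered together."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    size = max(k, int(fraction * n))
    for _ in range(n_resamples):
        idx = rng.sample(range(n), size)  # subsample without replacement
        labels = kmeans([points[i] for i in idx], k, rng)
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                sampled[i][j] += 1
                if labels[a] == labels[b]:
                    together[i][j] += 1
    return [[together[i][j] / sampled[i][j] if sampled[i][j] else 0.0
             for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    # Toy data: two well-separated groups of four points each.
    data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
            (10.0, 10.0), (10.1, 10.0), (10.0, 10.1), (10.1, 10.1)]
    for row in consensus_matrix(data):
        print(" ".join(f"{v:.2f}" for v in row))
```

Entries near 1 or 0 indicate pairs whose co-membership is stable across resamples; intermediate values flag items whose cluster assignment is not robust.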
Blueprint XAS is a Matlab-based suite designed for the processing and analysis of near-edge x-ray absorption spectroscopy (XAS) data. The suite is designed primarily to assist users in exploring reasonable fit solutions while minimizing user bias.
This project aims to provide a clean API and a simple implementation, as a C++ library, of a Travel-oriented Distribution System. It corresponds to a simulated version of real-world Computerized Reservation Systems (CRS).
A population-based method for DNA copy number analysis: recurrent copy number aberration identification in multiple samples (with no need for single-sample calling). Developed for quick analysis of high-resolution, large-population data.
Dynamic Multispecies Metabolic Modeling framework
The Dynamic Multispecies Metabolic Modeling (DyMMM [dĭm]) framework is a mathematical modeling tool that integrates multiple constraint-based metabolic models into a single dynamic community metabolic model. The DyMMM framework was formerly known as the DMMM framework. Please use the following citation for bibliographical purposes: Zhuang, K., Izallalen, M., Mouser, P., Richter, H., Risso, C., Mahadevan, R., & Lovley, D. R. (2011). Genome-scale dynamic modeling of the competition between Rhodoferax and Geobacter in anoxic subsurface environments. The ISME journal. Zhuang, K., Ma, E., Lovley, D. R., & Mahadevan, R. (2012). The design of long-term effective uranium bioremediation strategy using a community metabolic model. Biotechnology and Bioengineering.
An R Package for Environmental Statistics
EnvStats is an R package for environmental statistics. It is the open-source successor to the commercial module for S-Plus© called "EnvironmentalStats for S-Plus", which was first released in April, 1997. The EnvStats package, along with the R software environment, provides comprehensive and powerful software for environmental data analysis. EnvStats brings the major environmental statistical methods found in the literature and regulatory guidance documents into one statistical package, along with an extensive hypertext help system that explains what these methods do, how to use these methods, and where to find them in the environmental statistics literature. Also included are numerous built-in data sets from regulatory guidance documents and the environmental statistics literature. EnvStats combined with other R packages (e.g., for spatial analysis) provides the environmental scientist, statistician, researcher, and technician with tools to “get the job done!”
A versatile MCMC and downhill optimization engine
Hrothgar is a parallel minimizer and Markov Chain Monte Carlo generator by Andisheh Mahdavi of San Francisco State University. It has been used to solve optimization problems in astrophysics (galaxy cluster mass profiles) as well as in experimental particle physics (hadronic tau decays). It is probably adaptable enough to be applied to your merit function if you can write it in C.
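As a generic illustration of the MCMC side, the Python sketch below runs a random-walk Metropolis sampler on a user-supplied log-probability (merit) function. It is not Hrothgar's interface, which expects the merit function in C, and it is serial rather than parallel; the function name, the Gaussian proposal and all parameters are assumptions for illustration.

```python
import math
import random


def metropolis(log_prob, start, step=0.5, n_samples=5000, seed=0):
    """Random-walk Metropolis: return a list of sampled parameter vectors.

    `log_prob` maps a list of floats to the log of an unnormalized density.
    """
    rng = random.Random(seed)
    current = list(start)
    current_lp = log_prob(current)
    chain = []
    for _ in range(n_samples):
        # Symmetric Gaussian proposal around the current point.
        proposal = [x + rng.gauss(0.0, step) for x in current]
        proposal_lp = log_prob(proposal)
        delta = proposal_lp - current_lp
        # Always accept uphill moves; accept downhill moves with prob exp(delta).
        if delta >= 0 or rng.random() < math.exp(delta):
            current, current_lp = proposal, proposal_lp
        chain.append(current[:])
    return chain


if __name__ == "__main__":
    # Target: unit-variance Gaussian centred at 3 (log density up to a constant).
    samples = metropolis(lambda x: -0.5 * (x[0] - 3.0) ** 2, [0.0],
                         step=1.0, n_samples=20000)
    kept = samples[2000:]  # discard an initial burn-in portion
    print(sum(s[0] for s in kept) / len(kept))
```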