SOFA is a statistics, analysis, and reporting program with an emphasis on ease of use, learn as you go, and beautiful output.
Statcato is a Java software application for elementary statistics. Its features include data and graph generation, probability distributions, descriptive statistics, confidence intervals, hypothesis tests, correlation, regression, and analysis of var
A Small (Matlab/Octave) Toolbox for Kriging
The STK is a (not so) Small Toolbox for Kriging. Its primary focus is on the interpolation / regression technique known as kriging, which is very closely related to Splines and Radial Basis Functions, and can be interpreted as a non-parametric Bayesian method using a Gaussian Process (GP) prior. The STK also provides tools for the sequential and non-sequential design of experiments. Although it is currently geared mostly towards the Design and Analysis of Computer Experiments (DACE), the STK can be useful in other application areas (such as Geostatistics, Machine Learning, Non-parametric Regression, etc.).
Java library of statistical distributions
A Java package that provides routines for various statistical distributions. Based on R version 2.14.1 (continuously updated; current as of R v3.3.0). The major difference is that JDistlib is thread safe. The library contains the density (pdf), cumulative (cdf), quantile, and random number generator (RNG) routines of the following distributions: Ansari, Beta, Binomial, Cauchy, Chi square, Exponential, Fisher's F, Gamma, Geometric, Hypergeometric, Kendall, Logistic, Log normal, Negative binomial, Noncentral beta, Noncentral chi square, Noncentral f, Noncentral t, Normal, Poisson, Sign Rank, Spearman, Student's T, Tukey, Uniform, Weibull, Wilcoxon, and many more. It also includes normality tests such as Kolmogorov-Smirnov, Anderson-Darling, Cramer-Von Mises, D'Agostino-Pearson, Jarque Bera, Kolmogorov-Lilliefors, Shapiro-Francia, and Shapiro-Wilk, among others.
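As a rough illustration of the four routine types named above (density, cumulative, quantile, and random number generation), the sketch below uses Python's standard-library `statistics.NormalDist` for the Normal distribution. It is not JDistlib's API, which is a Java library and is not reproduced here; the variable names are illustrative only.

```python
from statistics import NormalDist

# Standard Normal distribution (mean 0, standard deviation 1).
standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Density (pdf) evaluated at x = 0.
density = standard_normal.pdf(0.0)

# Cumulative probability (cdf): P(X <= 1.96).
cumulative = standard_normal.cdf(1.96)

# Quantile (inverse cdf): the x such that P(X <= x) = 0.975.
quantile = standard_normal.inv_cdf(0.975)

# Random number generation: five draws with a fixed seed for repeatability.
draws = standard_normal.samples(5, seed=42)
```

The quantile and cumulative routines are inverses of each other, so `inv_cdf(cdf(x))` recovers `x` up to floating-point error.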
Free Matlab toolbox to compute robust correlations
The Robust Correlation Toolbox is a free collection of Matlab functions for visualizing data in univariate and bivariate space, checking assumptions of normality and homoscedasticity, and computing Pearson's, Spearman's, percentage bend, and skipped correlations with bootstrapped confidence intervals; see http://www.frontiersin.org/Quantitative_Psychology_and_Measurement/10.3389/fpsyg.2012.00606/full
MinimPy is a desktop application program for sequential allocation of subjects to treatment groups in clinical trials by using the method of minimisation. Comprehensive reference help is available at: http://minimpy.sourceforge.net For those who have difficulty installing MinimPy, an online version is available at: http://qminim.sourceforge.net MinimPy has been fully described in the following article: Saghaei, M. and Saghaei, S. (2011) Implementation of an open-source customizable minimization program for allocation of patients to parallel groups in clinical trials. Journal of Biomedical Science and Engineering, 4, 734-739. doi: 10.4236/jbise.2011.411090. Available at: http://www.scirp.org/journal/PaperInformation.aspx?PaperID=8518
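The general idea of minimisation can be sketched as follows: each new subject is assigned to the treatment group that would leave the prognostic factors most evenly balanced across groups. This is a minimal, deterministic sketch using the range of marginal counts as the imbalance measure. It is not MinimPy's code; the function name `minimisation_assign` and the data layout are hypothetical, and real implementations typically add a random element and configurable weights.

```python
def minimisation_assign(new_subject, allocated, groups):
    """Pick the group that minimises total marginal imbalance.

    new_subject: dict mapping factor name -> level for the incoming subject.
    allocated:   list of (factors_dict, group) pairs for earlier subjects.
    groups:      list of treatment group names.
    """
    scores = {}
    for candidate in groups:
        total_imbalance = 0
        for factor, level in new_subject.items():
            # Count earlier subjects sharing this factor level, per group.
            counts = {group: 0 for group in groups}
            for factors, group in allocated:
                if factors.get(factor) == level:
                    counts[group] += 1
            # Hypothetically place the new subject in the candidate group.
            counts[candidate] += 1
            # Imbalance for this factor: range of the marginal counts.
            total_imbalance += max(counts.values()) - min(counts.values())
        scores[candidate] = total_imbalance
    best = min(scores.values())
    # Assumption: ties go to the first listed group; a trial would randomise.
    return next(group for group in groups if scores[group] == best)


earlier = [
    ({"sex": "F", "age": "<50"}, "A"),
    ({"sex": "F", "age": ">=50"}, "A"),
]
chosen = minimisation_assign({"sex": "F", "age": "<50"}, earlier, ["A", "B"])
```

In this example group A already holds both female subjects, so placing the new subject in B gives the smaller total imbalance.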
JStats is a Java application/applet for statistical testing.
JStats is a small but powerful Java application/applet for conducting statistical tests. The following tests are supported:
* Parametric tests: T-test, ANOVA, Repeated Measures ANOVA
* Non-parametric tests: Wilcoxon Rank-Sum, Wilcoxon Signed-Ranks, Kruskal-Wallis, Friedman
* Check if datasets are normally distributed: Jarque-Bera, Shapiro-Wilk
* Check if datasets have equal variances: F-test, Bartlett's test, John, Nagao and Sugiura's test
* Correlation: Correlation coefficient, Spearman Rank correlation, linear regression
* Confidence intervals test
* Outliers: Generalized Extreme Studentized (ESD) test, outliers in ANOVA
The latest version is available as an applet at http://aiguy.org/Statistics.html
A population-based method for DNA copy number analysis: recurrent copy number aberration identification in multiple samples (with no need for single-sample calling). Developed for quick analysis of high-resolution, large-population data.
Statistical data analysis
Freely distributed, cross-platform program for statistical and epidemiological analysis of data. Website: http://www.sergas.es/Saude-publica/EPIDAT SourceForge: https://sourceforge.net/projects/epidat/ Wikipedia: https://es.wikipedia.org/wiki/Epidat
Math.NET aims to provide a self-contained, clean framework for symbolic mathematical (Computer Algebra System) and numerical/scientific computations, including a parser and support for linear algebra, complex differential analysis, system solving and more
A simple programmable spreadsheet for learning statistics.
Myrtle is a simple programmable spreadsheet and statistical analysis program designed specifically for learning statistics. It provides the standard spreadsheet functionality one would expect, such as multiple tabbed sheets, relative and absolute row and column referencing in formulas, and a large catalog of built-in functions. Functions specific to logic and computer science, mathematics, probability, and statistics are available. Students can easily create, customize, and update plots and graphical summaries of their analyses. Myrtle offers a unique bookmarking facility that allows students to create and reuse named references to their favorite cell ranges. This can help students focus attention on the important relationships among particular rows or columns of data. Myrtle's graphics and reporting features allow students to demonstrate their mastery of course content to their instructors.
An R Package for Environmental Statistics
EnvStats is an R package for environmental statistics. It is the open-source successor to the commercial module for S-Plus© called "EnvironmentalStats for S-Plus", which was first released in April, 1997. The EnvStats package, along with the R software environment, provides comprehensive and powerful software for environmental data analysis. EnvStats brings the major environmental statistical methods found in the literature and regulatory guidance documents into one statistical package, along with an extensive hypertext help system that explains what these methods do, how to use these methods, and where to find them in the environmental statistics literature. Also included are numerous built-in data sets from regulatory guidance documents and the environmental statistics literature. EnvStats combined with other R packages (e.g., for spatial analysis) provides the environmental scientist, statistician, researcher, and technician with tools to “get the job done!”
Software made available by the Residual Analysis blog, primarily having to do with anthropogenic global warming, e.g. GHCN Processor.
Tail probability calculator for continuous random variables
A suite of Matlab functions that calculate the tail probability / cdf / pdf / quantile of a linear combination of random variables in one of the following classes: (1) symmetric random variables with support on the real axis (normal, Student's t, uniform and triangular); (2) random variables with support on the positive real axis (chi-squared and log-Lambert W x chi-squared distributions; inverse gamma distribution is temporarily disabled due to numerical issues).
The Time2 Java library provides generic time series with configurable time domains.
computing f2 bootstrap CI BCA
Computes the bias-corrected and accelerated (BCa) bootstrap confidence interval for the similarity factor (f2).
An R package implementing a consensus clustering methodology. The package lets users perform resampling-based clustering with multiple clustering algorithms to assess the robustness of both clusters and cluster members.
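For orientation, the similarity factor used to compare dissolution profiles is commonly defined as f2 = 50 * log10(100 / sqrt(1 + (1/n) * sum((R_t - T_t)^2))), where R_t and T_t are the mean percent dissolved for the reference and test products at each of n time points. The sketch below computes only this point estimate, assuming the project uses the standard definition; the BCa bootstrap interval itself is not reproduced, and the function name `similarity_factor_f2` is hypothetical.

```python
import math


def similarity_factor_f2(reference, test):
    """Return f2 for two mean dissolution profiles (percent dissolved)."""
    if len(reference) != len(test) or len(reference) == 0:
        raise ValueError("profiles must be non-empty and of equal length")
    n = len(reference)
    # Mean squared difference between the profiles over the n time points.
    mean_sq_diff = sum((r - t) ** 2 for r, t in zip(reference, test)) / n
    return 50.0 * math.log10(100.0 / math.sqrt(1.0 + mean_sq_diff))


# Illustrative (made-up) mean profiles at four time points.
reference_profile = [20.0, 45.0, 70.0, 90.0]
test_profile = [30.0, 55.0, 80.0, 100.0]
f2_value = similarity_factor_f2(reference_profile, test_profile)
```

Identical profiles give f2 = 100, and a uniform 10-point difference, as in the made-up profiles here, gives a value just under 50.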
Software tool for Research in Computational Population Genetics
Development of exact and approximate methods (Importance Sampling and MCMC based) for computing likelihoods under the standard population genetic models of mutation, migration, and recombination. Project issues are maintained at https://freecode4susant.atlassian.net/browse/COALESCENT
A Python package for estimating the statistical impact of features
This package lets you compute the statistical impact of features given a scikit-learn estimator. The computation is based on the mean variation of the difference between quantile and original predictions. The impact is reliable for regressors and binary classifiers. Currently, all features must consist only of purely numerical, non-categorical values.
Python module to track the overall median of a stream of values "on-line" in reasonably efficient fashion.
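One common way to track a running median is to keep two heaps, one holding the lower half of the values seen so far and one holding the upper half. The sketch below shows that general technique with Python's standard `heapq` module; it is not this module's actual code or API, and the class name `RunningMedian` is hypothetical.

```python
import heapq


class RunningMedian:
    """Track the median of a stream using two balanced heaps."""

    def __init__(self):
        self._low = []   # max-heap of the lower half (values stored negated)
        self._high = []  # min-heap of the upper half

    def add(self, value):
        # Route the value to the half it belongs to.
        if self._low and value > -self._low[0]:
            heapq.heappush(self._high, value)
        else:
            heapq.heappush(self._low, -value)
        # Rebalance so that len(low) is len(high) or len(high) + 1.
        if len(self._low) > len(self._high) + 1:
            heapq.heappush(self._high, -heapq.heappop(self._low))
        elif len(self._high) > len(self._low):
            heapq.heappush(self._low, -heapq.heappop(self._high))

    def median(self):
        if not self._low:
            raise ValueError("no values have been added")
        if len(self._low) > len(self._high):
            return -self._low[0]
        # Even count: average the two middle values.
        return (-self._low[0] + self._high[0]) / 2


tracker = RunningMedian()
medians = []
for value in [5, 1, 9, 3]:
    tracker.add(value)
    medians.append(tracker.median())
```

Each insertion costs O(log n) and each median query O(1), which avoids re-sorting the whole stream after every new value.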