SOFA is a statistics, analysis, and reporting program with an emphasis on ease of use, learn as you go, and beautiful output.
Machine Learning Python
mlpy is a Python module for Machine Learning built on top of NumPy/SciPy and of GSL. mlpy provides high-level functions and classes allowing, with few lines of code, the design of rich workflows for classification, regression, clustering and feature selection. mlpy is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License version 3. mlpy is available both for Python >=2.6 and Python 3.X.
Please take part in the survey on rgedit's future: https://www.surveymonkey.com/s/VNMMJMJ Your answers are much appreciated! rgedit is a gedit (GNOME editor, www.gedit.org) plug-in that turns the editor into an easy-to-use yet lightweight IDE for the statistical programming environment R (www.r-project.org).
Model-based Analysis of Genome-wide CRISPR-Cas9 Knockout
Model-based Analysis of Genome-wide CRISPR-Cas9 Knockout (MAGeCK) is a computational tool for identifying important genes from recent genome-scale CRISPR-Cas9 knockout screens. For instructions and documentation, please refer to the wiki page. MAGeCK is developed and maintained by Wei Li and Han Xu from Dr. Xiaole Shirley Liu's lab at the Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute and Harvard School of Public Health. We thank the Claudia Adams Barr Program in Innovative Basic Cancer Research for supporting the development of MAGeCK.
SalStat is a small application for statistical analysis, with an emphasis on the sciences and social sciences (particularly psychology). The project is built around its user interface, which is designed to be simple to use. Think SPSS, but better!
Data Plotting and Analysis for Science and Engineering
- Save and open a Work/Project (spf) file
- Single fitting / batch fitting (user-defined custom function)
- Matrix to XYZ in the Tool menu
- Symbol plot: markers, curve, landscape, bar, etc.
- 3D surface plot (GLSurface) based on OpenGL (ScienPlot v1.3.2 and above)
- ColorMap surface, trisurface, pie, polar plots, 3D height field, 3D bar, scatter plots (under development), and more
- Column-by-column plotting/calculation
- LaTeX commands enclosed in $ symbols can be used for graph labels
- Accepts txt (text) and csv (comma-separated values) data
- Save, copy, and print graphs
- Spreadsheets to display data
- Textboard to organize the results
- Publication-quality graphs
- Source code based on Python, NumPy, SciPy, Matplotlib, wxPython, Visvis, etc.
- Special functions
- Drag and drop data files
- Python console is back (since v1.3.3), capable of reusing column data
- Debye and Guinier models for SANS / SAX data
- More apps on our website below
This project aims to provide a clean API and a simple implementation, as a C++ library, of an Airline-related Inventory Management system. The library uses the Standard Airline IT C++ object model (http://sf.net/projects/stdair).
MinimPy is a desktop application for sequential allocation of subjects to treatment groups in clinical trials using the method of minimisation. Comprehensive reference help is available at: http://minimpy.sourceforge.net For those who have difficulty installing MinimPy, an online version is available at: http://qminim.sourceforge.net MinimPy is fully described in the following article: Saghaei, M. and Saghaei, S. (2011) Implementation of an open-source customizable minimization program for allocation of patients to parallel groups in clinical trials. Journal of Biomedical Science and Engineering, 4, 734-739. doi: 10.4236/jbise.2011.411090. Available at: http://www.scirp.org/journal/PaperInformation.aspx?PaperID=8518
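As a rough illustration of the method of minimisation (not MinimPy's actual implementation), the sketch below assigns each new subject to the group that would leave the smallest total imbalance across the subject's prognostic factor levels. The names `allocate` and `new_counts`, the range-based imbalance measure, and the deterministic tie-breaking are all assumptions made for this example; real minimisation programs usually add a random element.

```python
from collections import defaultdict


def new_counts(groups):
    """Create empty per-group tallies of (factor, level) counts."""
    return {g: defaultdict(int) for g in groups}


def allocate(subject, groups, counts):
    """Pick the group that minimises total marginal imbalance for `subject`.

    subject: dict mapping factor name -> level, e.g. {"sex": "F"}
    groups:  list of treatment group names
    counts:  counts[group][(factor, level)] = subjects already allocated
    """
    best_group, best_score = None, None
    for candidate in groups:
        score = 0
        for factor_level in subject.items():
            # Counts per group if the subject were to join `candidate`.
            totals = [
                counts[g][factor_level] + (1 if g == candidate else 0)
                for g in groups
            ]
            # Imbalance measured as the range of counts (an assumption).
            score += max(totals) - min(totals)
        # Ties go to the first group listed; real tools usually randomise.
        if best_score is None or score < best_score:
            best_group, best_score = candidate, score
    # Record the allocation so later subjects see the updated tallies.
    for factor_level in subject.items():
        counts[best_group][factor_level] += 1
    return best_group
```

Calling `allocate` once per arriving subject, with the same `counts` object each time, keeps the groups balanced on every factor level as the trial proceeds.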
Maximal Information-based Nonparametric Exploration
The minepy homepage has moved to http://minepy.readthedocs.io. The download page is now at https://github.com/minepy/minepy/releases.
This project aims to provide a clean API and a simple implementation, as a C++ library, of a Travel-oriented Distribution System. It corresponds to a simulated version of real-world Computerized Reservation Systems (CRS).
Demonstrate errors in transmission of a file over a noisy channel.
This program was written to demonstrate errors in transmission for a presentation on Claude Shannon's Noisy Channel Coding Theorem. It takes an input file, the probability of a bit being flipped, and, if specified, the size of the file's header. The program was intended to take monochrome bitmap files as input, so that each bit corresponds to a pixel in the image and errors are easy to see in the output file as flipped pixels; however, it will work on any input file.
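The behaviour described above can be sketched in a few lines of Python. This is an illustrative reimplementation, not the project's source: the function name `flip_bits`, its parameters, and the optional `seed` argument are assumptions, and the header is assumed to be copied through unchanged so that an image file stays readable.

```python
import random


def flip_bits(data, p, header_size=0, seed=None):
    """Return a copy of `data` with each bit after the header flipped with probability p."""
    rng = random.Random(seed)
    out = bytearray(data)
    for i in range(header_size, len(out)):
        mask = 0
        for bit in range(8):
            # Each bit is flipped independently, as in a binary symmetric channel.
            if rng.random() < p:
                mask |= 1 << bit
        out[i] ^= mask
    return bytes(out)
```

To process a file, read it in binary mode, pass the bytes to `flip_bits`, and write the result to a new file. With a monochrome bitmap and a small `p`, the output should show scattered flipped pixels.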
Uranie is CEA's uncertainty analysis platform, based on ROOT
Uranie is a sensitivity and uncertainty analysis platform based on the ROOT framework (http://root.cern.ch). It is developed at CEA, the French Atomic Energy Commission (http://www.cea.fr). It provides various tools for: - data analysis - sampling - statistical modeling - optimisation - sensitivity analysis - uncertainty analysis - running code on high performance computers - etc. Thanks to ROOT, it is easily scriptable in CINT (C++-like syntax) and Python. It is available for both Unix and Windows platforms (a dedicated platform archive is available on request). Note: if you downloaded version 3.12 before the 8th of February, a patch exists for a minor bug in the TOutputFileKey file; don't hesitate to ask us.
A Python package for estimating the statistical impact of features
This package lets you compute the statistical impact of features given a scikit-learn estimator. The computation is based on the mean variation of the difference between quantile and original predictions. The impact is reliable for regressors and binary classifiers. Currently, all features must consist only of purely numerical, non-categorical values.
GUANO - Graphical User interface for performing ANalysis Of variance
Free and open source standalone program capable of conducting between, within, and mixed analyses of variance (ANOVA). Provides a simple graphical user interface for specifying analyses and interaction plots (analyses performed by http://code.google.com/p/pyvttbl/).
Features:
- Capable of high order factorial designs (> 2 factors)
- Within and mixed analyses of variance provide corrections for violations of sphericity (Huynh-Feldt, Greenhouse-Geisser, Box)
- A variety of data transformations can be applied (log10, reciprocal, arcsine, square-root, and Windsor)
- Generalized eta-squared measures of effect size
- Post-hoc power analysis (should match G*Power)
- Outputs include tables of estimated marginal means
- Up to 4-way interaction plots with errorbars (png, svg)
- Confidence intervals account for within-subject variability (where applicable; Loftus and Masson, 1994)
- Non-proprietary HTML output files
- Non-proprietary codebase
Gotchas:
- Assumes balanced designs
1. Create an object-oriented Python script that can represent mathematical concepts and their properties.
2. Represent all numeric values exactly.
3. Provide a variety of formats to export or embed representations of the mathematical concepts.
psignifit is a toolbox for fitting psychometric functions and testing hypotheses on psychometric data. This is version 3, which now predominantly supports Python.
A general recommender system with basic models and MRA
Multi-categorization Recommendation Adjusting (MRA) optimizes the results of traditional (basic) recommendation models by introducing objective category information and exploiting the fact that users habitually prefer certain categories. The improved model has two further advantages: 1) it can easily be applied to any kind of existing recommendation model, and 2) it includes a controller that provides a controllable adjustment range, which makes it possible to offer optional recommendation modes aimed at different kinds of users.
An unlimited calculator for Chi Squared
This Chi Squared Calculator lets the user enter any number of rows and columns and the observed frequencies used in the calculation; the program then outputs the result, as well as the degrees of freedom. The program was written for Python 3.2 but no longer requires Python to run (it is now in .exe form). Floating-point (decimal) numbers are not yet supported; attempting to input decimals will crash the program. This will be fixed soon. If you need any help or information, my email is: firstname.lastname@example.org
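For readers who want to see the calculation itself, here is a minimal sketch. It assumes the calculator performs the usual chi-squared test of independence on a rows-by-columns table, with expected counts derived from the row and column totals; the description does not say so explicitly, and the function name `chi_squared` is hypothetical. Every row and column is assumed to have a non-zero total.

```python
def chi_squared(observed):
    """Return (statistic, degrees_of_freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (obs - expected) ** 2 / expected
    dof = (len(observed) - 1) * (len(col_totals) - 1)
    return statistic, dof
```

The table is passed as a list of rows, for example `chi_squared([[10, 20], [20, 10]])`. Unlike the program described above, this sketch also accepts non-integer counts.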
MLE survival analysis: Gompertz, Weibull, Logistic and mixed mortality.
DeDAY (Demography Data Analyses) is a tool for analyzing demographic data. It supports the Gompertz, Weibull and Logistic distributions. DeDAY also supports mixed mortality models based on these distributions, such as the Gompertz-Makeham distribution. Distributions such as Gompertz describe only age-dependent mortality, which increases over time. Mixed mortality models, such as the Gompertz-Makeham distribution, consider a more general case in which mortality consists of both age-dependent and age-independent components. Mixed models partition mortality into exogenous and endogenous components, so that intrinsic survivorship can be estimated without interference from extrinsic noise. DeDAY supports both interval-censored data and exact event-time data. Using MLE (maximum likelihood estimation), DeDAY fits statistical models to the data. DeDAY also calculates the variances and the multi-dimensional confidence limits of the model parameters. DeDAY is free for academic users.
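To make the age-dependent and age-independent components concrete, the sketch below writes out the standard Gompertz-Makeham hazard, its survival function, and the log-likelihood for exact event times. The parameter names `a`, `b`, and `c` and the function names are assumptions for illustration; DeDAY's own parameterisation and fitting code are not described here. Setting `c = 0` gives the plain Gompertz model, and `b` must be non-zero.

```python
import math


def hazard(t, a, b, c=0.0):
    """Gompertz-Makeham hazard: age-dependent a*exp(b*t) plus age-independent c."""
    return a * math.exp(b * t) + c


def survival(t, a, b, c=0.0):
    """Probability of surviving to age t, i.e. exp(-cumulative hazard)."""
    return math.exp(-c * t - (a / b) * (math.exp(b * t) - 1.0))


def log_likelihood(event_times, a, b, c=0.0):
    """Log-likelihood of exactly observed event times (no censoring)."""
    return sum(
        math.log(hazard(t, a, b, c)) + math.log(survival(t, a, b, c))
        for t in event_times
    )
```

A maximum likelihood fit would search for the values of `a`, `b`, and `c` that maximise `log_likelihood` for the observed event times, for instance with a general-purpose numerical optimiser.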
Small Python script to demonstrate the Monty Hall problem
The game works as follows: Monty Hall (the host) presented contestants with 3 doors, knowing that behind one of them is a car (the good prize) and that the others hide prizes of little value. In the 1st stage, the contestant chooses a door (which is not yet opened). Next, Monty opens one of the two doors the contestant did not choose, knowing in advance that the car is not behind it. Now, with only two doors left to choose from, since one of them has already been shown in the 2nd stage not to hold the prize, and knowing that the car is behind one of them, the contestant must decide whether to stay with the door chosen at the start of the game and open it, or switch to the other door that is still closed and open that one instead. Which is the most logical strategy? To stay with the door chosen initially or to switch? With which of the two doors still closed does the contestant have the better chance of winning? Why?
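The question can be explored empirically with a short simulation. The sketch below is not the project's script; the function names `play` and `win_rate` and the fixed seed are assumptions for illustration. It plays many rounds under each strategy and reports the fraction of wins. Switching should win about two thirds of the time, because the first pick is wrong with probability 2/3 and in that case Monty's reveal leaves the car behind the other closed door.

```python
import random


def play(switch, rng):
    """Play one round; return True if the contestant wins the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    first_pick = rng.choice(doors)
    # Monty opens a door that is neither the contestant's pick nor the car.
    opened = rng.choice([d for d in doors if d != first_pick and d != car])
    if switch:
        # Move to the only door that is neither picked nor opened.
        final_pick = next(d for d in doors if d != first_pick and d != opened)
    else:
        final_pick = first_pick
    return final_pick == car


def win_rate(switch, rounds=100_000, seed=0):
    """Estimate the probability of winning with or without switching."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(rounds)) / rounds
```

Comparing `win_rate(True)` with `win_rate(False)` shows the advantage of switching directly.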
A Python-based Bayesian network implementation. This project aims to provide a single point-of-entry solution for searching through available networks that match the data and for optimizing CPTs.
This project aims to study and compare typical airline IT methods, for instance RM-related algorithms. It works from a Unix/Linux/Mac command line and exposes basic APIs. It is being developed in C++, with Python wrappers for some components.
This project aims to provide a clean API and a simple implementation, as a C++ library, of a Travel-oriented fare engine. It corresponds to a simulated version of a real-world Fare Quote System.
Differential Expression Analysis for Pathways
This project contains the source code associated with the PLoS Computational Biology publication: "Differential Expression Analysis for Pathways". The paper text can be found here: http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1002967
GUI based toolkit for running common Machine Learning algorithms.
ExoPlanet provides a graphical interface for the construction, evaluation and application of a Machine Learning model in predictive analysis. With a back end built on the numpy and scikit-learn libraries, ExoPlanet couples fast and well-tested algorithms, a UI designed on the Qt4 framework, and graphs rendered using Matplotlib to provide the user with a rich interface, rapid analytics and interactive visuals. ExoPlanet is designed to have a minimal learning curve, allowing researchers to focus on applying Machine Learning algorithms rather than on their implementation details. It provides algorithms for unsupervised and supervised learning, which may be done with continuous or discrete labels. After analysis, the toolkit also automates building visual representations of the trained model.