The fantail machine learning toolkit (Moved)
Moved to https://github.com/quansun/fantail-ml
Format conversion tool for genotype data (e.g. PLINK-MACH, MACH-PLINK)
The main application is twofold: first, to convert genotype SNP data into the formats of different imputation tools such as PLINK, MACH, IMPUTE, BEAGLE and BIMBAM; second, to transform imputed data into different file formats such as PLINK, HAPLOVIEW, EIGENSOFT and SNPTEST. Readable file formats: plink-pedigree (ped and map), plink-raw, plink-dosage, mach, minimac, impute, snptest, beagle and bimbam. All kinds of imputation outputs are also accepted. Formats that can be generated by fcGENE: plink-pedigree, plink-raw, plink-dosage, mach-inputs, minimac-inputs, impute-inputs, beagle-inputs, bimbam-inputs, HAPLOVIEW-inputs and EIGENSOFT-inputs. Further applications: - obtaining templates of the necessary imputation commands and of commands for other imputation tools - quality control according to MAF, HWE and call rate. Keywords: genotype transformation, convert genotype format, imputation output, PLINK, IMPUTE, MACH, minimac, HAPLOVIEW, BEAGLE, BIMBAM, EIGENSOFT.
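The MAF and call-rate filters mentioned above can be illustrated with a minimal sketch. This is not fcGENE's code or command syntax: the 0/1/2 genotype coding with None for a missing call, the function names and the default thresholds are all assumptions made for illustration, and the HWE test is left out.

```python
from typing import Optional, Sequence

# Assumed coding: 0, 1 or 2 copies of the counted allele; None = missing call.
Genotype = Optional[int]


def call_rate(genotypes: Sequence[Genotype]) -> float:
    """Fraction of samples with a non-missing genotype call."""
    if not genotypes:
        return 0.0
    called = sum(g is not None for g in genotypes)
    return called / len(genotypes)


def minor_allele_frequency(genotypes: Sequence[Genotype]) -> float:
    """Frequency of the less common allele among the called genotypes."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))  # frequency of the counted allele
    return min(freq, 1.0 - freq)


def passes_qc(genotypes: Sequence[Genotype],
              min_maf: float = 0.01,
              min_call_rate: float = 0.95) -> bool:
    """Keep a SNP only if both (hypothetical) thresholds are met."""
    return (call_rate(genotypes) >= min_call_rate
            and minor_allele_frequency(genotypes) >= min_maf)
```

A SNP would be dropped when `passes_qc` returns False for its genotype vector; the thresholds are study-specific choices.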
A Python package for estimating the statistical impact of features
This package lets you compute the statistical impact of features given a scikit-learn estimator. The computation is based on the mean variation of the difference between quantile and original predictions. The impact is reliable for regressors and binary classifiers. Currently, all features must consist only of purely numerical, non-categorical values.
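One possible reading of "mean variation of the difference between quantile and original predictions" is sketched below. It is an interpretation, not the package's actual implementation or API: the function name `feature_impact`, the use of the population standard deviation as the "variation", and the number of quantiles are assumptions. Any callable that maps a list of rows to a list of predictions can stand in for the estimator's predict method.

```python
import statistics
from typing import Callable, List, Sequence

Predict = Callable[[List[List[float]]], List[float]]


def feature_impact(predict: Predict,
                   rows: Sequence[Sequence[float]],
                   feature: int,
                   n_quantiles: int = 9) -> float:
    """Assumed definition: for each quantile of the feature, overwrite the
    feature with that value in every row, predict again, and measure the
    spread of (new prediction - original prediction). Return the mean spread."""
    original = predict([list(r) for r in rows])
    column = [r[feature] for r in rows]
    # statistics.quantiles returns n - 1 cut points for n intervals.
    cut_points = statistics.quantiles(column, n=n_quantiles + 1)
    spreads = []
    for q in cut_points:
        modified = [list(r[:feature]) + [q] + list(r[feature + 1:]) for r in rows]
        preds = predict(modified)
        diffs = [p - o for p, o in zip(preds, original)]
        spreads.append(statistics.pstdev(diffs))
    return statistics.fmean(spreads)
```

Under this reading, a feature the model ignores gets an impact of zero, because replacing it never changes the predictions.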
A brief summary of the main features of Fesslix:
- Perform non-intrusive reliability analysis or Bayesian updating either
--- by running commands on the command line or
--- by means of an Octave interface or
--- by means of a Python interface
- Flexible input language for writing Fesslix parameter files
--- Control flow statements (e.g. if, for, while)
--- Most parameters can be defined as functions
- Working with response surfaces
- Linear finite element analysis using truss, beam and plane stress/strain elements
- Spectral Stochastic Finite Elements
- Bayesian networks
This is the download page for Windows executables of Fesslix.
A software framework for building maps of the Neurospora crassa genome based on probabilistic models of meiotic recombination. A NetBeans Platform application is built to incorporate the computations. Project issues are maintained at https://freecode4susant.atlassian.net/browse/GENOMEMAP
create a game: black jack
A cross-platform statistical package for econometric analysis
gretl is a cross-platform software package for econometric analysis, written in the C programming language.
* GSA-SNP2 is a successor of GSA-SNP (Nam et al. 2010, NAR web server issue). GSA-SNP2 accepts human GWAS summary data (rs numbers, p-values) or gene-wise p-values (possibly obtained from VEGAS or GATES) and outputs pathway gene sets 'enriched' with genes associated with the given phenotype. It also provides both local and global protein interaction networks in the associated pathways.
* IMPORTANT NOTE:
-> PLEASE MOVE OR MAKE A COPY OF 'DATA' FOLDER INTO YOUR INTENSIVE TEST FOLDER (I.E. LINUX OR MAC OR WINDOWS SPECIFIED FOLDER) TO ALLOW THE PROGRAM TO FIND THE PREDESIGNED DATA.
* UPDATE NOTE:
-> Mar-7-2018: revise header terms in the output file (all versions)
-> Jan-8-2018: minor output format update for all versions
-> Apr-3-2017: MacOSX command-line version is added. It also provides the PPI net summarization
-> Mar-31-2017: Linux and Windows command-line versions now provide the PPI net summary results (except the net visualization)
R package for hierarchical species distribution models
hSDM is an R package for hierarchical species distribution models. Such models allow the observations (occurrence and abundance of a species) to be interpreted as the result of several hierarchical processes, including ecological processes (habitat suitability, spatial dependence and anthropogenic disturbance) and observation processes (species detectability). Hierarchical species distribution models are essential for accurately characterizing the environmental response of species, predicting their probability of occurrence, and assessing uncertainty in the model results.
'Hierarchical Bond Graph Modelling of Biochemical Networks'
Scripts and supplementary data for the manuscript 'Hierarchical Bond Graph Modelling of Biochemical Networks': Peter J. Gawthrop(1), Joseph Cursons(1,2), and Edmund J. Crampin(1-4). (1) Systems Biology Laboratory, Melbourne School of Engineering, University of Melbourne, Victoria 3010, Australia. (2) ARC Centre of Excellence in Convergent Bio-Nano Science, Melbourne School of Engineering, University of Melbourne, Victoria 3010, Australia. (3) Department of Mathematics and Statistics, University of Melbourne, Victoria 3010, Australia. (4) School of Medicine, University of Melbourne, Victoria 3010, Australia. Draft paper available at http://arxiv.org/abs/1503.01814
This program was designed to facilitate and accelerate expert evaluation of histogram similarity.
Tools for multivariate data visualization, exploration and analysis.
You may also like some newer work: http://createdatasol.com/ Project imDEV is an application of RExcel, which seamlessly integrates Excel and R for tasks focused on multivariate data visualization, exploration, and analysis. Interactive modules for dimensional reduction (imPCA), prediction (imPLS), feature selection, analysis of correlation (imCorrelations) and generation of networked structures (imGraph) provide an integrated environment for systems level analysis of multivariate data.
IMAEL stands for Image Matlab Analysis and Estimation Library. It consists of a collection of functions for filtering, analyzing and visualizing 2D and 3D grayscale and color images.
jAgg - Java Aggregation Operations
jAgg is a Java 5.0 API that supports “group by” operations on Lists of Java objects: aggregate operations such as count, sum, max, min, avg, and many more. It supports Super Aggregation: Rollups, Cube, and Grouping Sets. It supports analytic operations such as lag/lead and row number and more. It also allows custom aggregate and analytic operations.
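The "group by" and rollup ideas described above can be sketched in a few lines. This is a language-neutral illustration in Python, not jAgg's Java API: the function names `group_by` and `rollup` and the dictionary-based rows are assumptions for the example.

```python
from collections import defaultdict
from typing import Any, Dict, List, Sequence, Tuple

Row = Dict[str, Any]


def group_by(rows: Sequence[Row], keys: Sequence[str], value: str) -> Dict[Tuple, Dict[str, float]]:
    """Group rows by the given keys and aggregate one numeric field."""
    groups: Dict[Tuple, List[float]] = defaultdict(list)
    for row in rows:
        groups[tuple(row[k] for k in keys)].append(row[value])
    return {
        g: {"count": len(v), "sum": sum(v), "min": min(v),
            "max": max(v), "avg": sum(v) / len(v)}
        for g, v in groups.items()
    }


def rollup(rows: Sequence[Row], keys: Sequence[str], value: str) -> Dict[Tuple, Dict]:
    """Aggregate over every prefix of the keys, from the full grouping
    down to the grand total (the empty prefix)."""
    return {
        tuple(keys[:n]): group_by(rows, keys[:n], value)
        for n in range(len(keys), -1, -1)
    }
```

For keys `("dept", "year")`, `rollup` yields three levels: per department and year, per department, and the grand total under the empty key `()`.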
A Java library to model and fit ARTA processes.
Iskakov Azamat kaftk
Utility classes from maps to search engine to random samplers
Collection of several multi-purpose Java libraries.
--- knowceans-tools = collection of Java utility classes.
--- Highlights:
--- org.knowceans.util:
+++ IndexQuickSort, TableList: apply order of one array/list to others
+++ Vectors, ArrayUtils: array convenience
+++ RandomSamplers, CokusRandom, ArmSampler, Densities: random sampling and distributions
+++ Arguments: command line parser
+++ StopWatch, Which, ExternalProcess: runtime stuff
+++ ParallelFor: OpenMP workalike
+++ PatternString, NamedGroupRegex: regex convenience
--- org.knowceans.corpus:
+++ CorpusSearcher: full-text search engine
+++ LabelNumCorpus: svmlight corpus storage and filtering
+++ NIPS corpus with text, authors, labels and citations
--- org.knowceans.map: InvertibleHashMultiMap, BijectiveHashMap: implement n:m and 1:1 relations.
--- Other libs:
+++ knowceans-arms = port of the Adaptive Rejection Metropolis Sampler (ARMS) for arbitrary distributions
+++ lda-j = port of lda-c, implementing Latent Dirichlet Allocation (LDA)
Leave-k-out-Jackknife Pearson correlation
In general, Pearson correlation coefficients can be quite sensitive to outliers, so jackknifing may be helpful. I could not find any useful implementation that computes more than leave-1-out jackknife Pearson correlation coefficients. 'leave-kojack' offers an implementation of jackknife Pearson correlation with a configurable value of k, i.e. leave-k-out Pearson correlation.
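The leave-k-out idea can be sketched as follows: recompute the Pearson coefficient on every subset obtained by dropping k of the n paired observations. This is a minimal illustration, not the code of 'leave-kojack'; the function names and the requirement of at least three remaining points are assumptions. Because it enumerates all C(n, k) subsets, it is only practical for small n and k.

```python
import math
from itertools import combinations
from statistics import fmean
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Plain Pearson correlation coefficient of two equal-length samples."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


def leave_k_out_pearson(x: Sequence[float], y: Sequence[float], k: int) -> List[float]:
    """Pearson coefficients for all subsets that omit exactly k observations."""
    n = len(x)
    if len(y) != n:
        raise ValueError("x and y must have the same length")
    if k < 0 or n - k < 3:
        raise ValueError("need at least 3 observations after dropping k")
    coefficients = []
    for dropped in combinations(range(n), k):
        skip = set(dropped)
        xs = [v for i, v in enumerate(x) if i not in skip]
        ys = [v for i, v in enumerate(y) if i not in skip]
        coefficients.append(pearson(xs, ys))
    return coefficients
```

Summaries such as the mean, minimum or maximum of the returned coefficients then indicate how strongly single observations (or groups of k observations) drive the full-sample correlation.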
A C library to handle compositional (=closed) data, like proportions.
Inspired by Aitchison 2003 (http://www.amazon.com/The-Statistical-Analysis-Compositional-Data/dp/1930665784/), this library aims to make compositional data analysis rigorous.
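Two basic operations from Aitchison-style compositional data analysis, closure and the centered log-ratio (clr) transform, are sketched below to show what "closed" data means in practice. The sketch is in Python and does not reproduce this C library's API; the function names are assumptions.

```python
import math
from typing import List, Sequence


def closure(parts: Sequence[float], total: float = 1.0) -> List[float]:
    """Rescale non-negative parts so that they sum to `total`."""
    s = sum(parts)
    if s <= 0 or any(p < 0 for p in parts):
        raise ValueError("parts must be non-negative with a positive sum")
    return [total * p / s for p in parts]


def clr(parts: Sequence[float]) -> List[float]:
    """Centered log-ratio transform: log of each part relative to the
    geometric mean of the composition. Zero parts are not supported."""
    comp = closure(parts)
    if any(p <= 0 for p in comp):
        raise ValueError("clr is undefined for zero parts")
    logs = [math.log(p) for p in comp]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]
```

The clr coordinates always sum to zero and do not depend on the scale of the input, which is why `clr([2, 3, 5])` and `clr([20, 30, 50])` agree.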
Library for optimization using a genetic algorithm or particle swarms
libfgen is a library that implements an efficient and customizable genetic algorithm (GA). It also provides particle swarm optimization (PSO) functionality and an interface for real-valued function minimization or model fitting. It is written in C, but can also be compiled with a C++ compiler. Both Linux and Windows are supported.
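As an illustration of the particle swarm side of such a library, here is a minimal global-best PSO for real-valued function minimization. It is a Python sketch under stated assumptions and has nothing to do with libfgen's C interface: the function name `pso_minimize`, the parameter names and the default coefficients (commonly used constriction values) are all choices made for this example.

```python
import random
from typing import Callable, List, Sequence, Tuple


def pso_minimize(f: Callable[[Sequence[float]], float],
                 bounds: Sequence[Tuple[float, float]],
                 n_particles: int = 30,
                 n_iters: int = 200,
                 inertia: float = 0.7298,
                 cognitive: float = 1.49618,
                 social: float = 1.49618,
                 seed: int = 0) -> Tuple[List[float], float]:
    """Minimize f inside the box `bounds`; returns (best position, best value)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]           # each particle's personal best
    best_val = [f(p) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm-wide best so far

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + cognitive * r1 * (best_pos[i][d] - pos[i][d])
                             + social * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = f(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val
```

A call such as `pso_minimize(lambda p: sum(v * v for v in p), [(-5, 5), (-5, 5)])` searches for the minimum of a simple quadratic bowl; the fixed seed makes runs repeatable.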
A C++ library for principal component analysis
libpca is a C++ library for principal component analysis and related transformations. It comes with examples and unit tests. libpca has been successfully tested on Linux and MacOSX using g++ (>=4.6), clang++ (>=3.2), and icc (>=14.0). libpca requires Armadillo (>=3.2.4), which can be obtained as a pre-compiled package on most distributions or directly from http://arma.sourceforge.net. libpca is being developed by Christian Blume. Contact Christian at email@example.com for any questions or comments.
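For readers unfamiliar with the technique, the core of principal component analysis can be sketched without any library: center the data, form the sample covariance matrix, and find its dominant eigenvector. The Python sketch below uses power iteration for the first component only. It is not libpca's API or algorithm (libpca builds on Armadillo); the function name and parameters are assumptions.

```python
import math
from typing import List, Sequence, Tuple


def first_principal_component(rows: Sequence[Sequence[float]],
                              n_iters: int = 500,
                              tol: float = 1e-12) -> Tuple[List[float], float]:
    """Return (unit direction, variance along it) of the first principal component."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (divides by n - 1).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]

    # Power iteration. Caveat: a start vector exactly orthogonal to the
    # dominant eigenvector would not converge to it.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(n_iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        w = [x / norm for x in w]
        converged = max(abs(a - b) for a, b in zip(w, v)) < tol
        v = w
        if converged:
            break
    # Rayleigh quotient gives the eigenvalue, i.e. the explained variance.
    eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d)) for a in range(d))
    return v, eigenvalue
```

Further components would require deflation or a full eigendecomposition, which is where a linear algebra library becomes the practical choice.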
The project liquiditymeter is for measuring liquidity in various financial markets.
Massively Parallel Graph processing on GPUs -- now part of Blazegraph
Mapgraph is SYSTAP's disruptive new technology to exploit the main memory bandwidth advantages of GPUs. The early work was co-developed with the University of Utah SCI Institute and has its pedigree in the UINTAH software running on over 750M cores on the TITAN Super Computer. Today, SYSTAP has commercialized this technology into its Blazegraph Accelerator and Blazegraph HPC products. Check out our options for GPU acceleration of graphs or contact us to learn more: https://www.blazegraph.com/product/gpu-accelerated/. The early work was released under the Apache 2 open source license and is available here on SourceForge. This work was (partially) funded by the DARPA XDATA program under AFRL Contract #FA8750-13-C-0002 and DARPA Contract #D14PC00029.
Powerful Calculus Environment and Matrices Handling Engine
mathSuite is a mathematical suite that deals principally with complex algebraic and geometric operations. It is powered by the ExprEval C parser. The main purpose of this project is fast math-oriented algorithm virtualization with an optimized direct text interface. It also provides a fast calculus environment that lets you easily handle interchangeable variable lists, matrices, logs and settings layouts through an optimized item-list managing engine. You can also execute your own script files written in a basic math-oriented beta script language. Features for matrices and linear algebra include LU factorization, SVD decomposition, a rank calculator, ill-conditioning checks and more. There are also a linear system solver and an integrator engine for basic preloaded functions.
Tools to analyse and use passport data for biological collections.