Graph-based Extraction and Summarization - a generic graph-based summarization framework. Basic functionality is provided - third-party modules can be plugged in.
Methods and testing of methods for automatic analysis of in situ cyclic
voltammetry data.
This, at least initially, is the code from my master's thesis, which was
done as a contribution to a larger project called Aevum. Aevum is being
developed at t
ngram is a module to compute the similarity between two strings. It differs from Python's "difflib.SequenceMatcher" in that it cares more about the size of both strings. ngram is a port and extension of the Perl module called "String::Trigram
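To illustrate the general idea of n-gram string similarity, here is a minimal sketch in Python. It is not the ngram module's API or its exact scoring formula; the function names, the "$" padding character, and the shared-over-total ratio are assumptions chosen for illustration.

```python
from collections import Counter


def ngrams(text, n=3, pad="$"):
    """Split text into overlapping n-grams, padding both ends (assumed scheme)."""
    padded = pad * (n - 1) + text + pad * (n - 1)
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def ngram_similarity(a, b, n=3):
    """Ratio of shared n-grams to all n-grams, counted as multisets."""
    ca, cb = Counter(ngrams(a, n)), Counter(ngrams(b, n))
    shared = sum((ca & cb).values())  # multiset intersection
    total = sum((ca | cb).values())   # multiset union
    return shared / total if total else 1.0


if __name__ == "__main__":
    print(ngram_similarity("night", "nacht"))
    print(ngram_similarity("night", "night"))
```

Because the denominator counts the n-grams of both strings, a short string that happens to occur inside a much longer one still scores low, which is one way a measure can take the size of both strings into account.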
The Varro toolkit is a system for identifying frequently recurring unordered subtrees in semi-structured data. It is mostly for linguistics but has applications in semi-structured data mining too.
ConDEnSE (Confidential Data Enabled Statistical Exploration) will be a web-based environment for statistical analysis of confidential data from various database sources, based on Plone and R, and using the Jackknife method of confidentiality protection.
This project is a Python script to extract S.M.A.R.T. messages from /var/log/messages (from the smartd daemon) into .csv file(s), one for each disk, suitable for graphing.
Helps compatibility with Vernier's Graphical Analysis and Logger Pro software. Includes a converter to extract important data out of Vernier .ga3 and .cmbl data files, and a spreadsheet to analyze the data with tables, graphs, and curve fitting.
PMML-compliant scoring engine and analytic toolkit
Augustus development has moved to Google Code. The new project page is augustus.googlecode.com. New releases are not currently being published on SourceForge.
Augustus is designed for statistical and data mining models and produces and consumes models with tens of thousands of segments.
Versions of Augustus support PMML 3, 4.0.1, and 4.1.
The Simple Versatile Plotting (SVP) tools create camera-ready plots of performance analysis data gathered from high performance computing (HPC) applications.
The Serial Data Acquisition is a lightweight data acquisition system able to parse a vast majority of mostly unidirectional streams. Results are saved in a SQLite DB and accessible over XML-RPC or plain HTTP. Its design is modular and easily extendable.
The Toolkit for Advanced Discriminative Modeling (TADM) is a C++ implementation for estimating the parameters of discriminative models, such as maximum entropy models. It uses the PETSc and TAO toolkits to provide high performance and scalability.
sarface is a user interface to the sysstat/sar database which reads data from sar and plots it to a live X11 graph via gnuplot. It mimics the command-line options of sar but can cross-plot any two or more stats and apply simple mathematical functions to them.
Using this plugin-based framework, you can instantly start working on the *brain* of your bot (irc bot, chatterbot, robot, ...). With support for db, irc, logging and programming-language independent plugins, users can easily enhance the functionality.
KML is a knowledge base with support for logical modeling. An advanced model is used to represent knowledge as a set of statements similar to natural language sentences. This project hosts a model storage library, a server (vrb-ols), and clients.
clusterviz clusters three-dimensional data. The clustering process is visualized using OpenGL. The family of k-means clustering algorithms is implemented, including mixture models.
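As a rough illustration of the basic k-means algorithm on three-dimensional points, here is a minimal Python sketch. It is not clusterviz code: the function name, the choice of the first k points as starting centroids, and the iteration limit are assumptions, and the OpenGL visualization and mixture models are not covered.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means; uses the first k points as initial centroids (assumption)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels


if __name__ == "__main__":
    data = [(0, 0, 0), (10, 10, 10), (0, 1, 0), (10, 11, 10)]
    print(kmeans(data, 2))
```

Seeding from the first k points keeps the sketch deterministic; practical implementations usually pick starting centroids at random or with a scheme such as k-means++.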
This is an implementation of the 'enhanced Topic-based Vector Space Model' (eTVSM) in Python. A Java version and possibly other Java code contributions are planned.