A group of subprojects for data cleaning, mainly as a step in a data mining project. Visit www.datacleaningopensource.com to review our current applications or to add your own. NOTE: PROGRAMMING SKILLS ARE REQUIRED.
Parse, analyze and -- most importantly -- use COBOL data definitions. This gives you access to COBOL data from Python programs. Write data analyzers, one-time data conversion utilities and Python programs that are part of COBOL systems. Really.
An ETL tool that loads data from DBF/XBase files into MySQL. The utility has a command-line interface and is designed to run without user interaction.
A colorized interactive dotplot program designed for pair-wise comparisons of RNA & DNA. The original idea came from the late Prof. William J. Dreyer of Caltech. The idea is to be able to see the "tapestry" of life, which comes alive with color.
A framework to extract Web data and store it in a local RDBMS, generate assessment reports on the quality of the extracted data, and publish the quality reports on the Web.
Pypes is a framework that lets users break complex data processing logic down into a series of smaller, less complex tasks. These tasks, referred to as components, can then be connected so that the output of one becomes the input to another.
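The component idea can be illustrated with a minimal sketch in plain Python. This does not use the Pypes API; the names `strip_blank`, `lowercase`, and `connect` are hypothetical and exist only to show how the output of one component can feed the next.

```python
from typing import Callable, Iterable, Iterator

# Hypothetical component type (not the Pypes API): a callable that
# takes a stream of records and yields a transformed stream.
Component = Callable[[Iterable[str]], Iterator[str]]


def strip_blank(records: Iterable[str]) -> Iterator[str]:
    """Drop records that are empty or contain only whitespace."""
    for record in records:
        if record.strip():
            yield record


def lowercase(records: Iterable[str]) -> Iterator[str]:
    """Convert every record to lower case."""
    for record in records:
        yield record.lower()


def connect(*components: Component) -> Component:
    """Chain components so each one consumes the previous one's output."""
    def pipeline(records: Iterable[str]) -> Iterator[str]:
        stream = records
        for component in components:
            stream = component(stream)
        return iter(stream)
    return pipeline


# Two small tasks connected into one processing pipeline.
clean = connect(strip_blank, lowercase)
result = list(clean(["Alpha", "  ", "BETA"]))
print(result)
```

Each component stays small and testable on its own, and the order passed to `connect` determines the data flow.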
MIVF (Medical Imaging and Visualization Factory) is a framework for medical applications. It supplies a platform in which image processing and 3D visualization algorithms can be employed as reusable components (functional modules or plugins).
Procup is a system that lets the user run a centralized backup of a group of servers. It also indexes the backed-up data to allow fast searching.
PyTioga is for creating figures and plots with high quality text and graphics in PDF format. Text is processed directly by TeX (not an emulation), and the graphics cover a broad range of PDF features including images, curves, clipping, and transparency.
Spire is a print stream converter/manipulator. It can transform print streams from Metacode to PostScript and from PostScript to Metacode. PCL support will be added soon. Spire is also capable of sorting documents (think postal sortation) and adding barcodes.
PMML-compliant scoring engine and analytic toolkit
Augustus development has moved to Google Code. The new project page is augustus.googlecode.com. New releases are not currently published on SourceForge.
Augustus is designed for statistical and data mining models, and it produces and consumes models with tens of thousands of segments.
Versions of Augustus support PMML 3, 4.0.1, and 4.1.
PyGTS is a Python package used to construct, manipulate, and perform computations on triangulated surfaces. It is a hand-crafted and Pythonic binding for the GNU Triangulated Surface (GTS) Library. THIS PROJECT IS NOT BEING MAINTAINED.
Garmon (Gnome/GTK+ Car Monitor) is a tool that lets you connect to an OBD-II compliant vehicle using an ELM327-based scanning device. It supports monitoring live and freeze frame data as well as reading and clearing DTCs. It is written in Python.
SnapLogic is an Open Source Data Integration framework that combines the power of state-of-the-art dynamic programming languages with standard Web interfaces to solve today's most pressing problems in data integration.
WebBabel is a Python web application that uses OpenBabel to convert files from one format to another.
It runs under Windows, Mac or Linux on your desktop, workstation or laptop.
It uses the Jmol (or Marvin) viewer to show the structures being converted.
A lightweight, browsing-based, 100% Python, federated data integration framework. Users may create custom schemas for disparate sources, then query and expand results across sources to find related data. Intended for use in fields such as bioinformatics and data mining.
StormForce is a free open source program written in Python/PyGame for the Boltek LD-250/StormTracker. StormForce works in FreeBSD, Linux, and Microsoft Windows.
FLAG was designed to simplify the process of log file analysis and forensic investigations. FLAG facilitates efficient analysis of large quantities of data within an interactive environment. PyFlag is the reimplementation of FLAG in Python.
PyMaTi is a simple and easy-to-use GUI for numerical and scientific computing in Python. It wraps the well-known packages NumPy and Matplotlib and lets you experiment with numerical Python immediately from an intuitive user interface.