DataCleaner is a data quality (DQ) analysis application and a platform for building DQ solutions. Its core is a strong data profiling engine, which is extensible and can thereby add data cleansing, transformation, enrichment, deduplication, matching and merging.
Website: http://datacleaner.github.io
Features
- Profile and analyze your database within minutes!
- Access almost any datastore - Oracle, MySQL, PostgreSQL, MS SQL Server, MongoDB, CUBRID, CSV files, Excel spreadsheets, dBase and more
- Discover patterns in your textual data with the Pattern Finder
- Find out which values occur the most with the Value Distribution profile
- Cleanse your contact details with name and address validations
- Detect duplicates using fuzzy logic and configurable weights and thresholds
- Merge your duplicates and create a single version of the truth
- Write data back to relational databases, CSV files, Excel spreadsheets or MongoDB databases
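To make the two profiling features above more concrete, here is a minimal sketch in plain Java of what a pattern finder and a value distribution compute conceptually. It does not use DataCleaner's API and is not its implementation: the class `ProfilingSketch`, the helpers `pattern` and `valueDistribution`, the symbol mapping (`A`, `a`, `9`) and the sample phone numbers are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ProfilingSketch {

    // Hypothetical helper: reduce a value to its character-class pattern.
    // Uppercase letters become 'A', lowercase 'a', digits '9'; other characters are kept.
    static String pattern(String value) {
        StringBuilder sb = new StringBuilder(value.length());
        for (char c : value.toCharArray()) {
            if (Character.isUpperCase(c)) {
                sb.append('A');
            } else if (Character.isLowerCase(c)) {
                sb.append('a');
            } else if (Character.isDigit(c)) {
                sb.append('9');
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Hypothetical helper: count how often each distinct value occurs, in first-seen order.
    static Map<String, Integer> valueDistribution(List<String> values) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String v : values) {
            counts.merge(v, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample data, not taken from any real dataset.
        List<String> phones = Arrays.asList("555-1234", "555-9876", "5551234", "555-1234");

        // Value distribution over the raw values.
        System.out.println(valueDistribution(phones));

        // Pattern distribution: reduce each value to its pattern, then count the patterns.
        List<String> patterns = new ArrayList<>();
        for (String p : phones) {
            patterns.add(pattern(p));
        }
        System.out.println(valueDistribution(patterns));
    }
}
```

Counting patterns rather than raw values is what makes format outliers stand out: in the sample above, three values share the pattern `999-9999`, while the one entry without a hyphen shows up as a separate pattern.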
Categories
Data Warehousing, Information Analysis, Business Intelligence, Database Management Systems (DBMS), Data Quality, Data Profiling
License
GNU Library or Lesser General Public License version 3.0 (LGPLv3)
Additional Project Details
Intended Audience
Information Technology, Quality Engineers, Science/Research
User Interface
Java Swing, Web-based
Programming Language
Java
Database Environment
Firebird/InterBase, Flat-file, HSQL, JDBC, Microsoft SQL Server, MySQL, Oracle, Other network-based DBMS, PostgreSQL (pgsql), Project is a database conversion tool, Project is a database management tool, SQLite, XML-based
Related Categories
Java Data Warehousing Software, Java Information Analysis Software, Java Business Intelligence Software, Java Database Management Systems (DBMS), Java Data Quality Tool
Find a Partner

Human Inference
Human Inference is the European market leader in data quality solutions. The solutions are based on natural language processing and contain a core of knowledge to provide our customers with the best quality possible.

Neopost Customer Information Management
Neopost Customer Information Management is a set of solutions and services that covers the entire lifecycle of customer information and communication management.
Add-ons & Plugins

ElasticSearch for DataCleaner

Groovy DataCleaner
