With the growing amount of information arriving from multiple sources, it has become very hard to relate information to the correct real-life entities. Record matching software tries to solve this problem using machine learning techniques. To do this effectively, the record matcher must be trained with test data that closely resembles real-life data. Hence, there is a need for a data generator that creates synthetic data for evaluating the quality and capability of record matching software.

A Data Generator creates high-quality test data that reflects the glitches real-life data picks up through channels such as human data entry, voice dictation and data scanning. Data generation proceeds in several steps, including org data creation, data grouping, pair generation, data mutation and matching data patterns. The generator also mangles field values of the generated test data to introduce data errors, and correlates records in real-life contexts such as families, households and organizations.
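The field mangling described above can be illustrated with a minimal Java sketch. It is not taken from the project's code: the class FieldMutator and its methods are hypothetical names, and the three mutations shown (adjacent-character swap, dropped character, look-alike substitution) are only assumed examples of typing and scanning errors.

```java
import java.util.Random;

/** Hypothetical helper for illustration; not part of the project's actual API. */
final class FieldMutator {
    private final Random random;

    /** A fixed seed keeps the generated errors reproducible between runs. */
    FieldMutator(long seed) {
        this.random = new Random(seed);
    }

    /** Swaps two adjacent characters, as in a typing slip. */
    String transpose(String value) {
        if (value.length() < 2) {
            return value;
        }
        int i = random.nextInt(value.length() - 1);
        char[] chars = value.toCharArray();
        char tmp = chars[i];
        chars[i] = chars[i + 1];
        chars[i + 1] = tmp;
        return new String(chars);
    }

    /** Drops one character, as in a missed keystroke. */
    String deleteChar(String value) {
        if (value.isEmpty()) {
            return value;
        }
        int i = random.nextInt(value.length());
        return value.substring(0, i) + value.substring(i + 1);
    }

    /** Replaces some characters with look-alikes, as a scanner might (assumed mapping). */
    static String ocrConfuse(String value) {
        return value.replace('O', '0').replace('l', '1').replace('S', '5');
    }
}
```

For example, FieldMutator.ocrConfuse("Olson") returns "01son", while transpose and deleteChar pick the affected position at random from the seeded generator.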

Features

  • Ability to generate Match, Hold and Differ record pairs
  • Support for various types of data mutations
  • Ability to correlate data records into real-life contexts like family, household, organizations
  • Intra group and inter group pair generation
  • Attribute dependency support
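
As a rough illustration of intra-group and inter-group pair generation, the Java sketch below enumerates record pairs and keeps either those within one group or those spanning two groups. The names PairGenerator, Rec and Pair are hypothetical, and the sketch assumes each record carries a single group label such as a household; how the project itself builds pairs or assigns Match, Hold and Differ labels is not shown here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; the project's real pair generation may differ. */
final class PairGenerator {

    /** A generated record tagged with the group (e.g. a household) it belongs to. */
    static final class Rec {
        final String id;
        final String group;

        Rec(String id, String group) {
            this.id = id;
            this.group = group;
        }
    }

    /** An unordered pair of records. */
    static final class Pair {
        final Rec left;
        final Rec right;

        Pair(Rec left, Rec right) {
            this.left = left;
            this.right = right;
        }

        /** True when both records share the same group label. */
        boolean isIntraGroup() {
            return left.group.equals(right.group);
        }
    }

    /** Visits every unordered pair once and keeps intra-group or inter-group pairs. */
    static List<Pair> pairs(List<Rec> records, boolean intraGroup) {
        List<Pair> result = new ArrayList<>();
        for (int i = 0; i < records.size(); i++) {
            for (int j = i + 1; j < records.size(); j++) {
                Pair p = new Pair(records.get(i), records.get(j));
                if (p.isIntraGroup() == intraGroup) {
                    result.add(p);
                }
            }
        }
        return result;
    }
}
```

With two records in one household and a third in another, pairs(records, true) yields one pair and pairs(records, false) yields two. The nested loop is quadratic in the number of records, which is acceptable for a sketch but would need sampling or blocking for large data sets.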

Additional Project Details

Operating Systems

Linux, Windows

Languages

English

Intended Audience

Information Technology, Science/Research

User Interface

Command-line

Programming Language

Java

Database Environment

MySQL

Related Categories

Java Artificial Intelligence Software, Java Synthetic Data Generation Software

Registered

2011-07-18