Entity resolution (ER) is the process of identifying the records in a dataset that represent the same real-world entity.

OYSTER (Open sYSTem Entity Resolution) is an entity resolution system that supports probabilistic direct matching, transitive linking, and asserted linking. To speed up the search for match candidates (blocking), the system builds and maintains an in-memory index that maps attribute values to identities. Because OYSTER includes an identity management system, it also supports persistent identity identifiers. OYSTER is unique among ER systems in that it is built to incorporate Entity Identity Information Management (EIIM). It supports EIIM by providing methods that keep identifiers unique across identities, maintain persistent IDs over the life of an identity, and allow users to correct false-positive and false-negative resolutions that matching rules alone cannot fix, through assertions, traceability, and other features.
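
The interplay of blocking, direct matching, and transitive linking described above can be illustrated with a small sketch. This is not OYSTER's code or API (OYSTER itself is a Java command-line tool); the class `ToyResolver`, the rule `same_person`, and the sample records are all hypothetical, and the match rule is a simple deterministic stand-in for a real probabilistic one. The sketch keeps an in-memory index from attribute values to record ids, compares a new record only against candidates that share a value, and merges matches with a union-find structure so that links carry over transitively.

```python
from collections import defaultdict


class ToyResolver:
    """Hypothetical illustration only; not OYSTER's implementation."""

    def __init__(self, match):
        self.match = match              # pairwise match rule (assumed)
        self.index = defaultdict(set)   # (attribute, value) -> record ids
        self.records = {}               # record id -> attribute dict
        self.parent = {}                # union-find parent pointers

    def find(self, rid):
        # Follow parent pointers to the identity's root id (path halving).
        while self.parent[rid] != rid:
            self.parent[rid] = self.parent[self.parent[rid]]
            rid = self.parent[rid]
        return rid

    def union(self, a, b):
        # Merge two identities; the smaller root id survives as the
        # persistent identifier of the merged identity.
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            keep, drop = sorted((root_a, root_b))
            self.parent[drop] = keep

    def add(self, rid, record):
        self.parent[rid] = rid
        self.records[rid] = record
        # Blocking: only records sharing some attribute value are candidates.
        candidates = set()
        for key in record.items():
            candidates |= self.index[key]
        for other in candidates:
            if self.match(record, self.records[other]):
                self.union(rid, other)
        # Index the new record after the lookup so it is not its own candidate.
        for key in record.items():
            self.index[key].add(rid)

    def clusters(self):
        # Group record ids by the root id of their identity.
        groups = defaultdict(list)
        for rid in self.records:
            groups[self.find(rid)].append(rid)
        return {root: sorted(members) for root, members in groups.items()}


def same_person(a, b):
    # Assumed rule: same email, or same name and same phone.
    if a["email"] == b["email"]:
        return True
    return a["name"] == b["name"] and a["phone"] == b["phone"]


resolver = ToyResolver(same_person)
resolver.add("r1", {"name": "Ann Lee", "phone": "555-0101", "email": "ann@example.org"})
resolver.add("r2", {"name": "Ann Lee", "phone": "555-0101", "email": "alee@example.org"})
resolver.add("r3", {"name": "A. Lee", "phone": "555-0199", "email": "alee@example.org"})
resolver.add("r4", {"name": "Bob Ray", "phone": "555-0142", "email": "bob@example.org"})
result = resolver.clusters()
```

In this sample, `r3` shares no attribute value with `r1` and is never compared with it directly, yet both end up in the same identity because each matches `r2`. An asserted link would correspond to calling `union` on two record ids directly, without consulting the match rule.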

License

GNU General Public License version 2.0 (GPLv2)

User Ratings

5 stars: 2
4 stars: 0
3 stars: 0
2 stars: 0
1 star: 0

Ease: 4 / 5
Features: 5 / 5
Design: 5 / 5
Support: 3 / 5

User Reviews

  • Great tool!
  • OYSTER has been critical to our enterprise-wide data warehouse at the University of Arkansas for Medical Sciences. We have also used it for detecting and resolving duplicate addresses and participant ids in the National Children's Study.

Additional Project Details

Operating Systems

Linux, Mac, Windows

Intended Audience

Information Technology, Management

User Interface

Command-line

Programming Language

Java

Database Environment

Flat-file, MySQL

Related Categories

Java Report Generators

Registered

2011-08-25