Entity resolution (ER) is the process of identifying records in a dataset that represent the same real-world entity.

OYSTER (Open sYSTem Entity Resolution) is an entity resolution system that supports probabilistic direct matching, transitive linking, and asserted linking. To help find match candidates (blocking), the system builds and maintains an in-memory index that maps attribute values to identities. Because OYSTER includes an identity management system, it also supports persistent identity identifiers. OYSTER differs from other ER systems in that it is built to incorporate Entity Identity Information Management (EIIM). It supports EIIM by providing methods that keep identifiers unique across identities, maintain persistent IDs over the life of an identity, and allow false-positive and false-negative resolutions to be corrected through assertion, traceability, and other features. Such corrections cannot be made with matching rules.
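
The blocking index and transitive linking described above can be illustrated with a minimal Python sketch. This is not OYSTER's code or API (OYSTER is written in Java): the class `IdentityIndex`, its methods, and the sample records are all hypothetical. It assumes exact matching on normalized attribute values and uses a union-find structure to model transitive links.

```python
from collections import defaultdict


class IdentityIndex:
    """Toy in-memory index of attribute values to identity IDs (hypothetical)."""

    def __init__(self):
        self._index = defaultdict(set)  # (attribute, value) -> identity IDs
        self._parent = {}               # union-find forest over identity IDs

    def _find(self, identity_id):
        # Walk up to the cluster representative, then compress the path.
        root = identity_id
        while self._parent[root] != root:
            root = self._parent[root]
        while self._parent[identity_id] != root:
            self._parent[identity_id], identity_id = root, self._parent[identity_id]
        return root

    def add(self, identity_id, attributes):
        # Register the identity and index each normalized attribute value.
        self._parent.setdefault(identity_id, identity_id)
        for key, value in attributes.items():
            self._index[(key, value.strip().lower())].add(identity_id)

    def candidates(self, attributes):
        # Blocking: return only identities that share at least one value.
        found = set()
        for key, value in attributes.items():
            found |= self._index.get((key, value.strip().lower()), set())
        return found

    def link(self, a, b):
        # Merge two clusters; the smaller ID stays as the representative.
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[max(root_a, root_b)] = min(root_a, root_b)

    def cluster_of(self, identity_id):
        # All identities transitively linked to the given one.
        root = self._find(identity_id)
        return {i for i in self._parent if self._find(i) == root}


index = IdentityIndex()
index.add("R1", {"name": "Ann Lee", "phone": "555-0100"})
index.add("R2", {"name": "A. Lee", "phone": "555-0100"})
index.add("R3", {"name": "A. Lee", "email": "ann@example.org"})

index.link("R1", "R2")  # R1 and R2 share a phone number
index.link("R2", "R3")  # R2 and R3 share a name
```

In this sketch, `R1` and `R3` share no attribute value, yet they end up in the same cluster because each is linked to `R2`. A manual call to `link` could also stand in for an asserted link that no matching rule would produce.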

License

GNU General Public License version 2.0 (GPLv2)

User Ratings

5 stars: 2
4 stars: 0
3 stars: 0
2 stars: 0
1 star: 0

ease: 4 / 5
features: 5 / 5
design: 5 / 5
support: 3 / 5

User Reviews

  • Great tool!
  • OYSTER has been critical to our enterprise-wide data warehouse at the University of Arkansas for Medical Sciences. We have also used it for detecting and resolving duplicate addresses and participant IDs in the National Children's Study.

Additional Project Details

Operating Systems

Linux, Mac, Windows

Intended Audience

Information Technology, Management

User Interface

Command-line

Programming Language

Java

Database Environment

Flat-file, MySQL

Related Categories

Java Report Generators

Registered

2011-08-25