Under the covers

  • Does this use _Fellegi and Sunter_ under the covers or some other approach?

  • Rick Hall

    ChoiceMaker is a probabilistic record matching application, so it owes a great deal to the theory of Fellegi and Sunter. It calculates comparison vectors of what Fellegi and Sunter call characteristics of records, and assigns weights according to whether the components of these vectors predict matches, differs or holds (what Fellegi and Sunter call links, non-links or possible links). However, ChoiceMaker uses maximum entropy to calculate comparison vectors and weights, which differs from the methods described in Fellegi and Sunter.
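To make the vocabulary above concrete, here is a minimal sketch of the classical Fellegi and Sunter scheme, not of ChoiceMaker itself: it uses textbook log-likelihood-ratio weights rather than maximum entropy. The field names, the m/u probabilities, the thresholds and the function names are all assumptions made up for illustration.

```python
import math

# Hypothetical per-field probabilities (illustrative values only):
#   m = P(field agrees | the two records are a true match)
#   u = P(field agrees | the two records are not a match)
FIELD_PARAMS = {
    "last_name": (0.95, 0.01),
    "first_name": (0.90, 0.05),
    "birth_year": (0.85, 0.02),
}


def comparison_vector(a, b):
    """Return one agree/disagree flag per field for a pair of records."""
    return {
        field: a.get(field) is not None and a.get(field) == b.get(field)
        for field in FIELD_PARAMS
    }


def match_weight(vector):
    """Sum the log2 likelihood ratios of each component of the vector."""
    total = 0.0
    for field, agrees in vector.items():
        m, u = FIELD_PARAMS[field]
        if agrees:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def decide(weight, lower=0.0, upper=8.0):
    """Map a weight to match / differ / hold using two assumed thresholds."""
    if weight >= upper:
        return "match"   # a "link" in Fellegi and Sunter's terms
    if weight <= lower:
        return "differ"  # a "non-link"
    return "hold"        # a "possible link", left for human review


r1 = {"last_name": "Hall", "first_name": "Rick", "birth_year": 1960}
r2 = {"last_name": "Hall", "first_name": "Rick", "birth_year": 1960}
r3 = {"last_name": "Hall", "first_name": "Richard", "birth_year": 1961}
r4 = {"last_name": "Smith", "first_name": "Jane", "birth_year": 1975}

for other in (r2, r3, r4):
    print(decide(match_weight(comparison_vector(r1, other))))
```

With these made-up numbers, the three pairs come out as a match, a hold and a differ respectively. A maximum entropy model would replace the fixed m/u ratios with weights learned from labelled training pairs, but the overall shape (compare, weigh, threshold) stays the same.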

    More details about what is under the covers of the ChoiceMaker software are available in the patents that cover it:

    1) US patent 6523019, "Probabilistic record linkage model derived from training data"

    2) US patent 7152060, "Automated database blocking and record matching"

    3) US patent 7899796, "Batch automated blocking and record matching"

    These patents are now held by Open Invention Network (http://www.openinventionnetwork.com), so they're available royalty-free to any company, institution or individual that agrees not to assert its patents against the Linux System.