Using Febrl instead of Automatch

Help
nightfly
2005-07-01
2013-04-17
  • nightfly
    2005-07-01

    Hi,

    Setting up a new cancer registry, I got permission to use another registry's database (same data structure). Only the linkage software is missing (they use Automatch, which is no longer available).

    In my search for alternatives I tested Febrl and found it rather easy to configure, even with very little Python knowledge. Using dummy configurations for blocking and deduplication/matching, it took me about two days to get the dedupe and linkage scripts running (the data is already standardized; names and day of birth are encrypted).

    The second step will be the adjustment of the blocking and matching criteria:

    From the other registry I have a complete set of Automatch config files: 8 undup and 8 match files, each with 8 passes (only the first pass differs between the files). They use some arrays for matching. Example:

    Undup-File1,  Pass1
    [...]
    BLOCK1 CHAR Lastname
    BLOCK1 CHAR Firstname
    BLOCK1 CHAR DOB
    BLOCK1 CHAR MOB
    BLOCK1 CHAR JOB
    MATCH1 ARRAY CHAR Lastname    0.95 0.01
    MATCH1 ARRAY CHAR Firstname 0.95 0.01
    MATCH1 ARRAY CHAR Birthname 0.95 0.01
    ;MATCH1 ARRAY CHAR Prevname 0.95 0.01
    MATCH1 CHAR Sex     0.98 0.5
    MATCH1 CHAR DOB    0.98 0.03
    MATCH1 CHAR MOB    0.98 0.08
    MATCH1 CHAR JOB    0.98 0.02
    MATCH1 CHAR Region   0.95 0.01
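
    To make the question concrete, here is a minimal Python sketch of how the MATCH lines above could be read. It assumes that the two numbers on each line are the m- and u-probabilities of a Fellegi-Sunter style model, and that a leading ";" marks a commented-out line. The helper names (parse_match_line, fs_weights) are hypothetical and belong to neither Automatch nor Febrl; the log2 weights follow the standard Fellegi-Sunter formulas, and whether either program computes them in exactly this way is also an assumption.

```python
import math


def parse_match_line(line):
    """Parse one Automatch-style MATCH line (assumed layout:
    MATCHn [ARRAY] TYPE Field m u). Returns a dict, or None for
    comments, BLOCK lines and anything else that is not a MATCH line."""
    text = line.strip()
    if not text or text.startswith(";"):  # assumption: ';' comments a line out
        return None
    tokens = text.split()
    if len(tokens) < 5 or not tokens[0].startswith("MATCH"):
        return None
    return {
        "pass": tokens[0],
        "array": tokens[1] == "ARRAY",
        "field": tokens[-3],
        "m": float(tokens[-2]),  # assumed: P(agree | records match)
        "u": float(tokens[-1]),  # assumed: P(agree | records do not match)
    }


def fs_weights(m, u):
    """Standard Fellegi-Sunter agreement and disagreement weights (log base 2)."""
    agree = math.log2(m / u)
    disagree = math.log2((1.0 - m) / (1.0 - u))
    return agree, disagree


if __name__ == "__main__":
    sample = [
        "BLOCK1 CHAR Lastname",
        "MATCH1 ARRAY CHAR Lastname    0.95 0.01",
        ";MATCH1 ARRAY CHAR Prevname 0.95 0.01",
        "MATCH1 CHAR Sex     0.98 0.5",
    ]
    for raw in sample:
        rule = parse_match_line(raw)
        if rule is None:
            continue
        agree, disagree = fs_weights(rule["m"], rule["u"])
        print(f'{rule["field"]:<10} array={rule["array"]!s:<5} '
              f'agree={agree:+.2f} disagree={disagree:+.2f}')
```

    Under these assumptions, each parsed rule yields a field name plus an (m, u) pair, which is the information a field comparison in any other linkage tool would need. How ARRAY fields are compared is not modelled here.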

    Is it possible to specify the same matching criteria in Febrl?

    If so, is there any conversion aid available to transform Automatch scripts into Febrl configurations?

    Best wishes

    Stefan Gawrich
    Cancer Registry of Hesse
    Germany