I am setting up a new cancer registry and have permission to use another registry's database (same data structure). Only the linkage software is missing (they use Automatch, which is not available any more).
In my search for alternatives I tested Febrl and found it rather easy to configure, even with very little Python knowledge. Using dummy configurations for blocking and deduplication/matching, it took me about two days to get the deduplication and linkage scripts running (the data is already standardized; names and day of birth are encrypted).
The second step will be the adjustment of blocking and matching criteria:
From the other registry I have a complete set of Automatch config files (8 undup and 8 match files with 8 passes each; only the first pass differs between the files). They use some arrays for matching. Example:
BLOCK1 CHAR Lastname
BLOCK1 CHAR Firstname
BLOCK1 CHAR DOB
BLOCK1 CHAR MOB
BLOCK1 CHAR JOB
MATCH1 ARRAY CHAR Lastname 0.95 0.01
MATCH1 ARRAY CHAR Firstname 0.95 0.01
MATCH1 ARRAY CHAR Birthname 0.95 0.01
;MATCH1 ARRAY CHAR Prevname 0.95 0.01
MATCH1 CHAR Sex 0.98 0.5
MATCH1 CHAR DOB 0.98 0.03
MATCH1 CHAR MOB 0.98 0.08
MATCH1 CHAR JOB 0.98 0.02
MATCH1 CHAR Region 0.95 0.01
Is it possible to specify the same matching criteria in Febrl?
If yes, is there any conversion help available to transform Automatch scripts into Febrl scripts?
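
To make the conversion question concrete, here is a minimal Python sketch of what a converter would first have to extract from the pass-1 lines quoted above. It rests on several assumptions: that the two trailing numbers on each MATCH line are the m- and u-probabilities, that a leading ";" marks a commented-out line, and that agreement and disagreement weights follow the usual Fellegi-Sunter log2 ratios. The function name parse_automatch and the output layout are hypothetical, and nothing here calls the Febrl API; mapping the result onto Febrl's own comparison and blocking settings would still have to be done by hand.

```python
import math

def parse_automatch(lines):
    """Collect blocking fields and match comparisons per pass number.

    Assumption: MATCH lines end in '<field> <m-prob> <u-prob>' and
    BLOCK lines end in '<field>'; lines starting with ';' are comments.
    """
    passes = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # skip blank and commented-out lines
        tokens = line.split()
        keyword = tokens[0]
        if keyword.startswith("BLOCK"):
            kind = "block"
        elif keyword.startswith("MATCH"):
            kind = "match"
        else:
            continue  # ignore any other statement type
        pass_no = int(keyword[5:])
        entry = passes.setdefault(pass_no, {"block": [], "match": []})
        if kind == "block":
            entry["block"].append(tokens[-1])
        else:
            m, u = float(tokens[-2]), float(tokens[-1])
            entry["match"].append({
                "field": tokens[-3],
                "array": tokens[1] == "ARRAY",
                "m": m,
                "u": u,
                # Fellegi-Sunter weights (assumed, see lead-in)
                "agree_weight": math.log2(m / u),
                "disagree_weight": math.log2((1 - m) / (1 - u)),
            })
    return passes

SAMPLE = """\
BLOCK1 CHAR Lastname
BLOCK1 CHAR Firstname
BLOCK1 CHAR DOB
BLOCK1 CHAR MOB
BLOCK1 CHAR JOB
MATCH1 ARRAY CHAR Lastname 0.95 0.01
MATCH1 ARRAY CHAR Firstname 0.95 0.01
MATCH1 ARRAY CHAR Birthname 0.95 0.01
;MATCH1 ARRAY CHAR Prevname 0.95 0.01
MATCH1 CHAR Sex 0.98 0.5
MATCH1 CHAR DOB 0.98 0.03
MATCH1 CHAR MOB 0.98 0.08
MATCH1 CHAR JOB 0.98 0.02
MATCH1 CHAR Region 0.95 0.01
""".splitlines()

if __name__ == "__main__":
    result = parse_automatch(SAMPLE)
    for pass_no, entry in sorted(result.items()):
        print("pass", pass_no, "blocking on", entry["block"])
        for cmp_ in entry["match"]:
            print("  %-10s agree=%.2f disagree=%.2f" % (
                cmp_["field"], cmp_["agree_weight"], cmp_["disagree_weight"]))
```

The per-field m, u and weight values produced this way are the numbers that would need an equivalent in a Febrl comparison definition; whether the ARRAY comparisons have a direct counterpart there is exactly the open question.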
Cancer Registry of Hesse