Matching Medical Data

makboul
2007-04-19
2013-04-25
  • makboul
    2007-04-19

    Is there any information on which method is best for matching healthcare provider data on these fields:

    1) Doctor first and last name
    2) Practice address
    3) Hospital name

     
    • ReverendSam
      2007-04-20

      1) This depends on how clean the data is. Is it just transliteration and occasional spelling errors, or are first and last names sometimes switched? Are there middle names and titles in some versions, or is it always first_name space last_name? Are the names international (I expect so)?

      Assuming international names with only transliteration and spelling errors, I would recommend SmithWatermanGotoh.

      2) MongeElkan perhaps, but again it is hard to say without data examples.

      3) JaroWinkler or SmithWatermanGotoh seem best suited to the minor deviations I would expect: spelling, added hyphens, capitalization, etc.
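
      To make the suggestions above more concrete, here is a minimal Python sketch of Jaro-Winkler and a Monge-Elkan style token comparison. It is not the library's own code; the function names, the normalize helper and the default prefix scale of 0.1 are assumptions chosen for illustration, and real provider data would likely need more cleaning than this.

```python
def normalize(text: str) -> str:
    """Hypothetical cleanup: lowercase, treat hyphens as spaces, collapse whitespace."""
    return " ".join(text.lower().replace("-", " ").split())


def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order // 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted for a shared prefix of up to max_prefix characters."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:max_prefix], s2[:max_prefix]):
        if a != b:
            break
        prefix += 1
    return j + prefix * prefix_scale * (1 - j)


def monge_elkan(a: str, b: str) -> float:
    """Average, over tokens of a, of the best Jaro-Winkler match among tokens of b."""
    tokens_a, tokens_b = a.split(), b.split()
    if not tokens_a or not tokens_b:
        return 0.0
    best = [max(jaro_winkler(ta, tb) for tb in tokens_b) for ta in tokens_a]
    return sum(best) / len(best)


if __name__ == "__main__":
    print(round(jaro_winkler("MARTHA", "MARHTA"), 4))
    print(jaro_winkler(normalize("Jean-Paul Smith"), normalize("jean paul smith")))
    print(monge_elkan(normalize("12 Main Street"), normalize("Main Street 12")))
```

      The token-based comparison does not care about word order, which is why it may suit addresses, while the character-based score is a plausible fit for single name fields. Note that this Monge-Elkan variant is asymmetric: swapping the two arguments can change the score.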

      You may also be interested in the following paper:

      Jaro, M. A. 1995 "Probabilistic linkage of large public health data files" Statistics in Medicine 14:491-498