Drug name recognition and normalisation/grounding to DrugBank ids and standard names.
The package provides two taggers:
1. DrugTagger - CRF-based with DrugBank presence feature (see feature set for details).
2. DrugnameGazetteer - gazetteer/dictionary-based. The dictionary is created from the DrugBank.ca database.
Both taggers include grounding/normalisation to DrugBank ids and standard names.
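The dictionary-based approach with grounding can be illustrated with a minimal sketch. This is not the DrugnameGazetteer code: the function name tag_drugs, the toy dictionary, and the greedy longest-match strategy are all assumptions made for illustration, and the ids in the toy entries should be checked against DrugBank before any real use.

```python
# Toy dictionary: lower-cased surface form -> (DrugBank id, standard name).
# Illustrative entries only; the real dictionary is built from DrugBank.
TOY_DICTIONARY = {
    "aspirin": ("DB00945", "Acetylsalicylic acid"),
    "acetylsalicylic acid": ("DB00945", "Acetylsalicylic acid"),
    "ibuprofen": ("DB01050", "Ibuprofen"),
}

# Longest dictionary entry, measured in whitespace-separated tokens.
MAX_LEN = max(len(key.split()) for key in TOY_DICTIONARY)


def tag_drugs(tokens):
    """Greedy longest-match lookup (an assumed strategy).

    Returns a list of (start, end, drugbank_id, standard_name) tuples,
    where tokens[start:end] is the matched mention.
    """
    spans = []
    i = 0
    while i < len(tokens):
        match = None
        # Try the longest candidate first, then shrink the window.
        for n in range(min(MAX_LEN, len(tokens) - i), 0, -1):
            key = " ".join(tokens[i:i + n]).lower()
            if key in TOY_DICTIONARY:
                match = (i, i + n) + TOY_DICTIONARY[key]
                break
        if match:
            spans.append(match)
            i = match[1]  # continue after the matched mention
        else:
            i += 1
    return spans
```

Mapping several surface forms to the same id and standard name is what grounding amounts to here: "aspirin" and "acetylsalicylic acid" both resolve to one entry.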
Feature set:
Word, Word-1, Word+1, Word-1_Word, Word_Word+1, DrugBankPresence, POS
The DrugBankPresence feature indicates whether the drug name is present in DrugBank.
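As a rough sketch, the feature set listed above could be computed per token as follows. The function name token_features, the "<BOS>"/"<EOS>" padding symbols, and the lower-cased single-token DrugBank lookup are assumptions; the package's actual feature extraction may differ.

```python
def token_features(tokens, pos_tags, i, drugbank_names):
    """Build the feature dictionary for the token at position i.

    drugbank_names is assumed to be a set of lower-cased drug names.
    """
    word = tokens[i]
    # Pad sentence boundaries with placeholder symbols (an assumption).
    prev_word = tokens[i - 1] if i > 0 else "<BOS>"
    next_word = tokens[i + 1] if i < len(tokens) - 1 else "<EOS>"
    return {
        "Word": word,
        "Word-1": prev_word,
        "Word+1": next_word,
        "Word-1_Word": prev_word + "_" + word,
        "Word_Word+1": word + "_" + next_word,
        "DrugBankPresence": word.lower() in drugbank_names,
        "POS": pos_tags[i],
    }


tokens = ["Take", "aspirin", "daily"]
pos_tags = ["VB", "NN", "RB"]
features = [
    token_features(tokens, pos_tags, i, {"aspirin"})
    for i in range(len(tokens))
]
```

Each dictionary corresponds to one token, which is the per-token input shape that CRF toolkits commonly expect.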
Using CoNLL-Evaluation:
processed 32065 tokens with 3656 phrases; found: 3251 phrases; correct: 2786.
accuracy: 95.25%; precision: 85.70%; recall: 76.20%; FB1: 80.67
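The precision, recall and FB1 figures follow directly from the phrase counts reported above. The short check below recomputes them; the helper name prf is hypothetical and is not part of the package or of the CoNLL evaluation tooling.

```python
def prf(gold, found, correct):
    """Phrase-level precision, recall and F1 from raw counts."""
    precision = correct / found if found else 0.0
    recall = correct / gold if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts taken from the evaluation output: 3656 gold phrases,
# 3251 found phrases, 2786 of them correct.
p, r, f = prf(gold=3656, found=3251, correct=2786)
print(f"precision: {p:.2%}; recall: {r:.2%}; FB1: {100 * f:.2f}")
# prints: precision: 85.70%; recall: 76.20%; FB1: 80.67
```

The token-level accuracy (95.25%) cannot be recomputed this way, since it depends on per-token counts that are not listed here.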
Using GATE Corpus Benchmark:
Strict: P: 0.65 R: 0.73 F1: 0.69
Lenient: P: 0.74 R: 0.84 F1: 0.78
For details on how to reproduce the evaluation, see the README.
To use the standalone version for tagging, download DrugExtractionStandalone.tar.gz from Files.