MetaMap UIMA Wrapper
Overview
The MetaMap UIMA Wrapper makes the UMLS MetaMap tool (http://mmtx.nlm.nih.gov/) available as a UIMA (http://incubator.apache.org/uima/) analysis engine.
Requirements for org.metamap.uima:
- Woodstox XML parser (Version > 4.0), to parse the XML output of MetaMap.
- The UIMA runtime
- MetaMap 2009 (you have to specify its location as a parameter in the analysis engine descriptor)
Requirements for org.metamap.uima.test:
- JUnit 4
- The UIMA runtime
Features:
- Compatible with MetaMap 2009 and MetaMap 2009 version 2.
- Execute a locally installed MetaMap and parse the output into the type system [BasicMetaMapAnnotator].
- Supports the MetaMap Java API (http://mmtx.nlm.nih.gov/#MetaMapJavaApi), which gives some performance improvement.
- Remote processing through the SKR API. This makes it possible to use MetaMap without a local installation (especially interesting for Windows users).
- Start and stop the Tagger, Disambiguation and Java API server automatically [BasicMetaMapAnnotator].
- A UIMA type system for the MetaMap output [BasicMetaMapTypeSystem].
- Decode the semantic types of the machine output into human-readable types [AdvMetaMapAnnotator].
- Find entities (different text representations of the same concept) and group them together [AdvMetaMapAnnotator].
Todo:
- Incorporate the acronyms list of the MetaMap output.
- Support MMtx (low priority).
- Write more test cases.
Documentation
Prerequisites
In the following sections I presume that you are using Eclipse [1] with the latest UIMA plugins installed [2].
You also need a UMLS license number [3] and an account for the UMLS Knowledge Source Server (UMLSKS) [4].
Now you have two options:
- a) Install MetaMap on your local machine (Linux and Solaris only). To download MetaMap see [5] (UMLSKS account needed!). For installation see [6]. You may also use the MetaMap Java API for better performance (see MetaMapJavaApi and Readme).
- b) Access the MetaMap web service online (works on all Java-enabled systems, but is slower), see SKR.
Architecture
The MetaMap UIMA Wrapper consists of two UIMA analysis engines (desc/BasicMetaMapAE.xml and desc/AdvMetaMapAE.xml).
The BasicMetaMapAE parses the MetaMap XML output and stores it in a custom UIMA type system (desc/BasicMetaMapTypeSystem.xml). The MetaMap XML output is mapped almost one-to-one onto the UIMA type system (see [7] and desc/BasicMetaMapTypeSystem.xml).
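To illustrate the kind of streaming parse described above, here is a minimal sketch that reads candidate data with the StAX API (javax.xml.stream), the interface that Woodstox implements. It is not the wrapper's actual code: the class CandidateXmlSketch, the Candidate holder and the hand-written sample are hypothetical, and the element names are assumptions that should be checked against the MetaMap XML documentation [7].

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Illustrative sketch only; this class is not part of the wrapper. */
class CandidateXmlSketch {

    /** Hand-written sample; the element names are assumptions, see [7]. */
    static final String SAMPLE =
          "<Candidates>"
        + "<Candidate>"
        + "<CandidateScore>-1000</CandidateScore>"
        + "<CandidateCUI>C0018681</CandidateCUI>"
        + "<CandidateMatched>Headache</CandidateMatched>"
        + "<SemTypes><SemType>sosy</SemType></SemTypes>"
        + "</Candidate>"
        + "</Candidates>";

    /** Hypothetical holder for the values read from one Candidate element. */
    static final class Candidate {
        int score;
        String cui;
        String matched;
        final List<String> semTypes = new ArrayList<>();
    }

    /** Streams through the XML once and returns one Candidate per Candidate element. */
    static List<Candidate> parse(String xml) {
        List<Candidate> result = new ArrayList<>();
        try {
            // newInstance() picks up whichever StAX implementation is available.
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            Candidate current = null;
            while (reader.hasNext()) {
                if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                    continue; // only start tags matter for this sketch
                }
                String name = reader.getLocalName();
                if (name.equals("Candidate")) {
                    current = new Candidate();
                    result.add(current);
                } else if (current != null) {
                    // getElementText() consumes the text up to the matching end tag.
                    switch (name) {
                        case "CandidateScore":
                            current.score = Integer.parseInt(reader.getElementText().trim());
                            break;
                        case "CandidateCUI":
                            current.cui = reader.getElementText().trim();
                            break;
                        case "CandidateMatched":
                            current.matched = reader.getElementText().trim();
                            break;
                        case "SemType":
                            current.semTypes.add(reader.getElementText().trim());
                            break;
                        default:
                            break; // container elements such as SemTypes are skipped
                    }
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse MetaMap XML", e);
        }
        return result;
    }

    public static void main(String[] args) {
        for (Candidate c : parse(SAMPLE)) {
            System.out.println(c.cui + " " + c.matched + " " + c.score + " " + c.semTypes);
        }
    }
}
```

In the wrapper itself the parsed values end up in UIMA annotations rather than in a plain Java object; the holder class is used here only to keep the sketch free of dependencies.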
The AdvMetaMapAE adds processing that is not directly part of the XML output of MetaMap, such as replacing the UMLS Semantic Type [8] codes with their written-out identifiers, or grouping identical candidates in a text into one entity. For this it uses its own type system (desc/AdvMetaMapTypeSystem.xml), which extends the BasicMetaMapTypeSystem. If you don't need the additional features of AdvMetaMapAE, you can use BasicMetaMapAE on its own. To take advantage of AdvMetaMapAE, however, a BasicMetaMapAE must run upstream of it. For that purpose there is MetaMapAE (desc/MetaMapAE.xml), which simply combines BasicMetaMapAE and AdvMetaMapAE into one analysis engine (see [9]).
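The two post-processing steps can be pictured with the following minimal sketch. It is not the AdvMetaMapAE implementation: the class EntityGroupingSketch and the Mention type are hypothetical, the lookup table holds only three of the more than one hundred UMLS semantic types, and grouping by concept identifier (CUI) is an assumption about how "the same concept" is decided.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; this class is not part of the wrapper. */
class EntityGroupingSketch {

    /** Small excerpt of the UMLS semantic type abbreviations and their names. */
    static final Map<String, String> SEMANTIC_TYPES = new LinkedHashMap<>();
    static {
        SEMANTIC_TYPES.put("dsyn", "Disease or Syndrome");
        SEMANTIC_TYPES.put("sosy", "Sign or Symptom");
        SEMANTIC_TYPES.put("phsu", "Pharmacologic Substance");
    }

    /** Returns the written-out name, or the abbreviation itself if it is unknown. */
    static String decode(String abbreviation) {
        return SEMANTIC_TYPES.getOrDefault(abbreviation, abbreviation);
    }

    /** Hypothetical mention: a concept identifier plus the text that matched it. */
    static final class Mention {
        final String cui;
        final String text;

        Mention(String cui, String text) {
            this.cui = cui;
            this.text = text;
        }
    }

    /** Groups mentions that share a CUI into one entity, keeping first-seen order. */
    static Map<String, List<Mention>> groupByConcept(List<Mention> mentions) {
        Map<String, List<Mention>> entities = new LinkedHashMap<>();
        for (Mention mention : mentions) {
            entities.computeIfAbsent(mention.cui, key -> new ArrayList<>()).add(mention);
        }
        return entities;
    }

    public static void main(String[] args) {
        List<Mention> mentions = new ArrayList<>();
        mentions.add(new Mention("C0027051", "heart attack"));
        mentions.add(new Mention("C0004057", "aspirin"));
        mentions.add(new Mention("C0027051", "myocardial infarction"));

        for (Map.Entry<String, List<Mention>> entity : groupByConcept(mentions).entrySet()) {
            System.out.println(entity.getKey() + ": " + entity.getValue().size() + " mention(s)");
        }
        System.out.println(decode("dsyn"));
    }
}
```

Here "heart attack" and "myocardial infarction" carry the same concept identifier and therefore form a single entity with two mentions.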
Parameters
There are several parameters for the BasicMetaMapAE and the AdvMetaMapAE. All parameters are documented. Double-click a parameter in the "Parameters" tab of the UIMA Component Descriptor Editor in Eclipse to view its description (or read the XML source file).
The most important parameter is the "mode" parameter. It tells the wrapper how to use MetaMap. The following parameter values are supported:
- "xml" mode to execute the local MetaMap program and parse its output into the UIMA type system (local installation of MetaMap needed).
- "api" mode to use the local MetaMap Java API server and parse its output into the UIMA type system (needs the additionally installed MetaMap API server).
- "skr" mode to use the remote MetaMap service of the NLM (needs a UMLS account, but NO locally installed MetaMap program).
The other parameters depend on what mode is used and are pretty self-explanatory.
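As a rough illustration of how the three "mode" values differ in what they require, the following sketch models them as an enum. The enum MetaMapModeSketch, its field names and the case-insensitive matching are choices made for this sketch, not the wrapper's actual code; only the three string values and their requirements come from the list above.

```java
import java.util.Locale;

/** Illustrative sketch only; this enum is not part of the wrapper. */
enum MetaMapModeSketch {
    XML(true, false, false),  // runs the local MetaMap program
    API(true, true, false),   // also needs the MetaMap Java API server
    SKR(false, false, true);  // remote NLM service, no local MetaMap

    final boolean needsLocalMetaMap;
    final boolean needsApiServer;
    final boolean usesRemoteService;

    MetaMapModeSketch(boolean needsLocalMetaMap, boolean needsApiServer, boolean usesRemoteService) {
        this.needsLocalMetaMap = needsLocalMetaMap;
        this.needsApiServer = needsApiServer;
        this.usesRemoteService = usesRemoteService;
    }

    /** Maps the string value of the "mode" parameter to a constant. */
    static MetaMapModeSketch fromParameter(String value) {
        if (value == null) {
            throw new IllegalArgumentException("The mode parameter is not set");
        }
        // Case-insensitive matching is an assumption of this sketch.
        switch (value.trim().toLowerCase(Locale.ROOT)) {
            case "xml":
                return XML;
            case "api":
                return API;
            case "skr":
                return SKR;
            default:
                throw new IllegalArgumentException("Unsupported mode: " + value);
        }
    }
}
```

Rejecting unknown values early, as fromParameter does, is one way to surface a mistyped mode before any MetaMap process or remote call is started.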
Example Run
Coming soon ...
