From: Christy D. <cd...@mi...> - 2008-12-15 15:08:53
This document reflects a series of discussions between the MITRE team and users of SpatialML. We are very eager to have your feedback, POSITIVE or NEGATIVE, on how these changes would affect the way SpatialML handles your data or task. In general, MITRE would also like to relate the SpatialML spec more clearly to the Callisto task, since users will likely be referring to the spec while annotating in Callisto.

The MITRE Team

==================
Proposal for Modifications to SpatialML

1. PATH becomes RLINK: We propose renaming the PATH tag to RLINK, which stands for Relative Location Link. The following example shows this change:

    "the town east of Boston"
    <PLACE type="PPL" id="1" form="NOM" ctv="TOWN">town</PLACE>
    <SIGNAL id="2">east</SIGNAL>
    <PLACE id="3" type="PPLA" state="US-MA" country="US" form="NAM">Boston</PLACE>
    <RLINK id="4" source="3" destination="1" direction="E" signals="2"/>

2. Signals for Relative Locations and Topological Links: The current guidelines for SpatialML use the SIGNAL tag to capture distances (5 miles) and some directions (east). These signals are then included in the PATH tag. However, other spatial signals, such as "in" in "restaurant in Boston", are not tagged as signals. We believe they should be, so that the source of a topological link can be traced back.

    "restaurant in Boston"
    <PLACE id="1" type="FAC" form="NOM">restaurant</PLACE>
    <SIGNAL id="2">in</SIGNAL>
    <PLACE id="3">Boston</PLACE>
    <LINK source="1" target="3" signals="2" linkType="IN"/>

3. Distances tagged as a special kind of SIGNAL: Additionally, distances are not really signals of the same type. At the May 13 workshop at Brandeis University, it was proposed that distances be treated more like durations from TimeML rather than just an additional attribute for the PATH tag. Our proposed solution is to add types to SIGNALs, minimally "distance", "direction" and "extent". As such, distances are like a special case of PLACE.
The id for a distance could then be used in the distance attribute of the new RLINK.

    "the island 5 miles east of Boston"
    <PLACE id="0" type="RGN">island</PLACE>
    <SIGNAL id="1" type="DISTANCE">5 miles</SIGNAL>
    <SIGNAL id="2" type="DIRECTION">east</SIGNAL>
    <PLACE id="3" state="US-MA" country="US" form="NAM">Boston</PLACE>
    <RLINK id="4" source="3" destination="0" distance="1" direction="2" signals="1 2"/>

4. Event-headed relative locations: In "We camped 5 miles east of Boston", we want to be able to say that the camping took place at a location 5 miles east of Boston. We propose extending SpatialML to allow non-consuming PLACE tags.

    "5 miles east of Boston"
    <PLACE id="0"></PLACE>
    <SIGNAL id="1" type="DISTANCE">5 miles</SIGNAL>
    <SIGNAL id="2" type="DIRECTION">east</SIGNAL>
    <PLACE id="3" state="US-MA" country="US" form="NAM">Boston</PLACE>
    <RLINK id="4" source="3" destination="0" distance="1" direction="2" signals="1 2"/>

[Note: in the Callisto task, consuming and non-consuming PLACE tags would have different kinds of IDs, so that they can easily be differentiated by the annotator.]

5. EC renamed to "External Connection": EC in SpatialML stands for "extended connection" rather than the traditional "external connection" from RCC8. We propose restoring the traditional nomenclature. [This has already been done.]

6. "City of Boston" Problem: The current SpatialML specification says that only "Boston" will be annotated as a PLACE and "city" will not be tagged, as it is only a property of Boston. It has been suggested that both "city" and "Boston" should be tagged as places and linked with an EQ tag. We agree that this is a bit of an arbitrary distinction, but the relation is more of a predicative one than EQ. Currently, we would annotate these examples as follows (simplified mark-up).
We are open to suggestions as to how to modify them, whether with EQ or otherwise:

    city of <PLACE>Boston</PLACE> (because only "Boston" is capitalized)
    <PLACE>City of Boston</PLACE> (because the whole thing is a PN)
    <PLACE>New York</PLACE> city
    <PLACE>New York City</PLACE> (same reasoning)
    river <PLACE>Thames</PLACE>
    <PLACE>River Thames</PLACE> (same reasoning)

7. "Near" no longer a link type: "Near" did not fit well with the other LINK types, which map clearly to the RCC8 calculus, so we propose removing it from the set of LINK types and retaining it only as a DISTANCE value. We would treat other "vague distances" similarly (e.g. "far", "local").

    Current: the [river] near [Boston]
    <PLACE id="1" type="WATER" form="NOM">river</PLACE>
    <PLACE id="2" state="US-MA" country="US" form="NAM">Boston</PLACE>
    <LINK id="3" source="2" target="1" linkType="NEAR"/>

    Proposed:
    <PLACE id="1" type="WATER" form="NOM">river</PLACE>
    <SIGNAL id="2" type="DISTANCE">near</SIGNAL>
    <PLACE id="3" state="US-MA" country="US" form="NAM">Boston</PLACE>
    <RLINK source="1" destination="3" signals="2"/>

8. Roads, Rivers, etc.: There was some question as to whether the current SpatialML tags things like roads or rivers as places. It does. This confusion may have arisen because facilities were not tagged in the SpatialML Corpus (LDC2008T03), and people may have been using that as a reference corpus. We are in the process of updating that corpus with facilities.

9. Encoding the SpatialML version: In order to make it easier to keep track of which version of SpatialML has been used for various documents, we have added version information to the markup. There is already a <SpatialML> root tag, although it hasn't been used routinely. We have added a "version" attribute to it to capture the SpatialML version.
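
As a rough illustration of the version attribute, a document annotated under the proposals above might be wrapped as shown below. This is only a sketch: the version value "X.Y" is a placeholder rather than an actual SpatialML release number, and the body simply reuses the "island 5 miles east of Boston" example.

```xml
<!-- Hypothetical sketch: "X.Y" is a placeholder, not a real SpatialML version number. -->
<SpatialML version="X.Y">
  the <PLACE id="0" type="RGN">island</PLACE>
  <SIGNAL id="1" type="DISTANCE">5 miles</SIGNAL>
  <SIGNAL id="2" type="DIRECTION">east</SIGNAL> of
  <PLACE id="3" state="US-MA" country="US" form="NAM">Boston</PLACE>
  <!-- RLINK refers to the tags above by id: source is the reference place, destination the located place. -->
  <RLINK id="4" source="3" destination="0" distance="1" direction="2" signals="1 2"/>
</SpatialML>
```

With the version on the root tag, a tool could check which revision of the guidelines a file follows before deciding whether to expect PATH or RLINK tags.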