From: Inderjeet M. <ind...@gm...> - 2011-04-06 00:11:17
|
Dear All, Just to let you know that MITRE has made MIPLACE v1.0b available under a GNU Lesser General Public License (LGPL). The MIPLACE SpatialML tagger applies PLACE tags (both named and nominal PLACEs) to documents, disambiguating the named PLACEs where possible. You may download it at: http://sourceforge.net/projects/spatialml/files/MIPLACE-release-v1.0b.tar.gz/download Regards, Inderjeet. -- Inderjeet Mani http://tinyurl.com/inderjeetmani/ |
From: Mani, I. <im...@mi...> - 2009-11-09 08:04:31
|
Hello everyone, The latest SpatialML Guidelines -- version 3.1 - have now been posted to sourceforge. You may pick them up at: http://sourceforge.net/projects/spatialml/files/SpatialML-Guidelines/3.1/SpatialML-3.1.pdf/download The changes from version 3.0 have been motivated in part by an inter-annotator study. They include modifications to the Extent Rules (Section 5), filling out Toponym information (Section 6), the MOD rules (Section 8), and miscellaneous other minor clarifications. The SpatialML DTD, from version 3.0, remains unchanged. Cheers, Inderjeet. |
From: Alexander Y. <as...@mi...> - 2009-05-12 02:26:43
|
Attached are the current guidelines for 'Log Analysis and Geographic Query Identification (LAGI)', which is part of the LogCLEF 2009 evaluation. LAGI deals with finding geographical entities in search queries. The web site for LogCLEF 2009 is http://www.uni-hildesheim.de/logclef/ Thank you -Alex Yeh |
From: Mani, I. <im...@mi...> - 2009-04-03 03:49:06
|
Hello everyone, The latest SpatialML Guidelines -- version 3.0 - have now been posted to sourceforge. You may pick them up at: http://sourceforge.net/project/platformdownload.php?group_id=186077 Cheers, Inderjeet. |
From: Alexander S. Y. <as...@mi...> - 2009-03-06 21:46:54
|
Resending now that I am on the mailing list (previous message is awaiting approval): Hello, In conjunction with a Mitre project dealing with SpatialML, there may soon be a need to annotate a small (100's to 1000's of short lines) amount of Portuguese text for locations according to a set of guidelines being developed. Please contact me if you may be interested. Thank you -Alex Yeh (as...@mi...) |
From: Alexander S. Y. <as...@mi...> - 2009-03-06 19:03:20
|
Hello, In conjunction with a Mitre project dealing with SpatialML, there may soon be a need to annotate a small (100's to 1000's of short lines) amount of Portuguese text for locations according to a set of guidelines being developed. Please contact me if you may be interested. Thank you -Alex Yeh (as...@mi...) |
From: Hitzeman, J. M. <hi...@mi...> - 2009-01-09 18:25:51
|
>-----Original Message----- >From: Mardis, Scott A. [mailto:ma...@mi...] >Sent: Friday, January 09, 2009 11:38 AM >To: Doran, Christine D.; spa...@li... >Subject: Re: [Spatialml-discussion] Corrected URl > >Questions about modifications 4 & 6 (included below) > >I'm not certain why one would introduce a non-consuming place tag >in cases like the example of 4. Wouldn't it be reasonable for the >tags to surround the entire phrase: > ><PLACE id="0"><SIGNAL>5 miles</SIGNAL> <SIGNAL>east</SIGNAL> ><PLACE id="3">Boston</PLACE></PLACE> > >After all, it is the phrase that describes the location of the >event. Is SpatialML otherwise restricted from having nested >PLACE tags? > >What would be the criteria for introducing a non-consuming place >tag? Seems like you would want it for cases where a location was >referred to but had NO lexemes that referred or described the >location: perhaps in cases of ellipsis. The example that started this whole discussion was: "We camped 10 miles north of Boston. The next day we drove 50 miles west." So, we drove from Boston to place X, then from place X to place Y. Janet |
From: Mardis, S. A. <ma...@mi...> - 2009-01-09 16:38:20
|
Questions about modifications 4 & 6 (included below) I'm not certain why one would introduce a non-consuming place tag in cases like the example of 4. Wouldn't it be reasonable for the tags to surround the entire phrase: <PLACE id="0"><SIGNAL>5 miles</SIGNAL> <SIGNAL>east</SIGNAL> <PLACE id="3">Boston</PLACE></PLACE> After all, it is the phrase that describes the location of the event. Is SpatialML otherwise restricted from having nested PLACE tags? What would be the criteria for introducing a non-consuming place tag? Seems like you would want it for cases where a location was referred to but had NO lexemes that referred or described the location: perhaps in cases of ellipsis. Regarding the "C/city of Boston", I have a naïve question about the linguistics. The discussion suggests that "city" (no capital) is a property of "Boston"; but I don't understand why it isn't the other way around. For example, how does "city of Boston" differ from "suburbs of Boston" and "backstreets of Boston". In these cases, we have no synonymy to confuse ourselves. It suggests to me that the two should be marked such: <PLACE>city of <PLACE>Boston</PLACE></PLACE> <PLACE>City of Boston</PLACE> -or- if the phrasal extent is a problem, "city" should at least be marked: <PLACE>city</PLACE> of <PLACE>Boston</PLACE> Of course, the variations between City and city, River and river might well not be meaningful at all and is just loose capitalization. Scott ---------------------------------------- 4. Event-headed relative locations: In "We camped 5 miles east of Boston", we want to be able to say that the camping took place at a location 5 miles east of Boston. We propose extending SpatialML to allow non-consuming PLACE tags. "5 miles east of Boston" <PLACE id="0"></PLACE> <SIGNAL id="1" type="DISTANCE">5 miles</SIGNAL> <SIGNAL id="2" type="DIRECTION">east</SIGNAL> <PLACE id="3" state="US-MA" country="US" form="NAM">Boston</PLACE> <RLINK id="4" source="3" destination="0" distance="1" direction="2" signals="1 2"/> ------------------------------------- 6. "City of Boston" Problem: The current SpatialML specification says that only "Boston" will be annotated as a PLACE and "city" will not be tagged as it is only a property of Boston. It has been suggested that both city and Boston should be tagged as places and they should be linked with an EQ tag. We agree that this is a bit of an arbitrary distinction, but the relation is more of a predicative one than EQ. -------------------------------------- >-----Original Message----- >From: Christy Doran [mailto:cd...@mi...] >Sent: Thursday, January 08, 2009 3:54 PM >To: spa...@li... >Subject: [Spatialml-discussion] Corrected URl > >Sorry, that tinyurl did not work. Here's the full path: > >http://sourceforge.net/mailarchive/forum.php?thread_name=198F6FD6-6FC4-4937-8964-7CF84671D7D7%40mitre.org&forum_name=spatialml-discussion > >_______________________________________________ >Spatialml-discussion mailing list >Spa...@li... >https://lists.sourceforge.net/lists/listinfo/spatialml-discussion |
From: Christy D. <cd...@mi...> - 2009-01-08 20:55:36
|
Sorry, that tinyurl did not work. Here's the full path: http://sourceforge.net/mailarchive/forum.php?thread_name=198F6FD6-6FC4-4937-8964-7CF84671D7D7%40mitre.org&forum_name=spatialml-discussion |
From: Christy D. <cd...@mi...> - 2009-01-08 20:42:01
|
Hello folks-- In mid-December, we posted to this list a set of proposed changes to SpatialML. It seems that for at least some of you, this message was filtered as spam (not even to speak of the email backlog from holidays), which is why I'm copying people directly in addition to posting to the list. If there are no objections from the user community, these changes will be incorporated into SpatialML 3.0. Please send your comments to us by January 31st, either to me directly or back to the list. The December message detailing the proposed changes can be found at http://tinyurl.com/6ghf8y. These revisions are based on feedback from Brandeis University, deriving from two SpatialML mini-workshops held at Brandeis. In addition to participants from Brandeis (Jessica Moszkowicz, Alex Plotnick, James Pustejovsky, and Marc Verhagen, among others), the first workshop included Graham Katz from Georgetown and Inderjeet Mani from MITRE. The second meeting was attended by the above Brandeis participants as well as Ben Wellner, and included Christy Doran, Janet Hitzeman, Justin Richer, and Rob Quimby from MITRE. Christy -- Dr. Christine Doran Lead AI Engineer The MITRE Corporation Phone: 781-271-2870 Fax: 781-271-2352 |
From: Christy D. <cd...@mi...> - 2008-12-15 15:08:53
|
This document reflects a series of discussions between the MITRE team and users of SpatialML. We are very eager to have your feedback, POSITIVE or NEGATIVE, as to how these changes would change the way SpatialML handles your data or task. In general, MITRE would also like to relate the SpatialML spec more clearly to the Callisto task, since users will likely be referring to the spec while annotating in Callisto. The MITRE Team
==================
Proposal for Modifications to SpatialML
1. PATH becomes RLINK: We propose renaming the PATH tag as RLINK, which stands for Relative Location Link. The following example shows this change:
"the town east of Boston"
<PLACE type="PPL" id="1" form="NOM" ctv="TOWN">town</PLACE> <SIGNAL id="2">east</SIGNAL> <PLACE id="3" type="PPLA" state="US-MA" country="US" form="NAM">Boston</PLACE> <RLINK id="4" source="3" destination="1" direction="E" signals="2"/>
2. Signals for Relative Locations and Topological Links: The current guidelines for SpatialML use the SIGNAL tag to capture distances (5 miles) and some directions (east). These signals are then included in the PATH tag. However, other spatial signals such as "in" in "restaurant in Boston" are not tagged as signals. We believe they should be, so that the source of a topological link can be backtracked.
"restaurant in Boston"
<PLACE id="1" type="FAC" form="NOM">restaurant</PLACE> <SIGNAL id="2">in</SIGNAL> <PLACE id="3">Boston</PLACE> <LINK source="1" target="3" signals="2" linkType="IN"/>
3. Distances tagged as a special kind of SIGNAL: Additionally, distances are not really signals of the same type. At the May 13 workshop at Brandeis University, it was proposed that distances be treated more like durations from TimeML rather than just an additional attribute for the PATH tag. Our proposed solution is to add types to SIGNALs, minimally "distance", "direction" and "extent". As such, distances are like a special case of PLACE.
The id for the distances could then be used in the new RLINK in the distance attribute.
"the island 5 miles east of Boston"
<PLACE id="0" type="RGN">island</PLACE> <SIGNAL id="1" type="DISTANCE">5 miles</SIGNAL> <SIGNAL id="2" type="DIRECTION">east</SIGNAL> <PLACE id="3" state="US-MA" country="US" form="NAM">Boston</PLACE> <RLINK id="4" source="3" destination="0" distance="1" direction="2" signals="1 2"/>
4. Event-headed relative locations: In "We camped 5 miles east of Boston", we want to be able to say that the camping took place at a location 5 miles east of Boston. We propose extending SpatialML to allow non-consuming PLACE tags.
"5 miles east of Boston"
<PLACE id="0"></PLACE> <SIGNAL id="1" type="DISTANCE">5 miles</SIGNAL> <SIGNAL id="2" type="DIRECTION">east</SIGNAL> <PLACE id="3" state="US-MA" country="US" form="NAM">Boston</PLACE> <RLINK id="4" source="3" destination="0" distance="1" direction="2" signals="1 2"/>
[Note: in the Callisto task, consuming and non-consuming PLACE tags would have different kinds of IDs, so that they can easily be differentiated by the annotator.]
5. EC renamed to "External Connection": EC in SpatialML stands for "extended connection" rather than the traditional "external connection" from RCC8. We propose restoring the traditional nomenclature. [This has already been done.]
6. "City of Boston" Problem: The current SpatialML specification says that only "Boston" will be annotated as a PLACE and "city" will not be tagged as it is only a property of Boston. It has been suggested that both city and Boston should be tagged as places and they should be linked with an EQ tag. We agree that this is a bit of an arbitrary distinction, but the relation is more of a predicative one than EQ. Currently, we would annotate these examples as follows (simplified mark-up).
We are open to suggestions as to how to modify them, whether with EQ or otherwise:
city of <PLACE>Boston</PLACE> (because only "Boston" is capitalized)
<PLACE>City of Boston</PLACE> (because the whole thing is a PN)
<PLACE>New York</PLACE> city
<PLACE>New York City</PLACE> (same reasoning)
river <PLACE>Thames</PLACE>
<PLACE>River Thames</PLACE> (same reasoning)
7. "Near" no longer a link type: "Near" did not fit well with the other LINK types, which map clearly to the RCC8 calculus, so we propose removing it from the set of LINK types and retaining it only as a DISTANCE value. We would treat other "vague distances" similarly (e.g. "far", "local").
Current: the [river] near [Boston]
<PLACE id="1" type="WATER" form="NOM">river</PLACE> <PLACE id="2" state="US-MA" country="US" form="NAM">Boston</PLACE> <LINK id="3" source="2" target="1" linkType="NEAR"/>
Proposed:
<PLACE id="1" type="WATER" form="NOM">river</PLACE> <SIGNAL id="2" type="DISTANCE">near</SIGNAL> <PLACE id="3" state="US-MA" country="US" form="NAM">Boston</PLACE> <RLINK source="1" destination="3" signals="2"/>
8. Roads, Rivers, etc.: There was some question as to whether the current SpatialML tags things like road or river as places. It does. This confusion may have arisen because facilities were not tagged in the SpatialML Corpus (LDC2008T03) and people may have been using that as a reference corpus. We are in the process of updating that corpus with facilities.
9. Encoding the SpatialML version: In order to make it easier to keep track of which version of SpatialML has been used for various documents, we have added a version tag. There is already a <SpatialML> root tag, although it hasn't been used routinely. We have added a "version" attribute to capture the SpatialML version. |
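Editorial sketch for the proposal above: the RLINK in item 3 points at its PLACE and SIGNAL tags by id, so a consumer has to resolve those ids back to the tagged text. The Python below is a minimal illustration of that lookup, not part of the SpatialML tooling. The function name `resolve_rlinks`, the `version="3.0"` value, and the wrapping of the example in a `<SpatialML>` root are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Item 3's example wrapped in a <SpatialML> root tag.
# The version value is an assumption for illustration only.
DOC = """<SpatialML version="3.0">the <PLACE id="0" type="RGN">island</PLACE>
<SIGNAL id="1" type="DISTANCE">5 miles</SIGNAL>
<SIGNAL id="2" type="DIRECTION">east</SIGNAL> of
<PLACE id="3" state="US-MA" country="US" form="NAM">Boston</PLACE>
<RLINK id="4" source="3" destination="0" distance="1" direction="2" signals="1 2"/>
</SpatialML>"""


def resolve_rlinks(xml_text):
    """Return one dict per RLINK with id references replaced by tagged text."""
    root = ET.fromstring(xml_text)
    # Map each id to the text its tag consumes ("" for a non-consuming tag).
    text_by_id = {el.get("id"): (el.text or "")
                  for el in root.iter() if el.get("id")}

    def lookup(value):
        # Fall back to the literal value: item 1 writes direction="E"
        # rather than the id of a SIGNAL.
        return text_by_id.get(value, value or "")

    return [
        {
            "source": lookup(link.get("source")),
            "destination": lookup(link.get("destination")),
            "distance": lookup(link.get("distance")),
            "direction": lookup(link.get("direction")),
        }
        for link in root.iter("RLINK")
    ]
```

Under these assumptions, a non-consuming PLACE (item 4) resolves to an empty string for its destination, which is one simple way to tell it apart from a consuming tag.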
From: Mani, I. <im...@mi...> - 2008-11-21 03:22:07
|
Dear All, An XML Schema for SpatialML has now been posted at: http://sourceforge.net/project/showfiles.php?group_id=186077 Sincerely, Inderjeet. |
From: Mani, I. <im...@mi...> - 2008-11-19 16:40:14
|
Dear All, John Bateman at the University of Bremen has represented some key examples in the SpatialML guidelines in terms of the Generalized Upper Model (GUM) ontology, specifically the GUM 3.0 spatial module. A concise document containing the representations along with the corresponding SpatialML is available at: http://sourceforge.net/projects/spatialml/GUM-Ontology-SpatialML-19-Nov-2008.pdf As the examples indicate, a mapping between GUM and SpatialML is potentially feasible, although there may of course be disambiguation required when going from the coarser SpatialML to GUM, e.g., IN to containment versus generalized possession, etc. More details on GUM can be found at: http://www.ontospace.uni-bremen.de/ontology/stable/owlDoc/index.html Comments are most welcome, as are suggestions for mappings to other spatial ontologies. Sincerely, Inderjeet. |
From: Hitzeman, J. M. <hi...@mi...> - 2008-09-04 11:58:02
|
I am not convinced that changing "NR" to a path is a good idea, mainly because PATHs seem directional, while LINKs seem positional, e.g., "the road from Belmont to San Mateo" or even "walking towards Belmont" both strike me as directional, while "the Hillsdale Mall in San Mateo" or "San Mateo, near San Francisco" strike me as lacking movement and therefore describe the position rather than the path. That's the set of definitions I've been taking as an annotator, and it has helped me to have those types of intuitions concerning paths v. links. Janet Janet Hitzeman, Ph.D. Senior Scientist The MITRE Corporation 781-271-8246 -----Original Message----- From: spa...@li... on behalf of Mani, Inderjeet Sent: Wed 9/3/2008 9:08 PM To: spa...@li... Subject: [Spatialml-discussion] Proposed changes and clarifications relatedto LINK tag Dear SpatialMLers, Several of us have been informally discussing changes to the LINK tag in SpatialML, ever since a discussion on the subject that originated at the LREC workshop in Marrakesh. Specifically, in the LINK tag, we would like to remove the 'linkType' attribute value NR, for "near". NR discretizes a distance metric, and as such doesn't belong with the other RCC8-inspired link types. It belongs instead in the PATH relation, where it can be hospitably accommodated by extending the 'distance' attribute value. So, "Belmont, near San Mateo" would have a PATH tag with a source and a target and distance="NR". We would also like to change the expansion of EC to be "external connection", rather than "extended connection", since the former is the correct RCC8 terminology. Likewise, DC should expand to "disconnected" rather than "discrete connection". Regarding the provenance of the links, here is a clarification statement, that should perhaps be added to the guidelines. DC, EC, EQ, and PO are from RCC8. IN is not RCC8, but collapses two RCC8 relations, TPP and NTPP (tangential proper part and non-tangential proper part, respectively). 
The reason for the collapsing is that it is often difficult for annotators to decide whether the part's region touches or doesn't touch the container's. Finally, we don't include the remaining RCC8 inverse links TPPi and NTPPi from RCC8, since these can be represented in annotation by swapping arguments, and are in addition likely to confuse annotators. These changes leave SpatialML with 5 link types: DC, EC, EQ, PO, and IN. The result is not of course RCC8 (which has DC, EC, EQ, PO, TPP, NTPP, TPPi and NTPPi), but nor is it RCC5 (which has DR, EQ, PO, PP and PPi). While IN is in fact the PP "proper part" relation in RCC5, DC is different from DR ("discrete from") -- the latter just means the two regions don't overlap. Also, SpatialML has RCC8's EC as well, and doesn't of course have RCC5's PPi. So, the 5 'linkType' attribute values in SpatialML are a proper subset of the relations in RCC5 and RCC8. I am not sure where these RCC "impoverishments" leave us in terms of qualitative reasoning capabilities. Any comments or suggestions are of course welcome. Cheers, Inderjeet. |
From: Mani, I. <im...@mi...> - 2008-09-04 01:08:18
|
Dear SpatialMLers, Several of us have been informally discussing changes to the LINK tag in SpatialML, ever since a discussion on the subject that originated at the LREC workshop in Marrakesh. Specifically, in the LINK tag, we would like to remove the 'linkType' attribute value NR, for "near". NR discretizes a distance metric, and as such doesn't belong with the other RCC8-inspired link types. It belongs instead in the PATH relation, where it can be hospitably accommodated by extending the 'distance' attribute value. So, "Belmont, near San Mateo" would have a PATH tag with a source and a target and distance="NR". We would also like to change the expansion of EC to be "external connection", rather than "extended connection", since the former is the correct RCC8 terminology. Likewise, DC should expand to "disconnected" rather than "discrete connection". Regarding the provenance of the links, here is a clarification statement, that should perhaps be added to the guidelines. DC, EC, EQ, and PO are from RCC8. IN is not RCC8, but collapses two RCC8 relations, TPP and NTPP (tangential proper part and non-tangential proper part, respectively). The reason for the collapsing is that it is often difficult for annotators to decide whether the part's region touches or doesn't touch the container's. Finally, we don't include the remaining RCC8 inverse links TPPi and NTPPi from RCC8, since these can be represented in annotation by swapping arguments, and are in addition likely to confuse annotators. These changes leave SpatialML with 5 link types: DC, EC, EQ, PO, and IN. The result is not of course RCC8 (which has DC, EC, EQ, PO, TPP, NTPP, TPPi and NTPPi), but nor is it RCC5 (which has DR, EQ, PO, PP and PPi). While IN is in fact the PP "proper part" relation in RCC5, DC is different from DR ("discrete from") -- the latter just means the two regions don't overlap. Also, SpatialML has RCC8's EC as well, and doesn't of course have RCC5's PPi. 
So, the 5 'linkType' attribute values in SpatialML are a proper subset of the relations in RCC5 and RCC8. I am not sure where these RCC "impoverishments" leave us in terms of qualitative reasoning capabilities. Any comments or suggestions are of course welcome. Cheers, Inderjeet. |
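Editorial sketch for the message above: the relationship it describes between the five SpatialML link types and RCC8 can be written down as a small lookup table. The Python below is only an illustration of that description; the names `LINK_TO_RCC8` and `rcc8_candidates` are hypothetical and do not come from the SpatialML guidelines or DTD.

```python
# RCC8 relations as listed in the message above.
RCC8 = {"DC", "EC", "EQ", "PO", "TPP", "NTPP", "TPPi", "NTPPi"}

# Each SpatialML linkType mapped to the RCC8 relation(s) it may stand for.
LINK_TO_RCC8 = {
    "DC": {"DC"},
    "EC": {"EC"},
    "EQ": {"EQ"},
    "PO": {"PO"},
    "IN": {"TPP", "NTPP"},  # IN collapses the two proper-part relations
}


def rcc8_candidates(link_type, swapped=False):
    """Return the RCC8 relations compatible with a SpatialML link.

    swapped=True reads the link with its two arguments exchanged, which
    is how the message says the inverse relations TPPi and NTPPi are
    recovered. DC, EC, EQ and PO are symmetric, so swapping leaves
    them unchanged.
    """
    relations = LINK_TO_RCC8[link_type]
    if swapped and link_type == "IN":
        return {relation + "i" for relation in relations}
    return set(relations)
```

Read this way, an IN link never picks out a single RCC8 relation; it leaves a two-way ambiguity between TPP and NTPP that a reasoner would have to carry along.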
From: Mani, I. <im...@mi...> - 2008-06-21 00:08:04
|
Dear All, Just to let you know that the proceedings of the LREC'08 Spatial workshop are now available at the University of Bremen's web site (thanks, Thora!): http://www.sfbtr8.spatial-cognition.de/SpatialLREC/ At the link, you'll also find slides for the invited talk and a bibliography generated from the papers, as well as information about some other relevant ongoing activities. Thank you for having helped in various ways to make the workshop a success! Best wishes, Inderjeet. |
From: Mani, I. <im...@mi...> - 2008-04-06 13:43:31
|
Dear Colleagues, [Please ignore if you're already signed up for the GeoCLEF Query Parsing task!] I'm writing to invite participation in the Query Parsing track of the GeoCLEF'2008 workshop (http://ir.shef.ac.uk/geoclef/), a track on evaluation of multilingual Geographic Information Retrieval (GIR) systems, part of CLEF, the Cross-Lingual Evaluation Forum (http://www.clef-campaign.org/). So far, 8 groups (I represent one of them) have signed up for the Query Parsing task, while at least 10 are needed for the task to take place. The 2008 task is very interesting for those interested in interpreting spatial language. It involves extracting particular entities, locations, geo-coordinates, and geographic relationships from a large set of English search engine queries. Further details of the task, workshop dates, and organizers' emails are provided at the GeoCLEF URL above. Please forward to any colleagues who might be interested. Thanks, Inderjeet Mani. |
From: Mani, I. <im...@mi...> - 2008-02-15 20:03:44
|
Hello List, The SpatialML guidelines version 2.2 have now been posted. A new Callisto SpatialML task plugin is also available for annotating SpatialML using Callisto (available at callisto.mitre.org) with these new guidelines. For download: http://sourceforge.net/project/showfiles.php?group_id=186077 Best wishes, Inderjeet. |
From: Mani, I. <im...@mi...> - 2008-01-23 01:16:18
|
Hello List, The SpatialML annotated corpus is now available from the Linguistic Data Consortium. Please see: http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2008T03 Best wishes, Inderjeet. |
From: Justin R. <jr...@mi...> - 2007-11-19 16:29:30
|
Hi Everyone, We are pleased to announce the availability of an annotation tool that can import, export, and manipulate the SpatialML xml-based markup language. This tool is in the form of a SpatialML plugin for Callisto, a general text annotation application by MITRE and freely available from: http://callisto.mitre.org/ This tool is written in Java and should run on any Java SE platform version 1.4 or higher. The SpatialML plugin is included in release 1.5.0 and will be included in future revisions as well. Enjoy! -- Justin Richer The MITRE Corporation |
From: Justin R. <jr...@mi...> - 2007-09-11 14:44:38
|
> > I propose that we incorporate suggestions of several easily-parsable, mutually-disambiguatable syntactical formats for this field. While a user of the language COULD use a coordinate format different from this, they SHOULD use one of these given formats to ensure interoperability. So this becomes a strong suggestion of use, with solid guidelines and examples.
> >
> > 42°N 71°W
> > 42.358 -71.060
> > 42.358° -71.060°
> > 42.358°N 71.060°W
> > 42.358N 71.060W
> > 42:35.8N 71:6.13W
> > 42:35:8N 71:6:7W
> > 42:35:8.453 -71:6:4.343
>
> Looks good to me. This scheme doesn't match any standards I've seen, but it seems like your purpose here is just to define something that's unambiguous and parsable. I've never seen colons used to separate D:M:S, but I like it, it's entirely analogous to time notation.
I've seen it used at least once before, for exactly the reason that it's analogous to time and it makes no ambiguity about decimal numbers vs. unit demarcations.
> Degree symbols (°) are superfluous since coordinates are already defined in terms of degrees.
I agree that it's superfluous, but some gazetteering systems may insert that symbol into their output strings (we have one here that does). By allowing it as an optional character in a well-defined location we give a bit more flexibility without sacrificing the simple parsable nature of the string value. -- Justin |
From: <gj...@al...> - 2007-09-10 18:03:50
|
Justin Richer wrote:
> I propose that we incorporate suggestions of several easily-parsable, mutually-disambiguatable syntactical formats for this field. While a user of the language COULD use a coordinate format different from this, they SHOULD use one of these given formats to ensure interoperability. So this becomes a strong suggestion of use, with solid guidelines and examples.
>
> 42°N 71°W
> 42.358 -71.060
> 42.358° -71.060°
> 42.358°N 71.060°W
> 42.358N 71.060W
> 42:35.8N 71:6.13W
> 42:35:8N 71:6:7W
> 42:35:8.453 -71:6:4.343
Looks good to me. This scheme doesn't match any standards I've seen, but it seems like your purpose here is just to define something that's unambiguous and parsable. I've never seen colons used to separate D:M:S, but I like it, it's entirely analogous to time notation. Degree symbols (°) are superfluous since coordinates are already defined in terms of degrees. -Greg |
From: Viechnicki, P. D. <Pet...@ng...> - 2007-09-05 16:23:29
|
Justin, Thanks for raising this issue. I would just add a recommendation that you try to make sure your proposals are interoperable with geospatial data exchange formats which already exist out there. Perhaps the best one to start with would be the one defined by the United Nations Group of Experts on Geographic Names (UNGEGN), which has a draft toponymic data exchange format, which you can find at the end of this paper: http://unstats.un.org/unsd/geoinfo/9th-UNCSGN-Docs/E-CONF-98-CRP-60.pdf Best wishes, -Peter Viechnicki
-----Original Message----- From: spa...@li... [mailto:spa...@li...] On Behalf Of Justin Richer Sent: Thursday, August 30, 2007 2:32 PM To: spa...@li... Subject: [Spatialml-discussion] Format and syntax of geo-coordinates
The SpatialML DTD defines the syntax of the overall XML language, but it doesn't yet define the syntax of a few of the fields that would benefit from being machine-readable. The first of these is the "latLong" field of the PLACE tag. The guidelines currently state: When gazref is available, the coordinate from the gazetteer may be copied here. This effectively makes this field capable of holding any valid XML string. Or more specifically: Any string, including strings with or without decimals that can be parsed into GML coordinates along with appropriate coordinate systems, including military coordinate systems. On the other hand, we have these GML guidelines:
<gml:Point gml:name="Macy's" gml:id="3" srsName="urn:ogc:def:crs:EPSG:6.6:4326"> <gml:coordinates>40.45 - 73.59</gml:coordinates> </gml:Point>
This GML tag for Macy's says that the reference coordinate system is CRS 4326 (which happens to be the geodetic model WGS-84). It presents the coordinates in the format latitude followed by longitude (in this case in decimal degrees), with southern latitudes and western longitudes being expressed by negative signs. A richer tag might provide height and internal structure for Macy's as well.
Which adds in the complexity of specifying a geodetic model and doesn't clear up the format of the actual coordinates much (at least not in this portion of the guidelines). I propose that we incorporate suggestions of several easily-parsable, mutually-disambiguatable syntactical formats for this field. While a user of the language COULD use a coordinate format different from this, they SHOULD use one of these given formats to ensure interoperability. So this becomes a strong suggestion of use, with solid guidelines and examples. The rules for lat/lon values:
1) Latitude value given before longitude value, separated by any amount of whitespace. Both values are always given together. [should there be a wildcard for a single "unknown" value?]
2) North/South latitude and East/West longitude designated by one of the following:
a) Negative sign (-) prepended to southern and western values, optional positive sign (+) prepended to northern and eastern values, with no intervening whitespace.
b) Cardinal direction letter abbreviation (N, S, E, or W) appended to each numeric value, with no intervening whitespace.
3) Whole integers are considered to be degrees.
4) Degrees may be in decimal form if there are neither minutes nor seconds designated, with the whole and fractional parts of the degrees separated by a single decimal (.) period.
5) Degrees and minutes separated by a single colon (:) with no intervening whitespace. Minutes may be in decimal form if there are no seconds designated, with whole and fractional parts separated by a single decimal (.) period.
6) Minute and second separated by a single colon (:) with no intervening whitespace. Seconds may be in decimal form, with whole and fractional parts separated by a single decimal (.) period.
7) A degree character (°) may optionally follow all degree-based designations such as whole degrees and decimal degrees, potentially coming in between the numeric value and any cardinal direction, as in 71°W.
This gives us the following list of valid, parsable values for the latLong field:
42°N 71°W
42.358 -71.060
42.358° -71.060°
42.358°N 71.060°W
42.358N 71.060W
42:35.8N 71:6.13W
42:35:8N 71:6:7W
42:35:8.453 -71:6:4.343
... among a few other possible combinations of the different parts. In effect, this gives us Degree:Minute:DecimalSecond, Degree:DecimalMinute, and DecimalDegree notational schemes with some bit of flexibility (but unambiguity) in handling the directional requirements. We could conceivably also add in guidelines for other coordinate systems, such as MGRS and UTM. As long as the syntax could easily be distinguished from what's allowable by the above rules, the various systems could coexist in the same text field. Comments are welcome! -- Justin |
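Editorial sketch for the proposal quoted above: rules 1 to 7 are tight enough to be checked with one regular expression per value. The Python below is one possible reading of those rules, not an agreed SpatialML parser. The function names are hypothetical, and two choices are assumptions where the rules are silent: a value carrying both a sign and a cardinal letter is rejected, and no range check (such as latitude at most 90) is applied.

```python
import re

# One coordinate value under rules 2 to 7 of the proposal.
_VALUE = re.compile(
    r"""(?P<sign>[+-])?                     # rule 2a: optional leading sign
        (?P<deg>\d+(?:\.\d+)?)              # rules 3 and 4: degrees
        (?: °                               # rule 7: degree sign, degrees-only forms
          | :(?P<min>\d+(?:\.\d+)?)         # rule 5: minutes
            (?: :(?P<sec>\d+(?:\.\d+)?) )?  # rule 6: seconds
        )?
        (?P<card>[NSEW])?                   # rule 2b: trailing cardinal letter
    """,
    re.VERBOSE,
)


def _parse_value(token, positive, negative):
    """Convert one latitude or longitude token to signed decimal degrees."""
    match = _VALUE.fullmatch(token)
    if match is None:
        raise ValueError(f"unrecognized coordinate value: {token!r}")
    sign, deg, minutes, sec, card = match.group("sign", "deg", "min", "sec", "card")
    if sign and card:
        # Assumption: rule 2 offers sign OR letter, so both together is an error.
        raise ValueError(f"both sign and cardinal letter given: {token!r}")
    if card and card not in (positive, negative):
        raise ValueError(f"cardinal letter {card!r} not valid here: {token!r}")
    if minutes is not None and "." in deg:
        raise ValueError(f"decimal degrees cannot be followed by minutes: {token!r}")
    if sec is not None and "." in minutes:
        raise ValueError(f"decimal minutes cannot be followed by seconds: {token!r}")
    value = float(deg) + float(minutes or 0) / 60 + float(sec or 0) / 3600
    if sign == "-" or card == negative:
        value = -value
    return value


def parse_lat_long(text):
    """Parse a latLong string into (latitude, longitude) in decimal degrees."""
    parts = text.split()  # rule 1: latitude first, any amount of whitespace
    if len(parts) != 2:
        raise ValueError(f"expected two values, got {len(parts)}: {text!r}")
    return _parse_value(parts[0], "N", "S"), _parse_value(parts[1], "E", "W")
```

For instance, `parse_lat_long("42.358N 71.060W")` and `parse_lat_long("42.358 -71.060")` yield the same pair, which is the interoperability the proposal is after.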
From: Justin R. <jr...@mi...> - 2007-08-30 18:33:36
|
The SpatialML DTD defines the syntax of the overall XML language, but it doesn't yet define the syntax of a few of the fields that would benefit from being machine-readable. The first of these is the "latLong" field of the PLACE tag. The guidelines currently state:

    When gazref is available, the coordinate from the gazetteer may be copied here.

This effectively makes this field capable of holding any valid XML string. Or more specifically:

    Any string, including strings with or without decimals that can be parsed into GML coordinates along with appropriate coordinate systems, including military coordinate systems.

On the other hand, we have these GML guidelines:

    <gml:Point gml:name="Macy's" gml:id="3"
               srsName="urn:ogc:def:crs:EPSG:6.6:4326">
      <gml:coordinates>40.45 - 73.59</gml:coordinates>
    </gml:Point>

    This GML tag for Macy's says that the reference coordinate system is CRS 4326 (which happens to be the geodetic model WGS-84). It presents the coordinates in the format latitude followed by longitude (in this case in decimal degrees), with southern latitudes and western longitudes being expressed by negative signs. A richer tag might provide height and internal structure for Macy's as well.

This adds the complexity of specifying a geodetic model and doesn't do much to clear up the format of the actual coordinates (at least not in this portion of the guidelines).

I propose that we incorporate suggestions of several easily parsable, mutually unambiguous syntactical formats for this field. While a user of the language COULD use a coordinate format different from these, they SHOULD use one of the given formats to ensure interoperability. So this becomes a strong suggestion of use, with solid guidelines and examples.

The rules for lat/lon values:

1) Latitude value given before longitude value, separated by any amount of whitespace. Both values are always given together. [should there be a wildcard for a single "unknown" value?]
2) North/South latitude and East/West longitude designated by one of the following:
   a) Negative sign (-) prepended to southern and western values, optional positive sign (+) prepended to northern and eastern values, with no intervening whitespace.
   b) Cardinal direction letter abbreviation (N, S, E, or W) appended to each numeric value, with no intervening whitespace.
3) Whole integers are considered to be degrees.
4) Degrees may be in decimal form if there are neither minutes nor seconds designated, with the whole and fractional parts of the degrees separated by a single period (.).
5) Degrees and minutes separated by a single colon (:) with no intervening whitespace. Minutes may be in decimal form if there are no seconds designated, with whole and fractional parts separated by a single period (.).
6) Minutes and seconds separated by a single colon (:) with no intervening whitespace. Seconds may be in decimal form, with whole and fractional parts separated by a single period (.).
7) A degree character (°) may optionally follow all degree-based designations such as whole degrees and decimal degrees, potentially coming in between the numeric value and any cardinal direction, as in 71°W.

This gives us the following list of valid, parsable values for the latLong field:

    42°N 71°W
    42.358 -71.060
    42.358° -71.060°
    42.358°N 71.060°W
    42.358N 71.060W
    42:35.8N 71:6.13W
    42:35:8N 71:6:7W
    42:35:8.453 -71:6:4.343

... among a few other possible combinations of the different parts. In effect, this gives us Degree:Minute:DecimalSecond, Degree:DecimalMinute, and DecimalDegree notational schemes with some flexibility (but no ambiguity) in handling the directional requirements.

We could conceivably also add guidelines for other coordinate systems, such as MGRS and UTM. As long as the syntax could easily be distinguished from what's allowable by the above rules, the various systems could coexist in the same text field.
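
To make the proposal concrete, the rules above could be parsed along the following lines. This is only a minimal Python sketch, not part of SpatialML or any released tool. The names parse_lat_long and _parse_value are hypothetical. Several behaviours are my own assumptions rather than anything the rules state: a value may carry a sign or a cardinal letter but not both, the degree character is rejected on colon-separated forms, minutes and seconds must be below 60, and latitude/longitude are limited to 90/180 degrees. The result is a (latitude, longitude) pair in signed decimal degrees.

```python
import re

# One coordinate value: optional sign, whole degrees, then either a decimal
# fraction of degrees or ":minutes" (which may be decimal, or be followed by
# ":seconds"), then an optional degree character and cardinal letter.
_VALUE = re.compile(r"""
    (?P<sign>[+-])?
    (?P<deg>\d+)
    (?: (?P<degfrac>\.\d+)
      | :(?P<min>\d+)
        (?: (?P<minfrac>\.\d+) | :(?P<sec>\d+(?:\.\d+)?) )?
    )?
    (?P<degsym>°)?
    (?P<card>[NSEW])?
    """, re.VERBOSE)


def _parse_value(text, is_latitude):
    """Parse one latitude or longitude value into signed decimal degrees."""
    m = _VALUE.fullmatch(text)
    if m is None:
        raise ValueError(f"unparsable coordinate value: {text!r}")
    # Assumption: rule 2 offers sign OR cardinal letter, so reject both.
    if m["sign"] and m["card"]:
        raise ValueError(f"both sign and cardinal letter given: {text!r}")
    # Assumption: rule 7 allows the degree character only on degree-only forms.
    if m["degsym"] and m["min"] is not None:
        raise ValueError(f"degree character on a colon form: {text!r}")
    if m["card"] and m["card"] not in ("NS" if is_latitude else "EW"):
        raise ValueError(f"wrong cardinal letter for this axis: {text!r}")

    minutes = float((m["min"] or "0") + (m["minfrac"] or ""))
    seconds = float(m["sec"] or 0)
    if minutes >= 60 or seconds >= 60:
        raise ValueError(f"minutes/seconds out of range: {text!r}")

    degrees = float(m["deg"] + (m["degfrac"] or ""))
    degrees += minutes / 60 + seconds / 3600
    if degrees > (90 if is_latitude else 180):
        raise ValueError(f"degrees out of range: {text!r}")

    negative = m["sign"] == "-" or m["card"] in ("S", "W")
    return -degrees if negative else degrees


def parse_lat_long(text):
    """Parse a latLong string (rule 1: latitude, whitespace, longitude)."""
    parts = text.split()
    if len(parts) != 2:
        raise ValueError(f"expected two values, got {len(parts)}: {text!r}")
    return _parse_value(parts[0], True), _parse_value(parts[1], False)
```

With this sketch, parse_lat_long("42.358N 71.060W") and parse_lat_long("42.358 -71.060") yield the same pair, which is the kind of interoperability the rules are after. Because every accepted form starts with a digit or sign and uses only digits, periods, colons, the degree character and N/S/E/W, a string in another system such as MGRS would generally fail the match and could be handed to a different parser.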
Comments are welcome! -- Justin |
From: Mani, I. <im...@mi...> - 2007-08-22 14:15:37
|
Hello list,

Hope everyone is having a good summer/winter.

Just to let you know that MITRE has annotated more than 400 documents with SpatialML tags, intended for use by the community. The documents (covering English news, blogs, newsgroups) were originally obtained from the Linguistic Data Consortium (LDC) at the University of Pennsylvania, for use in the 2005 Automatic Content Extraction (ACE) technology evaluation. We are now arranging to release the annotated versions through the LDC.

Also, if people have documents in other genres (especially documents that involve giving of directions) that they would like annotated, we might be able to help out, depending on the quantity, interest, language, and licensing requirements.

Stay tuned,
- Inderjeet. |