From: Steven B. <sb...@un...> - 2002-08-24 15:28:23
|
Matthias Thomae <th...@ei...> wrote: > So apart from the problem above, wouldn't the topological sorting of the > anchors be a nice feature for the AG library? If some application needs this we should add it to the library. Note that there are some issues that would need to be addressed: Disconnected AGs: In general, AGs aren't required to be connected, and so there is no guarantee you can traverse an AG from some unique earliest node (and the earliest node may not be unique). What allows us to sort the disjoint sections of an AG relative to each other are the offsets (when they exist), but these offsets would be ignored in a purely topological sort. Ambiguity: In many AGs, such as the ones derived from trees using the chart construction, there is a total ordering on the anchors thanks to the total ordering of the word string on which the tree is based. In general though, there's no guaranteed unique ordering among anchors (e.g. consider the graph {(1,2), (1,3), (2,4), (3,4)} where the relative ordering of 2 and 3 can't be read off the graph). Would the mapping from partial to total order be arbitrary, or would a canonical order be defined somehow? Types: In general, we'd like our tools to be able to add new layers of annotation to existing annotations; tools shouldn't assume that they know about all of the annotations in the AG. The simplest way to handle this is to use an explicit type argument on calls to all of the getX functions. To support this style of application programming, we'd need a version of the topological sort on anchors that only considered anchors incident to an arc of type t. At some point in the future we may decide that particular kinds of AG need special treatment (e.g. connected AGs, totally anchored AGs, linear AGs). There's two approaches available: enriching the class structure or creating new high-level APIs. We could subclass our AG class and add extra methods (e.g. a getAnchorSeq method which did a topological sort on ConnectedAGs and a temporal sort on TotalAGs.) Some existing functions can be written more efficiently when we can make safe assumptions about the topology of the AG. The other approach is to build a new API on top of the AG API which exposes a subset of the AG functions and adds some new ones which can be implemented using the existing API (cf our treebank API). Anyway, to sum up, before adding new sort functions we should first analyze the stated needs carefully... -Steven |