From: William P. <wil...@ya...> - 2010-02-13 16:53:04
In terms of simplicity, it does make sense that the lat-long attach to an OTU -- that's certainly the clearest way to express the simplest possible use case: one specimen per OTU. However, realistically, it is very common for alignments to contain rows derived from several genes and several specimens. So strictly speaking, the lat-long is an attribute of a specimen; a set of characters or genes is observed from that specimen; and the tracing of evolutionary history is derived from the agglomerations of sets of specimen characters/genes.

So I vote to preserve the ability to attach attributes of specimens (e.g. darwin core metadata, genbank accession numbers, etc.) to certain characters, but not others -- and hence not oblige this to be attached to an OTU.

Would it be crazy to attach lat/longs to both locations? I.e., if a row has two pairs of lat/longs, then the two partitions of that row each have a lat/long pair; but at the same time the OTU is annotated with both pairs.

Lastly, TreeBASE's system of annotating row segments allows multiple annotations for the same character elements, e.g. you can attach, say, 20 lat/long pairs and specimen IDs to the same row segment. This is particularly valuable for morphological data, in which the character scorings are applied to a set of identifiable specimens that were examined. But this does not mean that all characters were scored from the same specimens (if for no other reason than that some characters can only be scored from males, others only from females). I would hate to see this valuable chain of provenance lost through the effect of "homogenizing" at the point of serialization.

bp

On Feb 13, 2010, at 11:01 AM, Rutger Vos wrote:

> OK, maybe it is better to attach it to the OTU, I hadn't really
> considered that because in the db the coordinates are attached to row
> segments, but I suppose this makes more sense.
>
> On Fri, Feb 12, 2010 at 11:38 PM, Hilmar Lapp <hl...@ne...> wrote:
>> (cross-posting to cdao re: alignment partitions)
>>
>> On Feb 11, 2010, at 11:49 PM, Rutger Vos wrote:
>>
>>> I have added lat/lon as semantic annotations on individual cells using
>>> DwC:DecimalLatitude and DwC:DecimalLongitude. [...]
>>>
>>> Conceivably, verbosity could be reduced by attaching the annotations
>>> on <seq> elements instead, but i) although multiple <seq> elements are
>>> allowed "in principle", I don't think any of our tools actually
>>> support this; ii) cdao has no concept of row segments, so the
>>> annotations would be lost if we transformed to nexml to rdf.
>>
>> If the alignment isn't concatenated with sequences from multiple specimens,
>> couldn't (in fact, shouldn't) you attach the lat/long to the OTU, though?
>> I.e., where do you attach the specimen currently, and it do you attach
>> lat/long in a different fashion than the specimen?
>>
>> CDAOers: what are status and plans for describing the parts of an alignment
>> right now, and is there support, current or planned, for partitions /
>> segments of an alignment?
>>
>> -hilmar
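
To make the "attach to both locations" idea from the reply at the top of this message a little more concrete, here is a rough Python sketch. It is only an illustration: the class names (Specimen, RowSegment, Row), the column ranges, the specimen IDs, and the coordinates are all made up for the example, and none of it reflects the actual TreeBASE schema or the NeXML/CDAO serialization. Each row segment keeps its own list of specimens (so several lat/long pairs per segment are possible), and the OTU-level annotation is derived as the union of those pairs rather than stored in place of them.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Specimen:
    """Hypothetical specimen record; lat/long belong here, not on the OTU."""
    specimen_id: str
    lat: float
    lon: float


@dataclass
class RowSegment:
    """A partition of a matrix row (assumed inclusive column range)."""
    start: int
    end: int
    # Several specimens may back the same segment, e.g. morphological
    # characters scored from a set of examined specimens.
    specimens: List[Specimen] = field(default_factory=list)


@dataclass
class Row:
    """One matrix row for an OTU, made of one or more segments."""
    otu: str
    segments: List[RowSegment] = field(default_factory=list)

    def otu_coordinates(self) -> List[Tuple[float, float]]:
        """Derive the OTU-level annotation: every distinct lat/long pair
        found on any segment, in first-seen order. The per-segment
        provenance is left untouched."""
        pairs: List[Tuple[float, float]] = []
        for segment in self.segments:
            for specimen in segment.specimens:
                pair = (specimen.lat, specimen.lon)
                if pair not in pairs:
                    pairs.append(pair)
        return pairs


# Illustrative data only: a row concatenated from two genes, each
# sequenced from a different specimen.
row = Row(
    otu="Taxon A",
    segments=[
        RowSegment(0, 499, [Specimen("spec-1", 35.9, -79.0)]),
        RowSegment(500, 999, [Specimen("spec-2", 41.3, -72.9)]),
    ],
)

for lat, lon in row.otu_coordinates():
    print(row.otu, lat, lon)
```

With this arrangement the OTU can still be mapped from both coordinate pairs, while a consumer that cares about provenance can see which segment came from which specimen.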