From: Rutger V. <rut...@gm...> - 2010-02-12 04:49:13
|
Hi all,

as per issue 2948080, I have added lat/lon as semantic annotations on individual cells using DwC:DecimalLatitude and DwC:DecimalLongitude. There are presently no records at all in the database that have geospatially annotated row segments, so it's hard to test this. Ryan, do you have any examples, or something specific in mind?

Conceivably, verbosity could be reduced by attaching the annotations to <seq> elements instead, but i) although multiple <seq> elements are allowed "in principle", I don't think any of our tools actually support this; ii) CDAO has no concept of row segments, so the annotations would be lost if we transformed NeXML to RDF.

Rutger |
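To make the cell-level annotation concrete, here is a minimal sketch of what attaching the two DwC properties to a single cell could look like. The attribute names (`property`, `content`, `datatype`) are modeled loosely on NeXML-style `<meta>` annotations and are assumptions for illustration; a real NeXML document would also need namespace declarations and type attributes that are omitted here. The cell ids and coordinates are made up.

```python
import xml.etree.ElementTree as ET


def annotate_with_coordinates(element, latitude, longitude):
    """Append one <meta> child per DwC coordinate property to `element`."""
    for prop, value in (("dwc:DecimalLatitude", latitude),
                        ("dwc:DecimalLongitude", longitude)):
        ET.SubElement(element, "meta", {
            "property": prop,               # term named in the message above
            "content": repr(float(value)),  # literal value as text
            "datatype": "xsd:double",       # assumed datatype label
        })
    return element


# Hypothetical cell; the ids and coordinates are illustrative only.
cell = ET.Element("cell", {"char": "c1", "state": "s1"})
annotate_with_coordinates(cell, 51.44, -0.94)
xml_text = ET.tostring(cell, encoding="unicode")
print(xml_text)
```

Repeating two `<meta>` children on every cell of a row is where the verbosity mentioned above comes from; the same helper could be pointed at a row or OTU element instead.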
From: Hilmar L. <hl...@ne...> - 2010-02-12 14:39:11
|
(cross-posting to cdao re: alignment partitions)

On Feb 11, 2010, at 11:49 PM, Rutger Vos wrote:

> I have added lat/lon as semantic annotations on individual cells
> using DwC:DecimalLatitude and DwC:DecimalLongitude. [...]
>
> Conceivably, verbosity could be reduced by attaching the annotations
> on <seq> elements instead, but i) although multiple <seq> elements are
> allowed "in principle", I don't think any of our tools actually
> support this; ii) cdao has no concept of row segments, so the
> annotations would be lost if we transformed to nexml to rdf.

If the alignment isn't concatenated with sequences from multiple specimens, couldn't (in fact, shouldn't) you attach the lat/long to the OTU, though? I.e., where do you attach the specimen currently, and do you attach lat/long in a different fashion than the specimen?

CDAOers: what are the status and plans for describing the parts of an alignment right now, and is there support, current or planned, for partitions / segments of an alignment?

-hilmar

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- informatics.nescent.org :
=========================================================== |
From: Francisco P. <fra...@uc...> - 2010-02-12 20:20:45
|
Hi Hilmar,

-----Original message-----
From: Hilmar Lapp [mailto:hl...@ne...]

> If the alignment isn't concatenated with sequences from multiple
> specimens, couldn't (in fact, shouldn't) you attach the lat/long to
> the OTU, though? I.e., where do you attach the specimen currently, and
> do you attach lat/long in a different fashion than the specimen?

If I understood the problem correctly...
CDAO provides an OTU annotation concept, and users can instantiate it with any desired information about the specimen, such as lat/long. One can also develop other ontologies and incorporate them into CDAO to provide more sophisticated concepts for specific applications and annotations.

> CDAOers: what are status and plans for describing the parts of an
> alignment right now, and is there support, current or planned, for
> partitions / segments of an alignment?

The current CDAO version has already incorporated Julie's Multiple Alignment Ontology (MAO; Thompson et al., 2005). MAO has concepts such as "domain" and "subalignment"; I suppose they can be used in this context.

Ciao,
Francisco

--
Prof. Francisco Prosdocimi, PhD
-----------------------------
Programa de Pós-Graduação em Ciências Genômicas e Biotecnologia
Universidade Católica de Brasília - UCB
SGAN 916 Módulo B, Bloco C Sala 213
70.790-160 - Brasília / DF - Brasil
Phone: +55 61 34487173
http://biotec.icb.ufmg.br/chicopros |
From: Arlin S. <sto...@um...> - 2010-02-16 15:57:41
|
If you folks could give the CDAO group a specific use case or user scenario involving a segmented alignment, with data, we can try to represent it using current CDAO concepts, and if that's not possible, we'll explore possible revisions.

I'm imagining that the use case is going to be something like this. There is a character-data-and-trees thing based on sequence data, in which the "alignment" is a hybrid: for a given OTU, the row that belongs to the OTU combines multiple data sources. A specific example would be that for a species called OTU1, the first part of the sequence comes from specimen (isolate, strain) A of that species, and the second part comes from specimen B, which was sequenced in a different lab. For OTU2, it might be the case that the whole sequence comes from one specimen.

To complete this use case, we would like to know what kinds of queries or operations need to be supported.

Arlin

-------
Arlin Stoltzfus (sto...@um...)
Fellow, CARB; Adj. Assoc. Prof., UMBI; Research Biologist, NIST
CARB, 9600 Gudelsky Drive, Rockville, MD
tel: 240 314 6208; web: www.molevol.org |
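The hybrid-row scenario above can be written down as a small data sketch, which may help when deciding which queries need support. Everything here is hypothetical: the `RowSegment` type, the column ranges, and "specimen C" are invented for illustration and are not CDAO or TreeBASE structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RowSegment:
    otu: str        # OTU whose matrix row this segment belongs to
    start: int      # first alignment column covered (1-based, inclusive)
    end: int        # last alignment column covered (inclusive)
    specimen: str   # specimen the segment was sequenced from


# OTU1 is stitched together from two specimens; OTU2 comes from one.
segments = [
    RowSegment("OTU1", 1, 500, "specimen A"),
    RowSegment("OTU1", 501, 1200, "specimen B"),
    RowSegment("OTU2", 1, 1200, "specimen C"),
]


def specimens_at(segments, otu, column):
    """Query 1: which specimen(s) supplied a given column of an OTU's row?"""
    return [s.specimen for s in segments
            if s.otu == otu and s.start <= column <= s.end]


def is_hybrid(segments, otu):
    """Query 2: does the OTU's row combine more than one specimen?"""
    return len({s.specimen for s in segments if s.otu == otu}) > 1


print(specimens_at(segments, "OTU1", 501))
print(is_hybrid(segments, "OTU1"), is_hybrid(segments, "OTU2"))
```

Two candidate queries fall out of this directly: provenance of a column, and detection of hybrid rows. Whether those are the operations users actually need is the open question in the message.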
From: Rutger V. <rut...@gm...> - 2010-02-13 16:02:00
|
OK, maybe it is better to attach it to the OTU. I hadn't really considered that because in the db the coordinates are attached to row segments, but I suppose this makes more sense.

--
Dr. Rutger A. Vos
School of Biological Sciences
Philip Lyle Building, Level 4
University of Reading
Reading RG6 6BX
United Kingdom
Tel: +44 (0) 118 378 7535
http://www.nexml.org
http://rutgervos.blogspot.com |
From: William P. <wil...@ya...> - 2010-02-13 16:53:04
|
In terms of simplicity, it does make sense that the lat-long attach to an OTU -- that's certainly the clearest way to express the simplest possible use case: one specimen per OTU. However, realistically, it is very common for alignments to contain rows derived from several genes and several specimens. So strictly speaking, the lat-long is an attribute of a specimen; a set of characters or genes is observed from that specimen; and the tracing of evolutionary history is derived from the agglomerations of sets of specimen characters/genes. So I vote to preserve the ability to attach attributes of specimens (e.g. Darwin Core metadata, GenBank accession numbers, etc.) to certain characters, but not others -- and hence not oblige this to be attached to an OTU.

Would it be crazy to attach lat/longs to both locations? I.e., if a row has two pairs of lat/longs, then the two partitions of that row each have a lat/long pair; but at the same time the OTU is annotated with both pairs.

Lastly, TreeBASE's system of annotating row segments allows multiple annotations for the same character elements, e.g. you can attach, say, 20 lat/long pairs and specimen IDs to the same row segment. This is particularly valuable for morphological data, in which the character scorings are applied to a set of identifiable specimens that were examined. But this does not mean that all characters were scored from the same specimens (if for no other reason than that some characters can only be scored from males, others only from females). I would hate to see this valuable chain of provenance lost through the effect of "homogenizing" at the point of serialization.

bp |
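One way to read the "both locations" idea is to keep the specimen as the only place where coordinates are stored, let each row segment list any number of specimens, and derive the OTU-level annotation rather than storing it twice. The sketch below does that under invented names (`Specimen`, `Segment`, `Otu`) and made-up coordinates; it is not TreeBASE's schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Specimen:
    specimen_id: str
    latitude: float
    longitude: float


@dataclass
class Segment:
    columns: range                                  # characters covered
    specimens: list = field(default_factory=list)   # any number of specimens


@dataclass
class Otu:
    label: str
    segments: list = field(default_factory=list)

    def coordinates(self):
        """Distinct (lat, long) pairs over all segments, in first-seen order."""
        seen = []
        for segment in self.segments:
            for specimen in segment.specimens:
                pair = (specimen.latitude, specimen.longitude)
                if pair not in seen:
                    seen.append(pair)
        return seen


# Characters 0-9 were scored from a male, 10-19 from a female and the male.
male = Specimen("M-001", 35.99, -78.90)
female = Specimen("F-002", 36.10, -79.00)
otu = Otu("OTU1", [Segment(range(0, 10), [male]),
                   Segment(range(10, 20), [female, male])])
print(otu.coordinates())
```

Because the OTU view is computed from the segments, serializing it does not discard the segment-level provenance; the finer-grained links remain the source of truth.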
From: Matt <dia...@gm...> - 2010-02-14 20:00:29
|
Will there be other nodes/places where lat/long can be added? Lat/long on cells seems to be problematic in some ways, though it's preferable to attaching it to row segments. For one, it fails for practically all morphological matrices in which OTUs have many specimens. For sequences there are also scenarios where a cell may be derived from many specimens (consensus sequences). I suppose though that you could add many lat/longs to one cell (through specimens?)?

Matt |
From: Hilmar L. <hl...@ne...> - 2010-02-15 10:19:06
|
On Feb 14, 2010, at 8:00 PM, Matt wrote: > I suppose though that you could add many lat/longs to one cell > (through specimens)? Yes, attaching to OTUs should not preclude the ability to attach those to cells, and in principle one should be able to attach any number of geo-coordinates. It's an interesting question though whether the geo-coordinate as a property of the specimen should (automatically?) propagate to the cell (or OTU) that the specimen is attached to. In principle, I suppose the answer should be yes, and in a perfect Linked Data world a reasoning framework or phylo-data browser should be able to look up the specimen metadata if a specimen is attached, and infer the geo-coordinates for the cell or OTU. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- informatics.nescent.org : =========================================================== |
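The propagation question can be illustrated with a toy rule over subject/predicate/object triples: if something links to a specimen, and the specimen carries coordinates, the coordinates can be inferred for that thing. The `ex:hasSpecimen` predicate and all identifiers below are hypothetical, and a real Linked Data setup would use an RDF store and a reasoner rather than list scans.

```python
LAT = "dwc:DecimalLatitude"
LON = "dwc:DecimalLongitude"

# Coordinates are stated only on specimens; cells and OTUs just link to them.
triples = [
    ("specimenA", LAT, 35.99),
    ("specimenA", LON, -78.90),
    ("specimenB", LAT, 51.44),
    ("specimenB", LON, -0.94),
    ("cell42", "ex:hasSpecimen", "specimenA"),
    ("otu1", "ex:hasSpecimen", "specimenA"),
    ("otu1", "ex:hasSpecimen", "specimenB"),
]


def objects(triples, subject, predicate):
    """All objects of triples matching the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def inferred_coordinates(triples, subject):
    """Follow specimen links from `subject` and collect (lat, lon) pairs."""
    pairs = []
    for specimen in objects(triples, subject, "ex:hasSpecimen"):
        lats = objects(triples, specimen, LAT)
        lons = objects(triples, specimen, LON)
        pairs.extend(zip(lats, lons))
    return pairs


print(inferred_coordinates(triples, "cell42"))
print(inferred_coordinates(triples, "otu1"))
```

With this arrangement nothing has to be copied onto cells or OTUs at serialization time; the lookup happens when a consumer asks for it.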
From: Rutger V. <rut...@gm...> - 2010-02-15 11:44:08
|
I would like to hear what use cases Ryan (the original poster of the ticket) had in mind and what needs to happen to meet his requirements. Is there a Dryad<->TreeBASE interoperability use case you're trying to implement? |
From: Arlin S. <sto...@um...> - 2010-02-15 16:57:55
|
On Feb 12, 2010, at 9:38 AM, Hilmar Lapp wrote:

> CDAOers: what are status and plans for describing the parts of an
> alignment right now, and is there support, current or planned, for
> partitions / segments of an alignment?

Right now there is not even a fully worked out concept of sequence, only states of characters. So, in the matrix:

    OTU1 TCAAG
    OTU2 TAAAG

there is no "sequence" concept telling us that "T" and "C" in OTU1 are sequentially ordered residues. They are treated just like classical character states in that sense.

One way to deal with this is via MAO, the multiple alignment ontology. We developed a mapping between MAO and CDAO last summer that might be useful for this. MAO has a concept of sub-alignments that might be useful here.

Also, in CDAO there is a "coordinate system" concept that we intended to use to impose a mapping on characters in a sequence. The concept has not been fleshed out yet.

If there is a coordinate system CS1, with "T" and "C" assigned coordinates "1" and "2", and another coordinate system CS2 with "A", "A", and "G" assigned coordinates "1", "2" and "3", then this would be a way to represent that the data from OTU1 come from 2 different sequences.

Arlin |
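The CS1/CS2 example can be worked through in a few lines. This is only a sketch of the idea as described in the message, since the CDAO concept itself is not fleshed out; the dictionary layout (matrix column index mapped to a coordinate within the source sequence) is an assumption.

```python
# Row of OTU1 from the matrix above, one character state per column.
row = list("TCAAG")

# Each coordinate system maps a 0-based matrix column to a 1-based
# coordinate within the source sequence it describes.
coordinate_systems = {
    "CS1": {0: 1, 1: 2},          # "T", "C" -> coordinates 1, 2
    "CS2": {2: 1, 3: 2, 4: 3},    # "A", "A", "G" -> coordinates 1, 2, 3
}


def source_sequences(row, coordinate_systems):
    """Rebuild each source sequence by ordering its columns by coordinate."""
    result = {}
    for name, mapping in coordinate_systems.items():
        ordered_columns = sorted(mapping, key=mapping.get)
        result[name] = "".join(row[col] for col in ordered_columns)
    return result


print(source_sequences(row, coordinate_systems))
# {'CS1': 'TC', 'CS2': 'AAG'}
```

The coordinates supply exactly the ordering information that plain character states lack, and the grouping by coordinate system records that the row comes from two sequences.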
From: Matt <dia...@gm...> - 2010-02-15 19:03:21
|
A coordinate system would be nice. It, or something similar, would be necessary for structural alignments, which predict pairing of individual sites separated by some number of nucleotides (e.g. positions 2 and 34 pair). Many visualization programs also use coordinates on individual nucleotides. It might be better, though, to abstract coordinates given context and only calculate the "absolute" coordinates on demand. A paired structure, as above, could be tied together via ids(?), then translated/transformed to particular coordinates based on the context. In real life MSAs will be truncated, extended, extracted from, concatenated, etc., so coordinate management could become a big overhead.

Matt |
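The id-based alternative can be sketched as follows: pairings are stored between stable site ids, and absolute positions are computed only when asked for, from whatever the alignment currently looks like. The site ids (`s1`, `s2`, ...) and function names are invented for illustration.

```python
def absolute_positions(site_ids):
    """Map each site id to its 1-based position in the current alignment."""
    return {site_id: pos for pos, site_id in enumerate(site_ids, start=1)}


def paired_positions(site_ids, pairs):
    """Translate id-based pairs to positions; skip pairs with a missing site."""
    pos = absolute_positions(site_ids)
    return [(pos[a], pos[b]) for a, b in pairs if a in pos and b in pos]


# A 40-column alignment in which sites 2 and 34 pair.
alignment = [f"s{i}" for i in range(1, 41)]
pairs = [("s2", "s34")]
print(paired_positions(alignment, pairs))    # [(2, 34)]

# Truncating the first column shifts positions, but the stored pair is untouched.
truncated = alignment[1:]
print(paired_positions(truncated, pairs))    # [(1, 33)]
```

Nothing has to be rewritten when the alignment is truncated or concatenated; only the derived positions change, which addresses the bookkeeping overhead raised above.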