From: Chris M. <cj...@be...> - 2009-08-17 18:53:39
On Aug 11, 2009, at 10:14 PM, David Osumi-Sutherland wrote:

> Hi all,
>
> I've been thinking about sending out this proposal for some time. I
> think it fits well with the recent discussion of documentation (see
> Subject: Re: [Obo-discuss] ontology term comments and provenance).
> I'm sending it to both OBO format and OBO discuss. Note - I'm
> interested in the views of OWL users as well as OBO format users.
>
> ----
>
> = Proposal for standard syntax for marking up term names in textual
> definitions and comments =
>
> It is desirable that term names within textual definitions and
> comments should be used consistently. (I thought this was a foundry
> principle or at least a proposed one, but I can't seem to find it).
> However, as term names may change, it is easy for references to
> other terms within the text of a definition to become inconsistent
> with the standard names. Over time, if there are a number of
> changes, multiple inconsistencies between definitions referring to
> the same type can emerge, as well as differences with the official
> name. This issue extends to comment fields as well.
>
> The problem could be solved if we had a standard markup for ontology
> term mentions in text that included an ID/term name pair every time
> a particular term was referred to. With this in place, it should be
> easy to automatically update names, using the ID as lookup, via
> scripts or systems built into the major ontology editing software.
> Such a system could also be used to generate hyperlinks allowing
> clicking from definitions to the terms referred to (both actual
> hyperlinks in web display, and some equivalent in editing tools).
>
> Such a markup could also be useful in notes written as part of
> public discussion of term definitions, for example on a wiki. It
> should be easy to develop term-picking systems to allow users to
> easily generate this markup. The markup could also serve as an
> indexing system for external comments.
>
> Another possible use is in the auto-generation of textual
> definitions from relationships.

I don't think it's a good idea to generate textual definitions from
plain necessary and sufficient relationships, although the text
definition could certainly be auto-generated from the logical
definition in many cases. There is still the descriptive text that
accompanies the formal textual definition - this is very useful, but
it has to be manually crafted. This means it is prone to go stale, by
referencing terms that may become obsolete or by using terms that
become non-exact synonyms. I see a big motivation for your proposal
here.

> So, how should the markup work? I'm probably not the right person
> to specify this, but it seems to me there are two major options:
>
> 1. a simple system involving special characters to delimit term/ID
> pairs + a standard syntax for the term ID pair itself. e.g.-
> @termname;ID:1234567@.
> - Seems like a rather hacky option, although does have the advantage
> of being simple, easy to do by hand, and unobtrusive enough to leave
> the text readable without further processing.

I think it's important to have the default rendering look like
something that isn't Perl code.

"Catalysis of the reaction: glycolaldehyde + NAD+ + H2O = glycolate +
NADH + H+."

==>

"Catalysis of the reaction: @glycolaldehyde;CHEBI:17071@ +
@NAD+;CHEBI:15846@ + @H2O;CHEBI:15377@ = @glycolate;CHEBI:29805@ +
@NADH;CHEBI:16908@ + @H+;CHEBI:15378@."

Another option is to adopt an existing citation style. E.g.

"Catalysis of the reaction: glycolaldehyde[CHEBI:17071] +
NAD+[CHEBI:15846] + H2O[CHEBI:15377] = glycolate[CHEBI:29805] +
NADH[CHEBI:16908] + H+[CHEBI:15378]."

The default rendering is still a bit busy, but I did choose a dense
example. It's reasonably close to both journal styles and wiki styles.
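To make the "update names using the ID as lookup" idea concrete for option 1, here is a minimal Python sketch. It assumes the `;` separator from the proposal's `@termname;ID:1234567@` pattern, assumes IDs look like `PREFIX:accession`, and assumes labels never contain `@` or `;`. The function names and the rename of CHEBI:15377 to "water" are hypothetical, chosen only for illustration.

```python
import re

# Assumed syntax from the proposal: @label;PREFIX:ACCESSION@
# The ID pattern below is an assumption, not part of any agreed standard.
MENTION = re.compile(r"@([^@;]+);([A-Za-z_]+:[A-Za-z0-9]+)@")


def refresh_labels(text, current_labels):
    """Rewrite each mention so its label matches the current label for its ID.

    IDs missing from current_labels keep whatever label they already have.
    """
    def fix(match):
        old_label, term_id = match.group(1), match.group(2)
        new_label = current_labels.get(term_id, old_label)
        return f"@{new_label};{term_id}@"

    return MENTION.sub(fix, text)


def render_plain(text):
    """Strip the markup and keep only the labels, for human-readable display."""
    return MENTION.sub(lambda m: m.group(1), text)


definition = (
    "Catalysis of the reaction: @glycolaldehyde;CHEBI:17071@ + "
    "@NAD+;CHEBI:15846@ + @H2O;CHEBI:15377@ = @glycolate;CHEBI:29805@ + "
    "@NADH;CHEBI:16908@ + @H+;CHEBI:15378@."
)

# Hypothetical rename, for illustration only.
labels = {"CHEBI:15377": "water"}
updated = refresh_labels(definition, labels)
print(render_plain(updated))
```

A tool that does not understand the markup would still show the raw `@...@` text, which is the readability concern raised above; `render_plain` is the kind of step a display layer would need.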
The downside is that the computer has to do a bit more work; e.g.:

"Any process that activates or increases the frequency, rate or extent
of cardiac muscle cell proliferation[GO:0060038]."

Simple string matching can be used to figure out that the start of the
cited term is 'cardiac...', and that this is the portion that should
be auto-replaced should the primary label for GO:0060038 change. You
can imagine this going wrong in certain circumstances, so you would
not completely automate this. But even so I think it's better to err
on the side of the human here.

It would also be possible to produce ambiguous syntax:

"Catalysis of the reaction:
(R)-3-[(R)-3-hydroxybutanoyloxy]butanoate[CHEBI:10979]"

The square brackets mean different things in the two contexts: inside
the chemical name they are part of the label, while the trailing pair
encloses the term ID.

> 2. An embedded XML tag. This would be less hacky - it could
> potentially extend existing standards for XML representation of
> ontologies and would be easy to mine using standard tools. It has
> the disadvantage of being verbose and a pain to do by hand. I'm
> worried it also may screw with OWL-XML standards, but don't know
> enough about these to say.

You would have to encode the XML. This would make the definition
string very ugly when displayed by a tool that is not aware that the
string contains encoded XML in a particular format.

> - - I'm sure others on these lists are better placed than I am to
> make good suggestions regarding the ideal format for this markup.
> Whatever is chosen should work with (or at least not break) both OWL
> and OBO formats and their major editors.
>
> One final suggestion: it might be useful to extend this to allow
> standard markup of references with text definitions and comments.
>
> Cheers,
>
> David
>
>
> David Osumi-Sutherland, PhD
> Ontologist / Curator
> Virtual Fly Brain / FlyBase
> Department of Genetics
> University of Cambridge
> Downing Street
> Cambridge, CB2 3EH
> UK
> +44 (0)1223 333 963
>
> _______________________________________________
> Obo-format mailing list
> Obo...@li...
> https://lists.sourceforge.net/lists/listinfo/obo-format
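As a rough illustration of the "simple string matching" step for the bracket citation style discussed above, here is a Python sketch. It assumes a citation is the label immediately followed by `[PREFIX:digits]`, and that the tool knows the previous label for each ID. The function name and the renamed label are hypothetical. Anything it cannot match is returned for human review rather than guessed, in keeping with the point about not fully automating this.

```python
import re

# Assumed citation form: label immediately followed by [PREFIX:DIGITS].
# Brackets inside a chemical name do not match because they hold no such ID.
CITATION = re.compile(r"\[([A-Za-z_]+:\d+)\]")


def refresh_cited_labels(text, old_labels, new_labels):
    """Replace the label in front of each [ID] when it matches the old label.

    Returns (new_text, unresolved_ids); unresolved IDs need a human to look.
    """
    pieces, unresolved, pos = [], [], 0
    for match in CITATION.finditer(text):
        term_id = match.group(1)
        before = text[pos:match.start()]
        old = old_labels.get(term_id)
        if old is not None and before.endswith(old):
            # The text just before the bracket is the old label: swap it.
            before = before[:len(before) - len(old)] + new_labels.get(term_id, old)
        else:
            unresolved.append(term_id)
        pieces.append(before + match.group(0))
        pos = match.end()
    pieces.append(text[pos:])
    return "".join(pieces), unresolved


go_def = ("Any process that activates or increases the frequency, rate or "
          "extent of cardiac muscle cell proliferation[GO:0060038].")
old = {"GO:0060038": "cardiac muscle cell proliferation"}
new = {"GO:0060038": "cardiomyocyte proliferation"}  # hypothetical rename
go_updated, go_unresolved = refresh_cited_labels(go_def, old, new)
print(go_updated)

# The ambiguous case: without a known old label, nothing is rewritten.
chebi_def = ("Catalysis of the reaction: "
             "(R)-3-[(R)-3-hydroxybutanoyloxy]butanoate[CHEBI:10979]")
chebi_updated, chebi_unresolved = refresh_cited_labels(chebi_def, {}, {})
print(chebi_unresolved)
```

Matching against the known old label, rather than guessing where the label starts, is what keeps the nested brackets in the CHEBI:10979 example from being misread.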