From: Phillip L. <phi...@ne...> - 2009-08-18 15:18:30
Chris Mungall <cj...@be...> writes:
>> It is desirable that term names within textual definitions and
>> comments should be used consistently. (I thought this was on foundry
>> principle or at least a proposed one, but I can't seem to find it).
>> However, as term names may change, it is easy for references to
>> other terms within the text of a definition to become inconsistent
>> with the standard names. Over time, if there are a number of
>> changes, multiple inconsistencies between definitions referring to
>> the same type can emerge, as well as differences with the official
>> name. This issue extends to comment fields as well.
>>
>> The problem could be solved if we had a standard markup for ontology
>> term mentions in text that included an ID/term name pair every time
>> a particular term was referred to.

I think that this is a great idea. It's something that I thought to add
to the documentation system that I was thinking of a day or two ago; in
LaTeX it's easy -- you define a macro automatically from the OWL (or
OBO). If the term name is changed, the macro breaks and the document
won't compile.

>> Another possible use is in the auto-generation of textual
>> definitions from relationships.
>
> I don't think it's a good idea to generate textual definitions from
> plain necessary and sufficient relationships. But the text definition
> could certainly be auto-generated from the logical definition in many
> cases.

I'd agree with these concerns.

> Another option is to adopt an existing
> citation style. E.g.
>
> "Catalysis of the reaction: glycolaldehyde[CHEBI:17071] +
> NAD+[CHEBI:15846] + H2O[CHEBI:15377] = glycolate[CHEBI:29805] +
> NADH[CHEBI:16908] + H+[CHEBI:15378]."
>
> The default rendering is still a bit busy but I did choose a dense
> example. It's reasonably close to both journal styles and wiki styles.
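
On the "busy" rendering in the quoted example: a tool that knows about the
citation style could hide the IDs for display and keep them separately. A
minimal sketch in Python, assuming every citation has the shape
[PREFIX:digits]; the pattern and the function name strip_citations are
illustrative assumptions, not part of any OBO or OWL tool.

```python
import re

# Assumed citation shape: [PREFIX:digits], e.g. [CHEBI:17071].
CITATION = re.compile(r"\[([A-Za-z_]+:\d+)\]")

def strip_citations(text):
    """Return (display_text, ids): the text without [ID] markers, and the IDs in order."""
    ids = CITATION.findall(text)
    return CITATION.sub("", text), ids

definition = (
    "Catalysis of the reaction: glycolaldehyde[CHEBI:17071] + "
    "NAD+[CHEBI:15846] + H2O[CHEBI:15377] = glycolate[CHEBI:29805] + "
    "NADH[CHEBI:16908] + H+[CHEBI:15378]."
)

display, ids = strip_citations(definition)
print(display)  # definition text with the ID markers removed
print(ids)      # the cited IDs, in order of appearance
```

A tool that is not aware of the markup would still show the full string,
IDs included, which is the trade-off being discussed.
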
>
> The downside is that the computer has to do a bit more work; e.g:
>
> "Any process that activates or increases the frequency, rate or extent
> of cardiac muscle cell proliferation[GO:0060038]."
>
> Simple string matching can be used to figure out that the start of the
> cited term is 'cardiac...', and that this is the portion that should
> be auto-replaced should the primary label for GO:0060038 change. You
> can imagine this going wrong in certain circumstances, so you would
> not completely automate this.

I wouldn't do an auto-replace myself. Checking would be better -- so if
[GO:0060038] occurs and is NOT preceded by the term name, it would be an
error which needs fixing. This might cause some stilted English.

>
> But even so I think it's better to err on the side of the human here.
>
> It would also be possible to produce ambiguous syntax:
>
> "Catalysis of the reaction:
> (R)-3-[(R)-3-hydroxybutanoyloxy]butanoate[CHEBI:10979]"
>
> The square brackets mean different things in both contexts.

You could match on [*:*].

> You would have to encode the xml. This would make the definition
> string very ugly when displayed by a tool that is not aware of the
> fact the string contains encoded xml in a particular format.
>
>> - - I'm sure others on these lists are better placed than I am to
>> make good suggestions regarding the ideal format for this markup.
>> Whatever is chosen should work with (or at least not break) both OWL
>> and OBO formats and their major editors.
>>
>> One final suggestion: it might be useful to extend this to allow
>> standard markup of references with text definitions and comments.

The other option would be to move the documentation aside from the logic
definition; you'd then end up with two files. In the end, I suspect,
this might be the right way to go, but it's going to break a lot of
tools.

Phil
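
The check suggested above (match on [*:*], then flag any ID that is not
preceded by the current term name) might look roughly like this in Python.
This is a sketch under assumptions: the labels dictionary stands in for
whatever would be read from the OWL or OBO file, the function name
check_mentions is made up for illustration, and the CHEBI:10979 label is
simply taken from the quoted example rather than verified.

```python
import re

# Assumed reading of "[*:*]": [PREFIX:digits]. Brackets inside chemical
# names, such as [(R)-3-hydroxybutanoyloxy], do not fit this shape.
CITATION = re.compile(r"\[([A-Za-z_]+:\d+)\]")

def check_mentions(text, labels):
    """Return (term_id, problem) pairs for IDs not preceded by the current label."""
    errors = []
    for match in CITATION.finditer(text):
        term_id = match.group(1)
        label = labels.get(term_id)
        if label is None:
            errors.append((term_id, "unknown ID"))
        elif not text[:match.start()].endswith(label):
            errors.append((term_id, "expected %r before the ID" % label))
    return errors

# Hypothetical ID -> primary label table; in practice this would be
# loaded from the ontology.
labels = {
    "GO:0060038": "cardiac muscle cell proliferation",
    "CHEBI:10979": "(R)-3-[(R)-3-hydroxybutanoyloxy]butanoate",
}

good = ("Any process that activates or increases the frequency, rate or "
        "extent of cardiac muscle cell proliferation[GO:0060038].")
stale = ("Any process that activates or increases the frequency, rate or "
         "extent of heart muscle cell proliferation[GO:0060038].")
brackets = ("Catalysis of the reaction: "
            "(R)-3-[(R)-3-hydroxybutanoyloxy]butanoate[CHEBI:10979]")

print(check_mentions(good, labels))      # no problems expected
print(check_mentions(stale, labels))     # one problem expected
print(check_mentions(brackets, labels))  # inner brackets are not treated as an ID
```

The comparison is an exact, case-sensitive suffix match, so a definition
that starts a sentence with a capitalised term name would be flagged. That
is the stilted-English cost mentioned above. Nothing is rewritten
automatically; the function only reports.
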