[Obo-discuss] Proposal for standard syntax for marking up term names in textual definitions and com


Hi all,

I've been thinking about sending out this proposal for some time.  I  
think it fits well with the recent discussion of documentation (see
the thread "Re: [Obo-discuss] ontology term comments and provenance").
I'm sending it to both OBO format and OBO discuss.  Note - I'm  
interested in the views of OWL users as well as OBO format users.

----

= Proposal for standard syntax for marking up term names in textual  
definitions and comments =

It is desirable that term names within textual definitions and  
comments should be used consistently. (I thought this was a Foundry
principle, or at least a proposed one, but I can't seem to find it.)
However, as term names may change, it is easy for references to other  
terms within the text of a definition to become inconsistent with the  
standard names.  Over time, if there are a number of changes, multiple  
inconsistencies between definitions referring to the same type can  
emerge, as well as differences with the official name. This issue  
extends to comment fields as well.

The problem could be solved if we had a standard markup for ontology  
term mentions in text that included an ID/term name pair every time a  
particular term was referred to. With this in place, it should be easy  
to automatically update names, using the ID as lookup, via scripts or  
systems built into the major ontology editing software.   Such a  
system could also be used to generate hyperlinks allowing clicking  
from definitions to the terms referred to (both actual hyperlinks in
web display, and some equivalent in editing tools).

Such a markup could also be useful in notes written as part of
public discussion of term definitions, for example on a wiki.  It
should be easy to develop term-picking systems that let users
generate this markup easily.  The markup could also serve as an
indexing system for external comments.

Another possible use is in the auto-generation of textual definitions  
from relationships.

So, how should the markup work?  I'm probably not the right person to  
specify this, but it seems to me there are two major options:

1. A simple system involving special characters to delimit term/ID
pairs, plus a standard syntax for the term/ID pair itself, e.g.
@termname;ID:1234567@.
- This seems like a rather hacky option, although it does have the
advantage of being simple, easy to do by hand, and unobtrusive enough
to leave the text readable without further processing.

2. An embedded XML tag. This would be less hacky - it could  
potentially extend existing standards for XML representation of  
ontologies and would be easy to mine using standard tools. It has the  
disadvantage of being verbose and a pain to do by hand.  I'm worried  
it also may screw with OWL-XML standards, but don't know enough about  
these to say.
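
To make option 1 a bit more concrete, here is a minimal Python sketch of
how a script might refresh stale names and generate hyperlinks from that
markup.  It is only an illustration under stated assumptions: the regular
expression, the function names (refresh_names, to_html), the example
definition text and the example.org base URL are all hypothetical, and
it assumes term names never contain '@' or ';'.

```python
import re

# Option 1 markup: @term name;ID:1234567@
# Assumption: names contain neither '@' nor ';', and IDs look like PREFIX:digits.
MENTION = re.compile(r"@([^@;]+);([A-Za-z_]+:\d+)@")


def refresh_names(text, current_names):
    """Replace the name in each mention with the current name for its ID."""
    def fix(match):
        old_name, term_id = match.group(1), match.group(2)
        # Unknown IDs are left untouched rather than guessed at.
        new_name = current_names.get(term_id, old_name)
        return "@{};{}@".format(new_name, term_id)
    return MENTION.sub(fix, text)


def to_html(text, base_url):
    """Render each mention as a hyperlink to its term page (no HTML escaping)."""
    def link(match):
        name, term_id = match.group(1), match.group(2)
        return '<a href="{}{}">{}</a>'.format(base_url, term_id, name)
    return MENTION.sub(link, text)


# Hypothetical definition text whose embedded name has gone out of date.
definition = "A @sensory neurone;ID:1234567@ that innervates the antenna."
names = {"ID:1234567": "sensory neuron"}  # current official names, keyed by ID

updated = refresh_names(definition, names)
print(updated)
print(to_html(updated, "http://example.org/term/"))
```

The ID is the only thing the script trusts; the name is treated as a
cached copy that can be regenerated at any time.  A real tool would also
need to escape names for HTML and decide what to do about obsolete IDs.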

I'm sure others on these lists are better placed than I am to make
good suggestions regarding the ideal format for this markup. Whatever  
is chosen should work with (or at least not break) both OWL and OBO  
formats and their major editors.

One final suggestion: it might be useful to extend this to allow
standard markup of references within textual definitions and comments.

Cheers,

David

David Osumi-Sutherland, PhD
Ontologist / Curator
Virtual Fly Brain / FlyBase
Department of Genetics
University of Cambridge
Downing Street
Cambridge, CB2 3EH
UK
+44 (0)1223 333 963
