From: Alexander D. <ad...@in...> - 2009-08-24 18:06:02
Karen,

For better or worse, these display issues have been a feature of other
ontologies for a long time. Try looking at the Sequence Ontology, for
instance.

Furthermore, as we build cross-products within the GO, between the GO and
other ontologies, and among other ontologies for different uses, we simply
have to have a consistent display of relationships. After all, when relating
items in orthologous ontologies, how do we decide which term is "broader" or
"narrower"? When I first encountered this display issue some time ago, I was
confused and taken aback, but upon continued exposure, I have gotten used to
it and appreciate its logic.

OBO-Edit is not just a tool for the GO, but indeed is in use by many people
for many ontologies. As ontologies grow more complicated, ontology
visualization grows more complicated -- we've been spared this in the GO for
a long time by limiting ourselves to is_a and part_of, and the three
regulates relationships, which themselves do not even represent child-parent
or narrower-broader relationships in the way that is_a or part_of
relationships do.

It may well be true that introducing has_part has caused the loss of
information from versions of the GO that do not contain this relationship.
However, the implementation of has_part was a decision of the GO Consortium
that solves more problems than it creates. Ideally, MODs that cannot support
internal software development for tools to handle this relationship should
switch to AmiGO in the long term for display of the ontology and term
annotations, and continued development of AmiGO to display and handle these
relationships should be a priority.

Thanks,

Alex

Karen Christie wrote:
> Hi Chris,
>
> Sorry so long to get back to you on this. I was sick half of last week
> and there was a lot to go through in your email. Comments inline.
>
> -Karen
>
> On Fri, 14 Aug 2009, Chris Mungall wrote:
>
>> [answering on ontology-editors list as this relates mostly to GO at the
>> moment.
>> Context for the discussion here:
>> https://sourceforge.net/mailarchive/forum.php?forum_name=geneontology-oboedit-working-group&max_rows=25&style=ultimate&viewmonth=200908 ]
>>
>> On Aug 14, 2009, at 2:37 PM, Karen Christie wrote:
>>
>>> To me "deeply unintuitive" is an enormous problem, regardless of whether
>>> you call it a bug.
>>>
>> I agree unintuitive is a problem, which is why we are restricting this to
>> the editors file and the gene_ontology_ext for now. The reason the display
>> is unintuitive is because the relation is unintuitive, at least to people
>> used to the GO. There is in fact no completely satisfactory solution to the
>> display problem.
>>
>>> Personally, I do not want to see the input-output relationship determine
>>> the hierarchy of display of the terms.
>>>
>> I'm afraid that's how every single tool that displays ontologies works, so
>> you are out of luck in a big way. What you are advocating is not a single
>> change in oboedit but an across-the-board modification of everyone's
>> ontology display tools.
>>
>
> I thought we started GO because biologists needed something to
> annotate, and that we started making our own tool because existing
> ontology tools did not meet our needs, so I don't see why the fact
> that other ontology tools display something in a particular way should
> automatically override the needs of biologists to see something
> biologically sensible.
>
>>> It does NOT make sense when viewing the terms linked by has_part and will
>>> be an even bigger problem when we need to display this relationship to
>>> users.
>>>
>> As far as denormalized-tree type displays such as the oboedit ontology tree
>> editor are concerned, I agree, and we need to constructively come up with
>> solutions.
>>
>>> I also don't see why you say that parent-child doesn't apply to has_part.
>>>
>> I am saying the terminology is confusing and we should abandon it. See below
>> for reasons.
>>
>>> Our documentation states that the has_part relationship is the inverse of
>>> the part_of relationship.
>>>
>> This is not quite correct, and should be corrected. They are only inverses
>> on the instance level. I'll explain this further below.
>>
>>> For the biological examples so far, both the spliceosome ones in component
>>> that are already in and also a has_part relationship that I plan to add in
>>> process (see go/scratch/RNAsurveillance.obo), it makes sense to still
>>> define a parent-child relationship, just in the opposite direction of
>>> part_of. For part_of, one can say that the smaller/more granular/child
>>> thing is part_of the larger/less granular/parent. For has_part, the
>>> converse, one can say that the larger/less granular/parent has the
>>> smaller/more granular/child thing as a part.
>>>
>> But be aware that in the first case the subject of the relationship is the
>> child and the target/object is the parent. In the second case the subject is
>> the parent and the target/object is the child. This reversal is bound to
>> confuse people used to equating the two. This is why I advocate abandoning
>> the terminology. That and the fact that there are no intuitive mappings of
>> parent/child to other relations.
>>
>
> I am completely aware that has_part reverses the direction of the
> relationship as compared to is_a and part_of. However, it seems to me
> that the parent/child terminology and the subject/target terminology
> describe different things. Parent/child seems to convey information
> about the granularity, i.e. the child is smaller, or more specific
> than the parent. Subject/target conveys information about the
> direction of the relationship.
>
> By only using the subject/target information to draw the relationship,
> the terms related by has_part relationships all appear inverted, in
> both trees and in graphs, in a way that appears biologically wrong to
> people used to reading outline form or hierarchy graphs.
> It seems that if we could use both parent/child and subject/target
> information, we should show the information in a meaningful way that
> made sense to biologists and still showed the direction of the
> has_part relationship.
>
>> Furthermore, the fact that we choose to make this terminological distinction
>> has absolutely no bearing on the behavior of all tools which will draw
>> things like this
>>
>>   target
>>     [rel] subject
>>
>> in denormalized tree displays
>>
>> and
>>
>>   target
>>     ^
>>     |
>>   subject
>>
>> in graph displays
>>
>
> Yes, that's the crux of it, isn't it? For has_part, I don't think this
> is the correct way to draw the terms. You've talked a bunch about
> sentences and order, but there are other analogies from English that
> people are also used to looking at, such as outline form, where the
> more specific thing is below and to the right of the more general
> thing. For has_part, I would rather see this:
>
>   subject
>     |
>     \/
>   target
>
> and invert the direction of the arrow rather than the biologically
> meaningful parent/child relationship.
>
>>> Hiding the has_part relationship so that only ontology editors see it is
>>> also not an acceptable permanent solution. Eurie and Mike are not pleased
>>> that the introduction of the has_part relationship, and its absence from
>>> the main file, means that SGD has lost information from our displays that
>>> used to be present. So, we need to be displaying has_part to users, and in
>>> a way that makes sense.
>>>
>> I'm sorry that you're not happy with the introduction of the has_part
>> relationship. It was discussed for some time on this list prior to its
>> introduction, as was the plan regarding gene_ontology_ext. I'm not sure why
>> you, Mike or Eurie have not mentioned this until now.
>>
>
> I was intimately involved in the introduction of the has_part
> relationship and the development of the spliceosomal complex terms
> that brought it in; I may have even suggested it in the SF item.
> Personally, I think it is far better than the horrible kludgy
> terms we made for a similar situation in the TFIIH complex terms.
>
> However, I do not think that we fully considered the impact of
> introducing this relationship, and using it to replace existing
> part_of relationships. While it is true that we have not introduced
> something new and potentially confusing into the main users' file,
> we have also effectively removed information that used to be present
> in that file.
>
>> From the above, I'm not quite sure what it is SGD is not happy about - the
>> introduction of the has_part relationship or its absence from the main GO
>> file?
>>
>
>> If the latter then SGD is free to use gene_ontology_ext. Be aware you will
>> have to modify your software to avoid the unintuitive display seen in
>> oboedit. You will also have to make sure the software developers understand
>> the semantics of ontology relationships. With the introduction of has_part
>> it's untenable to carry on using basic DAG traversal algorithms. I'm very
>> happy to talk with the software developers of the various MODs and tool
>> developers to help them understand this (I've already done this with a few).
>> I also appreciate that we could do with more extensive documentation here.
>>
>
> Not exactly either of those things. Mike and Eurie are not happy with
> the fact that SGD's displays are now missing information that used to
> be present via part_of relationships. Our displays and tools no longer
> show a connection between things like the U5 snRNP and any
> spliceosomal complex.
>
> However, we are not yet in a position to use gene_ontology_ext because
> we are aware that there is a lot of software that would need to be
> modified in order for us to load that file and we have other
> priorities at the moment.
>
>>> So, I still contend that the fact that the has_part relationship is
>>> inverting the display of terms from the way that makes sense is NOT OK.
>>>
>> - we knew the introduction of has_part would contradict assumptions and
>>   confuse people, which is why its introduction was delayed for so long and
>>   it was introduced very carefully
>> - all software constructed according to assumptions surrounding the original
>>   GO will display has_part in an unintuitive way in tree-type displays. It
>>   will also produce the wrong answers to queries.
>> - no one is saying this is a good thing, that's just how it is.
>> - this is why we only expose it in gene_ontology_ext, for sufficiently
>>   advanced tools
>> - this solution is not ideal for GO ontology developers such as yourself
>> - we should constructively work towards better solutions
>>
>> One improvement is to simply not show the has_part relationship in
>> denormalized tree displays such as the ontology tree editor. I'm not saying
>> this is a panacea or that it is 100% perfect. It's just a practical, simple,
>> achievable step that doesn't involve completely rewriting display
>> algorithms. That's all.
>>
>
> While I don't like the way it's displaying now, I'm not sure I would
> consider removing has_part relationships from displays in OE to be an
> improvement, personally. It seems that would make it even harder to
> detect that there is an existing relationship between something like
> the U2 snRNP and a spliceosomal complex term.
>
>> Let's consider other options that may involve some software rewriting (for
>> everyone, not just OE).
>> I presume you would prefer something that preserves
>> a visual structure with the broader entity at the top and the narrower at
>> the bottom, such as:
>>
>>   U2-type spliceosomal complex
>>     U2-type prespliceosome
>>       U1 snRNP
>>
>> (my assumptions may be wrong, correct me if they are)
>>
>> Here are the transformation steps required to get tools to display things in
>> this way:
>>
>> Given the ontology contains the relationships:
>>
>>   (all) U2-type prespliceosome [has_part] (some) U1 snRNP
>>   (all) U2-type prespliceosome [is_a] U2-type spliceosomal complex
>>
>> the tool has to first infer from the relationship:
>>
>>   (all) U2-type prespliceosome [has_part] (some) U1 snRNP
>>
>> the inverse relationship:
>>
>>   U1 snRNP [part_of_all] U2-type prespliceosome
>>
>> ****note the relation**** We haven't explicitly named the type-level
>> inverse of has_part until now. I'm using [part_of_all] here to indicate the
>> unusual inverted all-some direction but am open to other names. The
>> semantics are:
>>
>>   X part_of_all Y <-> every instance of Y (instance level) has_part
>>   some X
>>
>> Hopefully everyone understands why part_of and has_part are not inverses on
>> the type level:
>>
>>   (all) U2-type prespliceosome [has_part] (some) U1 snRNP -- TRUE
>>   (all) U1 snRNP [part_of] (some) U2-type prespliceosome -- FALSE
>>
>> If the tool is then configured to hide has_part but to show the inferred
>> inverse then default display algorithms will show:
>>
>>   U2-type spliceosomal complex
>>     [is_a] U2-type prespliceosome
>>       [part_of_all] U1 snRNP
>>
>> Is this the kind of thing you are getting at? If not, some diagrams would
>> help me. It's also easy for me to come down to SGD and discuss this with a
>> whiteboard to help.
>>
>> Assuming that it is -- with a bit of work it would be possible to get OE to
>> show things this way. It would be considerably more work to do this across
>> the board for all tools that visualize the GO in some way.
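
The transformation steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how OBO-Edit or any other tool is implemented: the (subject, relation, target) tuple layout and the names `ASSERTED`, `add_inferred_inverses` and `render_tree` are hypothetical, and the graph is assumed to be acyclic. The relation name `part_of_all` is the provisional name proposed in the message.

```python
# Relationships from the thread, written as (subject, relation, target).
# The tuple layout and all function names here are illustrative assumptions.
ASSERTED = [
    ("U2-type prespliceosome", "has_part", "U1 snRNP"),
    ("U2-type prespliceosome", "is_a", "U2-type spliceosomal complex"),
]


def add_inferred_inverses(triples):
    """Hide each has_part and show its type-level inverse instead.

    X has_part Y becomes Y part_of_all X. It is deliberately NOT rewritten
    as Y part_of X, because part_of and has_part are inverses only at the
    instance level.
    """
    display = []
    for subject, relation, target in triples:
        if relation == "has_part":
            display.append((target, "part_of_all", subject))
        else:
            display.append((subject, relation, target))
    return display


def render_tree(triples, root):
    """Render a denormalized tree: target on top, subjects indented below.

    Assumes the relationships form an acyclic graph.
    """
    children = {}
    for subject, relation, target in triples:
        children.setdefault(target, []).append((relation, subject))

    def walk(term, label, depth):
        lines = ["  " * depth + label]
        for relation, subject in sorted(children.get(term, [])):
            lines.extend(walk(subject, f"[{relation}] {subject}", depth + 1))
        return lines

    return walk(root, root, 0)


if __name__ == "__main__":
    shown = add_inferred_inverses(ASSERTED)
    print("\n".join(render_tree(shown, "U2-type spliceosomal complex")))
```

Run as a script, this prints the three-level tree shown in the message, with `[part_of_all] U1 snRNP` at the bottom. The display code itself is an ordinary tree renderer; only the relationship list it is given has changed, which is why the result can look more transitive than the ontology actually is.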
>>
>> I would strongly advocate that even given sufficient developer hours it is
>> better *not* to display the ontology in this way at all. It perhaps looks
>> more comforting, but people will make the same comforting assumptions that
>> no longer hold. For example, it looks like there is some kind of transitive
>> relationship between U1 snRNP and U2-type spliceosomal complex ***which
>> there is not***. It looks like the true path rule might hold. ***it does
>> not***. Queries for U2-type spliceosomal complex should ***not*** return
>> gene products localized to U1 snRNP complex.
>>
>
> Why should "Queries for U2-type spliceosomal complex should ***not***
> return gene products localized to U1 snRNP complex"? I would have
> thought that they should. If I wanted to know the parts of "U2-type
> spliceosomal complex" I would want to know all the things that compose
> the series of complexes that are all considered to be a "U2-type
> spliceosomal complex".
>
> [Note that we need to revise the defs of "U2-type spliceosomal complex
> ; GO:0005684" (and its sibling "U12-type spliceosomal complex ;
> GO:0005689") to be consistent with the def of the parent term
> "spliceosomal complex ; GO:0005681" and specify that these terms
> represent series of complexes. I'll submit a SF item for this.]
>
>> I'm happy to be overruled by majority vote here and consider recommending
>> this sort of display for all tools. I've had this discussion a few times
>> before with others, most people start off wanting to retain the comforting
>> broad/narrow visual structure, but on understanding the semantics of
>> has_part change their mind.
>>
>
> I prefer to call the broad/narrow structure sensible with respect to
> the biology. It seems that we can display the fact that the has_part
> relationship goes in the opposite direction by changing the direction
> of the arrow rather than by destroying the broad/narrow display that
> conveys different information.
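
The query behaviour under discussion can be made concrete with a toy example. The sketch below illustrates only the semantics Chris describes and does not settle the disagreement about what users want to see. The term names come from the thread; the gene products `geneA` and `geneB`, their annotations, and the function names are invented for illustration.

```python
# Toy data. Term names are from the thread; geneA/geneB and their
# annotations are hypothetical and exist only for this example.
RELATIONSHIPS = [
    ("U2-type prespliceosome", "is_a", "U2-type spliceosomal complex"),
    ("U2-type prespliceosome", "has_part", "U1 snRNP"),
]
ANNOTATIONS = {
    "geneA": "U2-type prespliceosome",
    "geneB": "U1 snRNP",
}


def terms_implying(query, relationships, followed):
    """Return the query term plus every term that reaches it.

    Relationships are walked from target back to subject, and only the
    relations named in `followed` are traversed.
    """
    found = {query}
    frontier = [query]
    while frontier:
        current = frontier.pop()
        for subject, relation, target in relationships:
            if target == current and relation in followed and subject not in found:
                found.add(subject)
                frontier.append(subject)
    return found


def gene_products_for(query, relationships, annotations,
                      followed=("is_a", "part_of")):
    """Gene products annotated to the query term or to a term implying it."""
    terms = terms_implying(query, relationships, followed)
    return sorted(gene for gene, term in annotations.items() if term in terms)


if __name__ == "__main__":
    # Relation-aware query: only is_a and part_of are followed.
    print(gene_products_for("U2-type spliceosomal complex",
                            RELATIONSHIPS, ANNOTATIONS))
    # Naive DAG traversal: every relationship is treated as child-to-parent.
    print(gene_products_for("U1 snRNP", RELATIONSHIPS, ANNOTATIONS,
                            followed=("is_a", "part_of", "has_part")))
```

With the default setting, a query for U2-type spliceosomal complex returns only `geneA`, so the product annotated to U1 snRNP is not picked up. When has_part is followed as if it were an ordinary child-to-parent link, a query for U1 snRNP also returns `geneA`, which is the kind of wrong answer that basic DAG traversal is said to produce.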
>
>> There are some alternatives for tree type displays. One is something like
>> this:
>>
>>   nucleus
>>     [p] nuclear part
>>       [i] small nuclear ribonucleoprotein complex
>>         [i] U1 snRNP [part_of_all : U2-type spliceosome,
>>                       penta-snRNP complex, ...]
>>         [i] U2 snRNP [part_of_all : U2-type spliceosome,
>>                       penta-snRNP complex, ...]
>>
>> This maintains the convention of having the direction of implication flow
>> from bottom right to top left, i.e. the true path rule. Every U1 snRNP is
>> part_of the nucleus.
>>
>> I think it does make sense to show has_part in graph displays without
>> introducing any kind of relation transformation. It might be possible to
>> change the layout algorithm such that vertical layout correlates with
>> relative size yet the arrows still accurately depict the ontology
>> relationships. I include an example at the end of this file (although line
>> crossing will always be a massive problem here).
>>
>> Hopefully this is at least partly convincing you that this is not a bug and
>> that oboedit and other tools are just accurately depicting the relationships
>> in a consistent manner, and that solutions to the fact that this is
>> unintuitive are non-trivial.
>>
>
> Oh, I believe you that it's not a bug, and that solutions are
> non-trivial. I would consider this a significant design flaw, and it
> is very important that we fix it. This is not just for OE: we are going
> to have to solve this issue in order to present reasonable displays to
> users, many of whom will probably never understand ontology design.
>
> _______________________________________________
> Geneontology-oboedit-working-group mailing list
> Gen...@li...
> https://lists.sourceforge.net/lists/listinfo/geneontology-oboedit-working-group
>

--
Alexander D. Diehl, Ph.D.
Senior Scientific Curator
Mouse Genome Informatics
The Jackson Laboratory
600 Main Street
Bar Harbor, ME 04609

email: ad...@in...
work: +1 (207) 288-6427
fax: +1 (207) 288-6131