From: Werner C. <wer...@ec...> - 2005-09-03 09:02:22
----- Original Message -----
From: "Chris Mungall" <cj...@fr...>
To: <obo...@li...>

> So far, we have two relations that are relevant.
> The relation "has_participant" between a process and an object,
> defined as follows:
> "P has_participant C if and only if: given any process p that
> instantiates P there is some continuant c, and some time t, such
> that: c instantiates C at t and c participates in p at t"
> Our ontology also includes has_agent:
> "As for has_participant, but with the additional condition that the
> component instance is causally active in the relevant process"
> Are these two relations enough to define certain GO process classes in
> terms of the continuants they involve?

No, they are not, although the issue is not to introduce relations in
terms of the continuants they involve, but rather in terms of the roles
played by participants in the processes.

> The GO class "T cell proliferation" can be defined using the genus
> class "proliferation", with the differentia being the characteristic
> of (informally) "operating upon populations of T cells". What should
> the formal relation replacing "operating upon populations of" be?

No. It is bad practice to introduce relationships in which continuants
such as "population" are embedded. The solution is to introduce
additional relations in the style of has_agent. First, some further
differentiation needs to be made among the types of processes, primarily
according to the number of participants they necessarily require in
order to exist. Some processes require only one participant, such as the
active motion of a cell from one place to another. This one participant,
however, participates in two ways: has_agent, because the actively
moving cell is causally involved in its motion, but also has_theme,
because it undergoes the process. Compare this with throwing, where one
entity (as agent) causes another entity (the theme) to move.
Another example is the "patient" relationship, where the participant
undergoes some change during the process without itself being "actively"
involved in it. So the first task is to study the various processes in
GO to find the different sorts of participation they involve. This topic
is well studied in linguistics under theories such as "thematic roles",
"theta-roles", ... but these concentrate primarily on how processes are
verbalised in language, thereby ignoring ontological facts. So linguists
say that the ergative version of a verb takes at least two roles (He
slammed the door shut), and the non-ergative just one (the door slammed
shut). But ontologically, in the non-ergative version too, some force
must have caused the door to slam shut.

> Here is the existing GO, non-computable natural language definition of
> T-cell proliferation:
>
> "The rapid expansion of a T cell population by cell division. Follows
> T cell activation."
>
> One could therefore create a relation "acts_on_population_of". Before
> doing so, we should consider if an existing relation will suffice, or
> if we can come up with a more re-usable relation.

No new relation is needed here; has_agent does fine:
T-cell proliferation has_agent T-cell population.

> Consider an existing relation: "has_participant" - is this relation
> correct to use here; and if it is correct, is it specific enough to
> make our definition of "T cell proliferation" unambiguous
> to a computer?
>
> Is it correct? Well, every instance of T cell proliferation does
> indeed have a T cell as a participant, so the class level definition
> is not violated. But it seems the actual participant is the
> population, not any one individual cell. Should we create a new
> relation (something like "has_participants_from_population_of"), or
> should we create an entirely new continuant class "T cell population"
> defined in terms of "T cell"?

Surely, the latter.

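
To make the class-level reading concrete, here is a minimal Python sketch of the has_participant definition quoted at the top, applied to T cell proliferation with "T cell population" as its own continuant class. All instance names and the toy data are hypothetical, and instantiation is not time-indexed here, which simplifies the quoted definition.

```python
from dataclasses import dataclass

# Toy instance-level facts; every name here is illustrative, not taken from GO or OBO.
@dataclass(frozen=True)
class Participation:
    process: str      # process instance p
    continuant: str   # continuant instance c
    time: int         # time t at which c participates in p

# instance -> class it instantiates (assumption: not time-indexed, for brevity)
INSTANCE_OF = {
    "prolif1": "T cell proliferation",
    "prolif2": "T cell proliferation",
    "pop1": "T cell population",
    "pop2": "T cell population",
}

# In this toy data only the population is recorded as a participant.
PARTICIPATIONS = [
    Participation("prolif1", "pop1", 1),
    Participation("prolif2", "pop2", 5),
]

def has_participant(process_class, continuant_class):
    """Class-level all-some reading: every instance of process_class has
    some participant (at some time) that instantiates continuant_class.
    Vacuously true when process_class has no instances."""
    processes = [i for i, c in INSTANCE_OF.items() if c == process_class]
    return all(
        any(x.process == p and INSTANCE_OF.get(x.continuant) == continuant_class
            for x in PARTICIPATIONS)
        for p in processes
    )

print(has_participant("T cell proliferation", "T cell population"))  # True
```

A has_agent check would add one more condition per participation (that the participant is causally active); that flag is left out of the sketch.
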
> The latter is logically sound and can be implemented without actually
> generating a separate parallel ontology of aggregate classes by using
> anonymous or skolem classes. But I would prefer we kept things simple
> and went with a relation between a process and a non-aggregate
> continuant.

I don't see why this would be more complex.

> Secondly, should the relation be more specific?

Yes, that's why I used agency. But of course, this alone does not
capture everything that immediately comes to a biologist's mind when
thinking of T-cell proliferation (in fact, it captures just a little
more than nothing).

> Does it need to state
> the nature of the participation - for example, that the population at
> time t1 is necessarily the same as or lower than at time t2, where
> t2>t1 and t1 and t2 are within the temporal boundaries of the process?
> I believe the relation should not state this, since this should be
> defined in the class "proliferation".

If you want to have more, then you need at least a relation, let's call
it member_of, between a collection and a member, such as between a
T-cell population and a T-cell. Then you can describe T-cell
proliferation as the process by which members of the population at t2
are derived from (using the third type derives_from relation) members of
the population at t1 < t2.

> Another example illustrating the population-instance vs unary-instance
> distinction is GO classes relating to "homeostasis". Instances of the
> GO process "cell homeostasis" have a single cell as participant. With
> "T-cell homeostasis" it is a participation relation between a process
> and a population of cells. We cannot state both:
>
>   cell homeostasis <=> homeostasis, has_participant = cell
>   T cell homeostasis <=> homeostasis, has_participant = T cell
>
> Since a reasoner will (erroneously) believe that T cell homeostasis is
> a kind of cell homeostasis.
> We either need to introduce a new
> relation, or explicitly represent the concept of aggregates of
> continuants.

You just have to state

  T cell homeostasis <=> homeostasis, has_participant = T cell population

But here also, I would use has_agent.

> Let's look at an example illustrating a different principle: cell
> differentiation. Clearly the process of cell differentiation has
> "cell" as participant. An instance of a cell undergoing
> differentiation will instantiate a class C at some time t and a class
> C' at some later time t'. The class relation between C' and C is
> derives_from, but what about the instance level relation between the
> cell and the process? Is it participating at both t and t'
> inclusively? Or does it cease participating prior to t'?

That is described in our GeneBio paper, isn't it?

> Taking a practical example illustrating this, we want to form a
> computable definition of "plasmatocyte differentiation". The natural
> language definition is:
>
> "The processes by which a hemocyte precursor cell acquires the
> characteristics of the phagocytic blood-cell type, the
> plasmatocyte. Plasmatocytes are a class of arthropod hemocytes
> important in the cellular defense response."
>
> Here, C is "hemocyte precursor" and C' is "phagocyte". The
> derives_from relation between the latter and the former is trivially
> obtainable (plasmatocyte is_a hemocyte, X derives_from X precursor).
>
> But what is the relation between the process class "plasmatocyte
> differentiation" and the continuant class "plasmatocyte"? Is it one of
> participation, or do we consider the process of differentiation to be
> complete as soon as the plasmatocyte has fully differentiated? If so,
> we must choose another relation in place of has_participant.

Everything is captured by the derives_from relationship, i.e. after
completion.
It is a totally different story if you want to describe the
relationships that obtain between the various entities involved while
the derivation is taking place.

> If we consider it to be a participant, then surely the precursor is
> also a participant. In this case our class level relation should
> discriminate between these variations of participation (although we
> can actually obtain this from the derives_from relation between the
> two continuant classes, this is a little awkward)

No, it is not awkward. The same entity can enjoy different sorts of
participation in the same process. See my example above.

> An analogous example would be "cysteine biosynthesis", which has less
> debatable temporal boundaries. Is cysteine a participant in its own
> manufacture? If not, what is the appropriate relation to use?

There is some process through which a cysteine molecule is created. The
molecule starts to exist at a certain time, built out of other
components. Again, I think that here the transformation and derivation
relationships can be used, but to be sure, I must first check what
cysteine is composed of.

> I don't think has_agent is appropriate for any of the above cases -
> the agent would be some external factor, cell, reactant.

Well, has_agent is still a very general relationship. There are more
detailed kinds of causally involved participant, such as "instrument",
"enabler", "catalyst", "hamperer", "preventer", ... (I'm just giving
some names to make the issues clear).

> An example illustrating the agency/non-agency participation
> distinction: "neuroblast activation". Here the neuroblast is a
> participant in the activation, but without agency (it is the thing
> being activated).

Hence "patient".

> The activating factor/ligand is the agent.

Or the "enabler" or "trigger"? It might do nothing more than start the
activation, in the same way as in a race, the guy with the gun gives the
sign that the runners may start doing their thing.
Mind again the very tricky role language plays in all this.

> Do we
> need an additional relation has_inactive_participant where we
> specifically wish to indicate that the role of the participant is
> patient?

Sure! And "patient", as said above, is the type of relationship you are
after here.

> It seems that agency invokes causation which is difficult.
> Fortunately, in the majority of relations we would like to add between
> process and cell the active participant can be left implicit.
>
> These examples illustrate the most problematic cases.

Well, it does not seem to be difficult at all.

> Below is
> what I think is the full list of GO biological processes that
> have some relation to cell or cells that is important to state in the
> computable definition. I have marked every class where I think the
> relation between the process and the cell is clearly one where the
> has_participant relation will be both correct and specific enough.
>
> * chemotaxis
>   differentiation (see above)
>     similar classes:
>       development
>       morphogenesis
>       formation
>       generation
>       germination
> * division
> * degeneration
> * death
> * growth
>   homeostasis (see above)
>   activation (see above)
>   proliferation (see above)
> * maturation
> * migration
> * anchoring (eg oocyte nucleus anchoring)
> * branching (eg trichome branching)
> * construction (eg oocyte construction)
> * elongation (eg spermatid nuclear elongation)
> * fusion
> * polarization
> * positioning
> * transition (eg epithelial to mesenchymal transition)
>   commitment (eg cone cell fate commitment)
>
> We will be adding these relations fairly soon, so I'd favour a
> solution that would be simple to implement in the short term leaving
> open the option for adding extra rigour later.
>
> Cheers
> Chris
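
The homeostasis problem discussed in the thread can be sketched in a few lines of Python. This is only an illustration under assumptions: the is_a table, the tuple encoding of genus-differentia definitions, and the structural subsumption rule are all made up for the example and are far simpler than what a real reasoner does.

```python
# Asserted is_a links (child -> parent). Class names follow the thread;
# the table itself is a toy assumption.
IS_A = {
    "T cell": "cell",
    "cell": None,
    "T cell population": None,   # a collection of cells, deliberately not a kind of cell
    "homeostasis": None,
}

def is_subclass(child, parent):
    """Walk the asserted is_a chain upward from child."""
    while child is not None:
        if child == parent:
            return True
        child = IS_A.get(child)
    return False

def subsumes(general, specific):
    """general subsumes specific if the genus of specific is a subclass of the
    genus of general, and every differentia of general is matched in specific
    by the same or a more specific filler."""
    g_genus, g_diff = general
    s_genus, s_diff = specific
    if not is_subclass(s_genus, g_genus):
        return False
    return all(
        rel in s_diff and is_subclass(s_diff[rel], filler)
        for rel, filler in g_diff.items()
    )

# Definitions encoded as (genus, {relation: filler}).
cell_homeostasis = ("homeostasis", {"has_participant": "cell"})
naive = ("homeostasis", {"has_participant": "T cell"})
with_population = ("homeostasis", {"has_participant": "T cell population"})

print(subsumes(cell_homeostasis, naive))            # True: the unwanted inference
print(subsumes(cell_homeostasis, with_population))  # False
```

With "T cell" as the filler, the toy check classifies T cell homeostasis under cell homeostasis, because T cell is_a cell. With "T cell population" as the filler the inference disappears, since a population is not asserted to be a kind of cell.
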