From: Chris M. <cj...@fr...> - 2005-09-02 23:57:59
I have some questions regarding relations between processes and continuants, specifically cells. There is some urgency to these questions, as we are about to embark on the integration of GO biological process with ontologies of continuants, commencing with the OBO Cell ontology and then moving on to the EBI chemical ontology.

So far, we have two relations that are relevant. The first is the relation "has_participant" between a process and an object, defined as follows:

"P has_participant C if and only if: given any process p that instantiates P there is some continuant c, and some time t, such that: c instantiates C at t and c participates in p at t"

The definition does not say much about the nature of this participation. The paper has the following examples:

  cell transport has_participant cell
  death has_participant organism
  breathing has_participant thorax

I'm assuming familiarity with the discussion of instance level and class level relations in the paper.

Our ontology also includes has_agent:

"As for has_participant, but with the additional condition that the component instance is causally active in the relevant process"

Are these two relations enough to define certain GO process classes in terms of the continuants they involve?

The GO class "T cell proliferation" can be defined using the genus class "proliferation", with the differentia being the characteristic of (informally) "operating upon populations of T cells". What should the formal relation replacing "operating upon populations of" be? Here is the existing GO, non-computable natural language definition:

"The rapid expansion of a T cell population by cell division. Follows T cell activation."

One could therefore create a relation "acts_on_population_of". Before doing so, we should consider whether an existing relation will suffice, or whether we can come up with a more re-usable relation.
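To make the class-level definition of has_participant quoted above concrete, here is a minimal sketch of how it could be checked against instance-level data. The data layout, the `Participation` tuple, the function name and the instance identifiers (p1, c1 and so on) are all assumptions made up for illustration; none of this comes from the paper or from any existing tool.

```python
from typing import NamedTuple


class Participation(NamedTuple):
    continuant: str  # instance-level continuant c (hypothetical id)
    process: str     # instance-level process p (hypothetical id)
    time: int        # time t at which c participates in p


def has_participant(P, C, process_instances, instance_of_at, participations):
    """Class-level check: every p instantiating P has some c and t such that
    c instantiates C at t and c participates in p at t."""
    for p in process_instances.get(P, ()):
        found = any(
            part.process == p
            and C in instance_of_at.get((part.continuant, part.time), set())
            for part in participations
        )
        if not found:
            return False
    # Note: vacuously True when P has no recorded instances.
    return True


# Hypothetical instance data: two proliferation events, each with one T cell.
process_instances = {"T cell proliferation": ["p1", "p2"]}
instance_of_at = {
    ("c1", 0): {"T cell", "cell"},
    ("c2", 5): {"T cell", "cell"},
}
participations = [Participation("c1", "p1", 0), Participation("c2", "p2", 5)]

holds_for_t_cell = has_participant(
    "T cell proliferation", "T cell",
    process_instances, instance_of_at, participations,
)
holds_for_b_cell = has_participant(
    "T cell proliferation", "B cell",
    process_instances, instance_of_at, participations,
)
```

Note that the check only asks for some participating instance of C; it says nothing about whether that instance is a single cell or stands in for a population, which is the gap discussed next.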
Consider an existing relation: "has_participant". Is this relation correct to use here; and if it is correct, is it specific enough to make our definition of "T cell proliferation" unambiguous to a computer?

Is it correct? Well, every instance of T cell proliferation does indeed have a T cell as a participant, so the class level definition is not violated. But it seems the actual participant is the population, not any one individual cell. Should we create a new relation (something like "has_participants_from_population_of"), or should we create an entirely new continuant class "T cell population" defined in terms of "T cell"? The latter is logically sound and can be implemented without actually generating a separate parallel ontology of aggregate classes by using anonymous or skolem classes. But I would prefer we kept things simple and went with a relation between a process and a non-aggregate continuant.

Secondly, should the relation be more specific? Does it need to state the nature of the participation - for example, that the population size at time t1 is necessarily the same as or lower than the population size at time t2, where t2>t1 and t1 and t2 are within the temporal boundaries of the process? I believe the relation should not state this, since this should be defined in the class "proliferation".

Another example illustrating the population-instance vs unary-instance distinction is GO classes relating to "homeostasis". Instances of the GO process "cell homeostasis" have a single cell as participant. With "T cell homeostasis" it is a participation relation between a process and a population of cells. We cannot state both:

  cell homeostasis <=> homeostasis, has_participant = cell
  T cell homeostasis <=> homeostasis, has_participant = T cell

since a reasoner will (erroneously) conclude that T cell homeostasis is a kind of cell homeostasis. We either need to introduce a new relation, or explicitly represent the concept of aggregates of continuants.
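The unwanted inference can be reproduced with a toy structural subsumption check over genus-differentia definitions. This is only a sketch under simplifying assumptions: each class has at most one is_a parent, each differentia is a single relation/filler pair, and the `subsumes` function is something I have made up for illustration rather than the behaviour of any particular reasoner.

```python
# Toy is_a hierarchy (single parent per class, an assumption of this sketch).
IS_A = {"T cell": "cell", "cell": None, "homeostasis": None}


def ancestors_or_self(cls):
    """Return cls followed by its is_a ancestors."""
    out = []
    while cls is not None:
        out.append(cls)
        cls = IS_A.get(cls)
    return out


def subsumes(general, specific):
    """True if 'specific' is inferred to be a kind of 'general'.

    Each definition is (genus, {relation: filler}). The specific class must
    have a genus at or below the general genus, and for every differentia of
    the general class it must use the same relation with a filler at or below
    the general filler.
    """
    g_genus, g_diff = general
    s_genus, s_diff = specific
    if g_genus not in ancestors_or_self(s_genus):
        return False
    return all(
        rel in s_diff and filler in ancestors_or_self(s_diff[rel])
        for rel, filler in g_diff.items()
    )


cell_homeostasis = ("homeostasis", {"has_participant": "cell"})
t_cell_homeostasis = ("homeostasis", {"has_participant": "T cell"})
# Same class, defined with the candidate population-level relation instead.
t_cell_homeostasis_pop = (
    "homeostasis",
    {"has_participants_from_population_of": "T cell"},
)

print(subsumes(cell_homeostasis, t_cell_homeostasis))      # True (unwanted)
print(subsumes(cell_homeostasis, t_cell_homeostasis_pop))  # False
```

With has_participant on both sides the subsumption goes through because T cell is_a cell; using a distinct relation (or a distinct aggregate filler class) is enough to block it.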
Let's look at an example illustrating a different principle: cell differentiation. Clearly the process of cell differentiation has "cell" as participant. An instance of a cell undergoing differentiation will instantiate a class C at some time t and a class C' at some later time t'. The class relation between C' and C is derives_from, but what about the instance level relation between the cell and the process? Is it participating at both t and t' inclusively? Or does it cease participating prior to t'?

Taking a practical example illustrating this, we want to form a computable definition of "plasmatocyte differentiation". The natural language definition is:

"The processes by which a hemocyte precursor cell acquires the characteristics of the phagocytic blood-cell type, the plasmatocyte. Plasmatocytes are a class of arthropod hemocytes important in the cellular defense response."

Here, C is "hemocyte precursor" and C' is "phagocyte". The derives_from relation between the latter and the former is trivially obtainable (plasmatocyte is_a hemocyte, X derives_from X precursor). But what is the relation between the process class "plasmatocyte differentiation" and the continuant class "plasmatocyte"? Is it one of participation, or do we consider the process of differentiation to be complete as soon as the plasmatocyte has fully differentiated? If so, we must choose another relation in place of has_participant. If we consider it to be a participant, then surely the precursor is also a participant. In this case our class level relation should discriminate between these variations of participation (we can actually obtain this from the derives_from relation between the two continuant classes, but this is a little awkward).

An analogous example would be "cysteine biosynthesis", which has less debatable temporal boundaries. Is cysteine a participant in its own manufacture? If not, what is the appropriate relation to use?
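The two readings of the participation interval can be compared with a small sketch. The integer time points, the dictionary layout and the function name are assumptions for illustration only; the point is just that the choice between participating up to and including t' or ceasing before t' decides whether "plasmatocyte" comes out as a participant class at all.

```python
def classes_while_participating(instantiates_at, participation_times):
    """Classes the cell instantiates at some time at which it participates."""
    return {
        instantiates_at[t] for t in participation_times if t in instantiates_at
    }


# Hypothetical single cell: a hemocyte precursor at times 0-2,
# a plasmatocyte from time 3 onwards.
instantiates_at = {
    0: "hemocyte precursor",
    1: "hemocyte precursor",
    2: "hemocyte precursor",
    3: "plasmatocyte",
}
t, t_prime = 0, 3

# Reading 1: the cell participates at both t and t' inclusively.
inclusive = classes_while_participating(instantiates_at, range(t, t_prime + 1))

# Reading 2: the cell ceases participating prior to t'.
exclusive = classes_while_participating(instantiates_at, range(t, t_prime))
```

Under the first reading both the precursor and the plasmatocyte satisfy has_participant, so the relation alone does not tell input from output; under the second only the precursor does, and some other relation is needed to link the process to the plasmatocyte.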
I don't think has_agent is appropriate for any of the above cases - the agent would be some external factor, cell, or reactant. An example illustrating the agency/non-agency participation distinction: "neuroblast activation". Here the neuroblast is a participant in the activation, but without agency (it is the thing being activated). The activating factor/ligand is the agent. Do we need an additional relation has_inactive_participant where we specifically wish to indicate that the role of the participant is patient? It seems that agency invokes causation, which is difficult. Fortunately, in the majority of relations we would like to add between process and cell, the active participant can be left implicit.

These examples illustrate the most problematic cases. Below is what I think is the full list of GO biological processes that have some relation to cell or cells that is important to state in the computable definition. I have marked with * every class where I think the relation between the process and the cell is clearly one where the has_participant relation will be both correct and specific enough.

* chemotaxis
  differentiation (see above)
    similar classes: development, morphogenesis, formation, generation, germination
* division
* degeneration
* death
* growth
  homeostasis (see above)
  activation (see above)
  proliferation (see above)
* maturation
* migration
* anchoring (eg oocyte nucleus anchoring)
* branching (eg trichome branching)
* construction (eg oocyte construction)
* elongation (eg spermatid nuclear elongation)
* fusion
* polarization
* positioning
* transition (eg epithelial to mesenchymal transition)
  commitment (eg cone cell fate commitment)

We will be adding these relations fairly soon, so I'd favour a solution that would be simple to implement in the short term, leaving open the option of adding extra rigour later.

Cheers
Chris