From: Chris M. <cj...@fr...> - 2007-10-31 00:20:50
I think this deserves elevating to list discussion. There is ongoing concern and confusion about what to do with food and bits of organisms.

Ideally food, animal parts and derivatives can be described compositionally, possibly recursively, using simpler classes; for example:

  termite gut = gastrointestinal_tract that part_of termite
  ear gunk = bodily_substance that derived_from ear
  bird feces = feces that excreted_from bird
  wool = fiber that derived_from (fur that part_of Ovis)
  armpit sweat = sweat that located_in armpit
  cheese = substance that derived_from milk
  blue cheese = cheese that has_part Penicillium
  spoiled blue cheese = blue_cheese that has_quality spoiled
  bone meal = bones that has_quality crushed and has_role animal_food and has_role fertilizer

These compositions can easily be made at ontology development time (this is true in general of oboedit, protege etc), or they can be made at annotation time (phenote makes this simple). That part is easy.

However, the hard part is where we find the simpler terms to use in order to compose the more complex terms.

Let's take anatomical entities first. If all host environments were human then this problem would be solved: we would simply use classes from the FMA. If host environments were also located in fruitflies, zebrafish, mice or frogs we'd still be in the clear. But this is not the case. Take termite gut. It may be tempting to take the closest terms from a species-specific AO, e.g. the fly anatomy. But this would be wrong. Note that CARO will not help us here, as CARO has stalled at the high upper level and goes no more specific than organ.

I see no alternative here other than to simply bite the bullet and start the bottom-up creation of a species-neutral anatomical ontology. It doesn't have to be huge - only as large as is needed for annotation. It doesn't have to be as rigorous as the FMA (as that would bring the whole enterprise to a halt).
It does have to have clear definitions (otherwise we may as well use free text). At some point in the future development could and should be handed off to the appropriate people - but these people don't have funding yet!

We can then use the simple classes in this ontology (and others) to make more complex classes such as ear gunk or bird feces. It isn't so important whether the descriptions are pre- or post-composed. For those who don't have the informatics capacity to deal with post-composition we can have an application ontology off to the side of ENVO.

We face a similar problem with food. Ideally we would use some external food ontology. This would presumably have to meet some minimal requirements, such as a basis in biology/chemistry rather than culture or culinary arts. I don't know exactly what those requirements are; it depends on the questions we want to ask and on useful ways of clustering data for the purposes of discovery.

For example, before arguing about what to do with terms like "peanut butter and jam sandwich" we should think about what questions the ontology should be able to answer:

  - what are the parts? (bread, peanut butter, jam)
    -- recursively? (sugar content, oil viscosity, ubiquinone content)
  - what is it derived from? (legume, fruit, cereal)
    -- means of derivation (preservation, grinding, leavening)
  - does it spoil easily?
    -- what does it decompose into?
  - is it industrially prepared?

Presumably some of these are important; otherwise we can dispense with this issue, have a single term "food" and ignore all other distinctions. Once we have prioritized these and come up with realistic scenarios we should have a better idea how to proceed.

I think then our course of action will fall somewhere between two extremes:

1) collect somewhere off to the side of ENVO an unorganized list of food terms on an as-needed basis, with a view to eventually handing this off or merging with the food ontology.
Or, if the food ontology never transpires, gradually adding necessary and sufficient (N+S) conditions to the classes and classifying mostly automatically.

2) post-composition of all food descriptions. This could be from an ultra-minimalist list from the multi-species anatomy ontology:

  organism substance
  blood
  milk
  portion of tissue
  oil

Then we can compose descriptions easily using existing ontologies: RO, PATO, GO, ... + a taxonomy. We can really go to town and specify the N+S conditions for a PBJ sandwich based on mereological constraints (PBJ must be surrounded_by a bilayer of white bread) and derivation relations (PBJ derived_from peanut, oil, substance derived_from fruit+pectin). But this probably wouldn't be the best use of our time.

The middle ground is to have a minimal food ontology containing the basics, such as bread, preserved food etc, and to allow people to post-compose descriptions to the extent they feel is worthwhile.

There is also the issue of what is a food and what isn't. I don't think 'food' is such a good top level. There's organic matter and its derivatives. Some substances have roles as foods for some organisms (or cultures); other substances are toxic to other organisms. We can do this with a has_role relation.
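
To make the "genus that relation filler" pattern used in this message concrete, here is a minimal sketch in Python. It is only an illustration and not how oboedit, protege or phenote store such descriptions; the names Composed, that, render, simple_classes and roles are hypothetical helpers invented for this example. It shows two things: that a composed description can nest (as with wool), and that the simple classes a composition depends on can be listed mechanically, which is exactly the set of terms some ontology has to supply.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple, Union

# A term is either a simple class name or a composed description.
Term = Union[str, "Composed"]


@dataclass(frozen=True)
class Composed:
    """A genus plus (relation, filler) pairs; fillers may be composed too."""
    genus: Term
    differentiae: Tuple[Tuple[str, Term], ...]


def that(genus: Term, *differentiae: Tuple[str, Term]) -> Composed:
    """Hypothetical helper: build 'genus that rel filler and rel filler ...'."""
    return Composed(genus, tuple(differentiae))


def render(term: Term, nested: bool = False) -> str:
    """Write a term in the notation used in the message above."""
    if isinstance(term, str):
        return term
    parts = " and ".join(
        f"{rel} {render(filler, nested=True)}" for rel, filler in term.differentiae
    )
    text = f"{render(term.genus, nested=True)} that {parts}"
    return f"({text})" if nested else text


def simple_classes(term: Term) -> Iterator[str]:
    """Yield every simple class a description is built from, recursively."""
    if isinstance(term, str):
        yield term
        return
    yield from simple_classes(term.genus)
    for _rel, filler in term.differentiae:
        yield from simple_classes(filler)


def roles(term: Composed) -> List[str]:
    """Collect the simple fillers of has_role, e.g. to ask 'is this a food?'."""
    return [f for rel, f in term.differentiae
            if rel == "has_role" and isinstance(f, str)]


# Two of the examples from the message, built from simple classes.
wool = that("fiber", ("derived_from", that("fur", ("part_of", "Ovis"))))
bone_meal = that(
    "bones",
    ("has_quality", "crushed"),
    ("has_role", "animal_food"),
    ("has_role", "fertilizer"),
)

print(render(wool))
print(sorted(set(simple_classes(wool))))
print(roles(bone_meal))
```

Treating roles as ordinary relation/filler pairs, rather than putting 'food' at the top of a hierarchy, means the same substance can carry animal_food and fertilizer roles at once, which is the point made about has_role above.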