Re: [SO-devel] [Gmod-schema] Really basic question about mRNA and gene true path relation

Jim,

Here are a few points to keep in mind as you study Chado
structure and SO/SOFA relations:

Existing Chado databases are often structured by history and
necessity more than by current best ontology information, because
updating data structures is costly.  In some cases, FlyBase's Chado
structure over the last few years has not kept current with SO
ontologies and relations.

You observed: ".. sample data from Flybase it seems like it's common
[for FlyBase data managers] to use part_of when mRNAs...".  FlyBase
also uses an obsolete SO term "so", and other historical terms.

The mapping between SO's feature relations and Chado's feature
tables is not one-to-one. SO should be biologically correct while
Chado aims for computational usefulness. Storage of features
among the various tables in the database is guided by database needs.  
However, use of vocabulary terms in a chado database should reflect the
ontology schema rather than the database schema.

On this,
" I'm working on mapping various E. coli annotation sets to Chado.."

it is helpful to follow existing examples, but not too far if they
don't match current ontology relations. The main reason for following
example databases would be to keep software working. Software *should
be* adaptable to use different ontology relations.

At one point Chado software had 5 or 6 different choices for what
people called the SO or SOFA ontology in the CV table, and that wasn't
enough.  This may still be the case, and it gives you a simple example
of where a lack of standard practices affects the software.
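
A minimal sketch of how software might cope with the naming problem
described above: given the names found in a database's cv table, try a
list of candidate names for the SO/SOFA ontology. The candidate names
and the function name are assumptions made for illustration; they are
not the list any Chado tool actually ships with.

```python
# Candidate names are illustrative assumptions, not an authoritative list
# of what Chado databases call the SO/SOFA ontology.
SO_CV_CANDIDATES = ["sequence", "SO", "Sequence Ontology", "SOFA", "sofa"]


def find_so_cv_names(cv_names, candidates=SO_CV_CANDIDATES):
    """Return cv names that match a candidate, ignoring case.

    Results are ordered by candidate priority, without duplicates.
    """
    # Index the database's cv names by their lowercase form.
    by_lower = {}
    for name in cv_names:
        by_lower.setdefault(name.lower(), []).append(name)

    found = []
    for candidate in candidates:
        for name in by_lower.get(candidate.lower(), []):
            if name not in found:
                found.append(name)
    return found
```

An empty result means none of the candidates matched, so the list
would need extending for that database, which is the situation
described above.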

The GMODTools Chado output software I've written has various
choices for configuring ontology terms and gene model structures
for different database sources.  For example, the yeast Chado database
lacks mRNA features between the gene and protein levels.
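
As a rough illustration of per-source gene model configuration, the
sketch below keeps one list of feature levels per database source and
looks up the parent level of a feature type. The dictionary layout,
the "default" entry, and the function name are hypothetical; this is
not GMODTools' actual configuration format.

```python
# Hypothetical per-source gene model levels, ordered from top to bottom.
# This is NOT the real GMODTools configuration format.
GENE_MODEL_LEVELS = {
    "default": ["gene", "mRNA", "protein"],
    "yeast": ["gene", "protein"],  # no mRNA level between gene and protein
}


def parent_type(source, feature_type):
    """Return the feature type one level above feature_type, or None at the top.

    Unknown sources fall back to the "default" gene model.
    """
    levels = GENE_MODEL_LEVELS.get(source, GENE_MODEL_LEVELS["default"])
    if feature_type not in levels:
        raise ValueError(
            f"{feature_type!r} is not in the gene model for {source!r}"
        )
    index = levels.index(feature_type)
    return levels[index - 1] if index > 0 else None
```

With this layout, software asks the configuration for the parent of a
protein instead of assuming it is always an mRNA.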

-- Don Gilbert
-- d.gilbert--bioinformatics--indiana-u--bloomington-in-47405
-- gil...@in...--http://marmot.bio.indiana.edu/