SO is missing the relationships needed to fully capture a BSML feature-group that describes a tRNA or rRNA.
Using the BSML output from tRNAscan-SE as an example, we have:
<Feature-group id="Bsml37" group-set="bid17393.gene.845554351.1">
  <Feature-group-member feature-type="gene" featref="bid17393.gene.845554351.1"></Feature-group-member>
  <Feature-group-member feature-type="CDS" featref="bid17393.CDS.845554354.1"></Feature-group-member>
  <Feature-group-member feature-type="tRNA" featref="bid17393.tRNA.845554352.1"></Feature-group-member>
  <Feature-group-member feature-type="exon" featref="bid17393.exon.845554353.1"></Feature-group-member>
</Feature-group>
However, no feature_relationship records are created to link in the CDS or the exon; the only relationship created is between the tRNA and the gene. This is because the requisite entries are absent from the cvterm_relationship table, as shown by querying a database created by the initdb component:
SELECT t.name, s.name, o.name
FROM cvterm_relationship cr, cvterm t, cvterm s, cvterm o
WHERE cr.type_id = t.cvterm_id
  AND cr.subject_id = s.cvterm_id
  AND cr.object_id = o.cvterm_id
  AND s.cvterm_id IN (SELECT cvterm_id FROM cvterm WHERE name IN ('CDS', 'transcript', 'polypeptide', 'gene', 'rRNA', 'tRNA', 'exon'))
  AND o.cvterm_id IN (SELECT cvterm_id FROM cvterm WHERE name IN ('CDS', 'transcript', 'polypeptide', 'gene', 'rRNA', 'tRNA', 'exon'))
ORDER BY t.name, s.name, o.name;
     name     |    name     |    name
--------------+-------------+------------
 derives_from | CDS         | transcript
 derives_from | polypeptide | CDS
 derives_from | rRNA        | gene
 derives_from | transcript  | gene
 derives_from | tRNA        | gene
 part_of      | exon        | transcript
 part_of      | polypeptide | transcript
The proposed solution is to add relationships for CDS and exon that link them to the other members of the feature-group.
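The gap can be illustrated with a minimal Python sketch. It is not the loader's code; it assumes that a feature_relationship can only be created when a cvterm_relationship exists between the types of two members of the same feature-group. The names `FEATURE_GROUP`, `RELATIONSHIPS`, `member_types`, and `unlinked_members` are hypothetical and exist only for this illustration.

```python
import xml.etree.ElementTree as ET

# The feature-group from the tRNAscan-SE BSML example.
FEATURE_GROUP = """
<Feature-group id="Bsml37" group-set="bid17393.gene.845554351.1">
  <Feature-group-member feature-type="gene" featref="bid17393.gene.845554351.1"></Feature-group-member>
  <Feature-group-member feature-type="CDS" featref="bid17393.CDS.845554354.1"></Feature-group-member>
  <Feature-group-member feature-type="tRNA" featref="bid17393.tRNA.845554352.1"></Feature-group-member>
  <Feature-group-member feature-type="exon" featref="bid17393.exon.845554353.1"></Feature-group-member>
</Feature-group>
"""

# (type, subject, object) triples copied from the query output.
RELATIONSHIPS = [
    ("derives_from", "CDS", "transcript"),
    ("derives_from", "polypeptide", "CDS"),
    ("derives_from", "rRNA", "gene"),
    ("derives_from", "transcript", "gene"),
    ("derives_from", "tRNA", "gene"),
    ("part_of", "exon", "transcript"),
    ("part_of", "polypeptide", "transcript"),
]


def member_types(group_xml):
    """Return the feature-type of each Feature-group-member, in document order."""
    root = ET.fromstring(group_xml.strip())
    return [m.get("feature-type") for m in root.iter("Feature-group-member")]


def unlinked_members(types, relationships):
    """Return member types that have no relationship to another group member.

    Assumption: a relationship only applies when both its subject type and
    its object type are present in the group.
    """
    present = set(types)
    linked = set()
    for _rel, subject, obj in relationships:
        if subject in present and obj in present:
            linked.update((subject, obj))
    return [t for t in types if t not in linked]


types = member_types(FEATURE_GROUP)
print(types)                                   # ['gene', 'CDS', 'tRNA', 'exon']
print(unlinked_members(types, RELATIONSHIPS))  # ['CDS', 'exon']
```

Under that assumption, CDS and exon are left out because their only relationships point at transcript, which is not a member of this group.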
Information on GMOD's recommendation can be found here: http://gmod.org/wiki/Chado_Best_Practices#Noncoding_Genes
An old TIGR bug case addressed the tRNAscan-SE encoding issue here: http://sangiuoli-lx.igs.umaryland.edu:8080/bugzilla/show_bug.cgi?id=3711
Added the following relationships to so.aux.obo in revision 4160:
tRNA derives_from exon
rRNA derives_from exon
CDS derives_from gene
After discussion with Joshua, changing the model again so that tRNA and rRNA take the place of transcript in the gene graph. With this change, the full set of cvterm relationships is:
SELECT t.name, s.name, o.name
FROM cvterm_relationship cr, cvterm t, cvterm s, cvterm o
WHERE cr.type_id = t.cvterm_id
  AND cr.subject_id = s.cvterm_id
  AND cr.object_id = o.cvterm_id
  AND s.cvterm_id IN (SELECT cvterm_id FROM cvterm WHERE name IN ('CDS', 'transcript', 'polypeptide', 'gene', 'rRNA', 'tRNA', 'exon'))
  AND o.cvterm_id IN (SELECT cvterm_id FROM cvterm WHERE name IN ('CDS', 'transcript', 'polypeptide', 'gene', 'rRNA', 'tRNA', 'exon'))
ORDER BY t.name, s.name, o.name;
     name     |    name     |    name
--------------+-------------+------------
 derives_from | CDS         | rRNA
 derives_from | CDS         | transcript
 derives_from | CDS         | tRNA
 derives_from | polypeptide | CDS
 derives_from | rRNA        | gene
 derives_from | transcript  | gene
 derives_from | tRNA        | gene
 part_of      | exon        | rRNA
 part_of      | exon        | transcript
 part_of      | exon        | tRNA
 part_of      | polypeptide | transcript
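
As a sanity check on this final set, the following Python sketch lists which relationships apply to a feature-group containing gene, CDS, tRNA, and exon members. It assumes a relationship applies only when both its subject type and object type are present in the group. The name `applicable` is hypothetical, and the member types of the rRNA group are an assumption modelled on the tRNAscan-SE example rather than taken from real rRNA output.

```python
# (type, subject, object) triples copied from the query output above.
NEW_RELATIONSHIPS = [
    ("derives_from", "CDS", "rRNA"),
    ("derives_from", "CDS", "transcript"),
    ("derives_from", "CDS", "tRNA"),
    ("derives_from", "polypeptide", "CDS"),
    ("derives_from", "rRNA", "gene"),
    ("derives_from", "transcript", "gene"),
    ("derives_from", "tRNA", "gene"),
    ("part_of", "exon", "rRNA"),
    ("part_of", "exon", "transcript"),
    ("part_of", "exon", "tRNA"),
    ("part_of", "polypeptide", "transcript"),
]


def applicable(types, relationships):
    """Return the triples whose subject and object types are both in the group."""
    present = set(types)
    return [r for r in relationships if r[1] in present and r[2] in present]


# Member types from the tRNAscan-SE feature-group.
trna_group = ["gene", "CDS", "tRNA", "exon"]
# Assumed member types for an equivalent rRNA feature-group.
rrna_group = ["gene", "CDS", "rRNA", "exon"]

for group in (trna_group, rrna_group):
    links = applicable(group, NEW_RELATIONSHIPS)
    linked_types = {t for _rel, s, o in links for t in (s, o)}
    for rel, subject, obj in links:
        print(subject, rel, obj)
    # Every member type should now take part in at least one relationship.
    print("all members linked:", linked_types == set(group))
```

For the tRNA group this yields CDS derives_from tRNA, tRNA derives_from gene, and exon part_of tRNA, so every member is linked through the tRNA.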