Might be worth to investigate.
Especially I think it can happen that a gene entry might not be attached
to any genomic sequence, but such a gene should be attached to an organism.
Also I've been working on speeding up the queries and I noticed that
replacing the index on NASEQUENCEIMP.taxon_id by a bitmap index makes
the queries MUCH faster.
I think it's worth replacing in the file, dots-indexes.sql :
=> create index @oracle_dots@.NASEQUENCEIMP_IND16 on
@oracle_dots@.NASEQUENCEIMP (TAXON_ID) TABLESPACE
@oracle_dotsIndexTablespace@;
By:
=> create bitmap index @oracle_dots@.NASEQUENCEIMP_IND16 on
@oracle_dots@.NASEQUENCEIMP (TAXON_ID) TABLESPACE
@oracle_dotsIndexTablespace@;
Arnaud
---------------------------------
Benchmark - Bear in mind I've used the gusdev schema
*** With Bitmap index:
select count (g.name)
from gene g, ExternalNaSequence ena, Taxon t, GeneFeature gf,
GeneSequence gs
where t.taxon_id = 150
and t.taxon_id = ena.taxon_id
and ena.na_sequence_id = gf.na_sequence_id
and gf.na_feature_id = gs.na_feature_id
and gs.gene_id = g.gene_id;
COUNT(G.NAME)
-------------
4957
Elapsed: 00:00:02.76
*** Without:
select count (g.name)
from gene g, ExternalNaSequence ena, Taxon t, GeneFeature gf,
GeneSequence gs
where t.scientific_name = 'Schizosaccharomyces pombe'
and t.taxon_id = ena.taxon_id
and ena.na_sequence_id = gf.na_sequence_id
and gf.na_feature_id = gs.na_feature_id
and gs.gene_id = g.gene_id;
COUNT(G.NAME)
-------------
4957
Elapsed: 00:00:12.06
Steve Fischer wrote:
> folks-
>
> i think that we might want to add taxon_id to Gene.
>
> As it stands now i believe that you need to go through RNA,
> RNAInstance, RNAFeature, NASequence to find this out.
>
> Steve
>
|