Re: [Gmod-schema] Re: genetic module

SourceForge Headquarters 1320 Columbia Street Suite 310 San Diego, CA 92101 +1 (858) 422-6466

Hello,

On Feb 23, 2005, at 6:06 PM, Chris Mungall wrote:

> On Wed, 23 Feb 2005, Pinglei Zhou wrote:
>
>>
>>> Hi Pinglei,
>>>
>>> This schema looks good (it solves the unique observable_id problem,=20=

>>> and
>>> much more).  But, I still have two questions (maybe it is one
>>> question):
>>>
>>> Where do you put genetic interactions (suppresion, enhancement,=20
>>> etc.)?
>>
>> it should involve phenstatement/phenotype_comparison tables.
>>
>>> What is the purpose of the phenotype_comparison table?
>>
>> phenotype_comparison will replace the old
>> interaction/interactionobject/interaction/subject table
>
> I don't think this is an adequate explanation! The
> interaction/interactionobject/interaction/subject tables were removed=20=

> from
> the genetic module at the end of 2003.
>
> I was under the impression that the version of the genetic module that=20=

> was
> in cvs was the latest one, and relatively stable.

So did we, which is why we started working with it.  The fact that the=20=

xml dtd was for a different schema was the tip off that something had=20
changed, but we didn't know when (i.e. was the dtd older or newer than=20=

the current chado genetic schema).

> This is the version
> based around 'gcontext's, and reasonably well commented.

Yes! The comments are extremely useful.

> It seems that
> this has been abandoned and there is now a new version. It would=20
> therefore
> seem like a good idea to commit this new one to cvs and to add some
> comments to it

We would really appreciate that.

Let me make a few comments and raise a few questions based on our=20
comparison (since yesterday afternoon) of the old(gcontext) version and=20=

the current flybase version that Pinglei kindly sent us.  We would like=20=

to understand the design before deciding to modify it! And we would=20
prefer not to modify at all (other than by adding a stock sub-module=20
which we are happy to share) if we can handle our current (small=20
compared to drosophila) stock collection and our rnai data.

gcontext table has become genotype table. OK

phenstatement table has been replaced by phenotype and phenstatement=20
tables, this is much better (for us) because there is better=20
normalisation.

Environment was previously in gcontextprop, at least according to the=20
comments in the sql. There are now two tables, environment and=20
env_cvterm, and there is no longer a gcontextprop table.  We have a=20
problem with this for the following reason.: EITHER there is one=20
environment property (for example temperature_degrees_c) in which case=20=

there is no need for env_cvterm table, it would be sufficient to use=20
the environment table, add type_id for the cvterm and use description=20
for the value, OR an environment involves more than one property (for=20
example temperature_degree_c and humidity_percent) but in that case,=20
there should be a value (text) in the env_cvterm table!  Because it is=20=

not logical to put two values in the description field in the=20
environment table.   In this case, it would seem more logical for=20
env_cvterm to be called environmentprop (with key, value pairs) :

table environmentprop
environmentprop_id   [U,PK]
environment_id           [U]
type_id                          [U]
value	            text

Two tables now contain type_id, phenstatement and phentoype_comparison.

We are guessing that the type_id in phenstatement could be used for=20
'dominant, semi-dominant, recessive etc.'.  Is this the intention of=20
the design? If so (and even though this is **not** a problem for our=20
Paramecium data) how would you include something like=20
penetrance_pct=3D80? Again, where would the value go? Or are we missing=20=

the point on the type_id?

We are guessing that the type_id in phenotype_comparison is for genetic=20=

interactions (suppression, complementation, enhancement etc.). The=20
problem we have here, is that we do not understand why there is=20
environment1 and environment2, since the genetic interaction (between=20
genotype1,phenotype1 and genotype2,phenotype2) is being evaluated in a=20=

given environment.

Linda

>
>> Pinglei
>>
>>
>>>
>>> On Feb 23, 2005, at 4:45 PM, Pinglei Zhou wrote:
>>>
>>>> Hi Linda,
>>>>
>>>> Not sure if I mentioned before, genetic module is still under
>>>> development, we (FlyBase) are testing on a new
>>>> design which is much different from the one in cvs. Here test I=20
>>>> means
>>>> we try to feed Drosophila genetic to see
>>>> if it will break the schema. welcome to give any suggestions.
>>>>
>>>>
>>>> Pinglei
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> genetic module currently tested by FlyBase
>>>>
>>>> create table genotype (
>>>>        genotype_id serial not null,
>>>>        primary key (genotype_id),
>>>>        uniquename text not null,
>>>>        description varchar(255),
>>>>        unique(uniquename)
>>>> );
>>>> create index genotype_idx1 on genotype(uniquename);
>>>> GRANT ALL on genotype_genotype_id_seq to PUBLIC;
>>>> GRANT ALL on genotype to PUBLIC;
>>>>
>>>>
>>>>
>>>> create table feature_genotype (
>>>> 	feature_genotype_id serial not null,
>>>> 	primary key (feature_genotype_id),
>>>> 	feature_id int not null,
>>>> 	foreign key (feature_id) references feature (feature_id) on =
delete
>>>> cascade,
>>>> 	genotype_id int not null,
>>>> 	foreign key (genotype_id) references genotype (genotype_id) on=20=

>>>> delete
>>>> cascade,
>>>> 	chromosome_id int,
>>>> 	foreign key (chromosome_id) references feature (feature_id) on=20=

>>>> delete
>>>> set null,
>>>> 	rank int not null,
>>>> 	cgroup 	int not null,
>>>> 	cvterm_id int not null,
>>>> 	foreign key (cvterm_id) references cvterm (cvterm_id) on delete
>>>> cascade,
>>>> 	unique(feature_id, genotype_id, cvterm_id)
>>>> );
>>>> create index feature_genotype_idx1 on feature_genotype =
(feature_id);
>>>> GRANT ALL on feature_genotype to PUBLIC;
>>>>
>>>> create index feature_genotype_idx2 on feature_genotype=20
>>>> (genotype_id);
>>>> GRANT ALL on feature_genotype to PUBLIC;
>>>>
>>>> create table environment (
>>>> 	environment_id serial not NULL,
>>>> 	primary key  (environment_id),
>>>>         uniquename text not null,
>>>> 	description 	text,
>>>> 	unique(uniquename)
>>>> );
>>>> create index environment_idx1 on environment(uniquename);
>>>>
>>>>
>>>> create table env_cvterm (
>>>> 	env_cvterm_id serial not null,
>>>> 	primary key  (env_cvterm_id),
>>>> 	environment_id 	int not null,
>>>> 	foreign key (environment_id) references environment=20
>>>> (environment_id)
>>>> on delete cascade,
>>>> 	cvterm_id int not null,
>>>> 	foreign key (cvterm_id) references cvterm (cvterm_id) on delete
>>>> cascade,
>>>>
>>>> 	unique(environment_id, cvterm_id)
>>>> );
>>>> create index env_cvterm_idx1 on env_cvterm (environment_id);
>>>> create index env_cvterm_idx2 on env_cvterm (cvterm_id);
>>>>
>>>>
>>>> create table phenotype (
>>>> 	phenotype_id serial not null,
>>>> 	primary key (phenotype_id),
>>>>         uniquename text not null,
>>>> 	observable_id int,
>>>> 	foreign key (observable_id) references cvterm (cvterm_id) on =
delete
>>>> cascade,
>>>> 	attr_id int,
>>>> 	foreign key (attr_id) references cvterm (cvterm_id) on delete =
set
>>>> null,
>>>> 	value text,
>>>> 	cvalue_id int,
>>>> 	foreign key (cvalue_id) references cvterm (cvterm_id) on delete =
set
>>>> null,
>>>> 	assay_id int,
>>>> 	foreign key (assay_id) references cvterm (cvterm_id) on delete =
set
>>>> null,
>>>> 	unique( uniquename )
>>>> );
>>>> create index phenotype_idx1 on phenotype (cvalue_id);
>>>> create index phenotype_idx2 on phenotype (observable_id);
>>>> create index phenotype_idx3 on phenotype (attr_id);
>>>>
>>>> GRANT ALL on phenotype_phenotype_id_seq to PUBLIC;
>>>> GRANT ALL on phenotype to PUBLIC;
>>>>
>>>> create index phenotype_idx1 on phenotype (type_id);
>>>> GRANT ALL on phenotype to PUBLIC;
>>>>
>>>> create index phenotype_idx2 on phenotype (pub_id);
>>>> GRANT ALL on phenotype to PUBLIC;
>>>>
>>>> create index phenotype_idx3 on phenotype (background_genotype_id);
>>>> GRANT ALL on phenotype to PUBLIC;
>>>>
>>>> create table phenotype_cvterm (
>>>> 	phenotype_cvterm_id serial not null,
>>>> 	primary key (phenotype_cvterm_id),
>>>> 	phenotype_id int not null,
>>>> 	foreign key (phenotype_id) references phenotype (phenotype_id) =
on
>>>> delete cascade,
>>>> 	cvterm_id int not null,
>>>> 	foreign key (cvterm_id) references cvterm (cvterm_id) on delete
>>>> cascade,
>>>> 	unique(phenotype_id, cvterm_id)
>>>> );
>>>>
>>>> GRANT ALL on phenotype_cvt_phenotype_cvt_seq to PUBLIC;
>>>> GRANT ALL on phenotype_cvterm to PUBLIC;
>>>>
>>>> create index phenotype_cvterm_idx1 on phenotype_cvterm=20
>>>> (phenotype_id);
>>>> GRANT ALL on phenotype_cvterm to PUBLIC;
>>>>
>>>> create index phenotype_cvterm_idx2 on phenotype_cvterm (cvterm_id);
>>>> GRANT ALL on phenotype_cvterm to PUBLIC;
>>>>
>>>>
>>>>
>>>> create table phenstatement (
>>>> 	phenstatement_id serial not null,
>>>> 	primary key (phenstatement_id),
>>>> 	genotype_id int not null,
>>>> 	foreign key (genotype_id) references genotype (genotype_id) on=20=

>>>> delete
>>>> cascade,
>>>> 	environment_id int,
>>>> 	foreign key (environment_id) references environment=20
>>>> (environment_id)
>>>> on delete cascade,
>>>> 	phenotype_id int not null,
>>>> 	foreign key (phenotype_id) references phenotype (phenotype_id) =
on
>>>> delete cascade,
>>>> 	type_id int,
>>>> 	foreign key (type_id) references cvterm (cvterm_id) on delete=20
>>>> cascade,
>>>> 	pub_id int not null,
>>>> 	foreign key (pub_id) references pub (pub_id) on delete cascade,
>>>> 	unique(genotype_id, environment_id, phenotype_id, type_id, =
pub_id)
>>>> );
>>>> create index phenstatement_idx1 on phenstatement (genotype_env_id);
>>>> create index phenstatement_idx2 on phenstatement (phenotype_id);
>>>>
>>>>
>>>>
>>>> create table feature_phenotype (
>>>>        feature_phenotype_id serial not null,
>>>>        primary key (feature_phenotype_id),
>>>>        feature_id int not null,
>>>>        foreign key (feature_id) references feature (feature_id) on
>>>> delete cascade,
>>>>        phenotype_id int not null,
>>>>        foreign key (phenotype_id) references phenotype=20
>>>> (phenotype_id)
>>>> on delete cascade,
>>>>        unique(feature_id,phenotype_id)
>>>> );
>>>> GRANT ALL on feature_pheno_feature_pheno_seq to PUBLIC;
>>>> GRANT ALL on feature_phenotype to PUBLIC;
>>>>
>>>> create index feature_phenotype_idx1 on feature_phenotype=20
>>>> (feature_id);
>>>> GRANT ALL on feature_phenotype to PUBLIC;
>>>>
>>>> create index feature_phenotype_idx2 on feature_phenotype
>>>> (phenotype_id);
>>>> GRANT ALL on feature_phenotype to PUBLIC;
>>>>
>>>>
>>>> create table phendesc (
>>>> 	phendesc_id serial not null,
>>>> 	primary key (phendesc_id),
>>>> 	genotype_id int not null;
>>>>         foreign key (genotype_id) references genotype (genotype_id)=20=

>>>> on
>>>> delete cascade,
>>>> 	environment_id int;
>>>>         foreign key (environment_id) references environment (
>>>> environment_id) on delete cascade,
>>>> 	description text not null,
>>>> 	pub_id int not null,
>>>> 	foreign key (pub_id) references pub (pub_id) on delete cascade,
>>>>
>>>> 	unique(genotype_id, environment_id, pub_id)
>>>> );
>>>> create index phendesc_idx1 on phendesc (genotype_id);
>>>> create index phendesc_idx2 on phendesc (environment_id);
>>>> create index phendesc_idx3 on phendesc (pub_id);
>>>>
>>>>
>>>>
>>>> create table phenotype_comparison (
>>>> 	phenotype_comparison_id serial not null,
>>>> 	primary key (phenotype_comparison_id),
>>>> 	genotype1_id int not null,
>>>>         foreign key (genotype1_id) references genotype =
(genotype_id)
>>>> on delete cascade,
>>>> 	environment1_id int,
>>>>         foreign key (environment1_id) references environment
>>>> (environment_id) on delete cascade,
>>>> 	genotype2_id int not null,
>>>>         foreign key (genotype2_id) references genotype =
(genotype_id)
>>>> on delete cascade,
>>>> 	environment2_id int,
>>>>         foreign key (environment2_id) references environment
>>>> (environment_id) on delete cascade,
>>>> 	phenotype1_id int not null,
>>>>         foreign key (phenotype1_id) references phenotype
>>>> (phenotype_id) on delete cascade,
>>>> 	phenotype2_id int,
>>>>         foreign key (phenotype2_id) references phenotype
>>>> (phenotype_id) on delete cascade,
>>>> 	type_id int not null,
>>>>         foreign key (type_id) references cvterm (cvterm_id) on=20
>>>> delete
>>>> cascade,
>>>> 	pub_id 	int not null,
>>>>         foreign key (pub_id) references pub (pub_id) on delete=20
>>>> cascade,
>>>> 	unique(genotype1_id, environment1_id, genotype2_id,=20
>>>> environment2_id,
>>>> phenotype1_id, phenotype2_id,
>>>> type_id,
>>>> pub_id)
>>>> )
>>>>
>>> Linda Sperling
>>> Centre de G=E9n=E9tique Mol=E9culaire
>>> CNRS
>>> Avenue de la Terrasse
>>> 91198 Gif-sur-Yvette CEDEX
>>> FRANCE
>>>
>>> spe...@cg...
>>> +33 (0)1 69 82 32 09 	(telephone)
>>> +33 (0)1 69 82 31 50	(fax)
>>> http://paramecium.cgm.cnrs-gif.fr
>>>
>>> SAUVONS LA RECHERCHE
>>> http://recherche-en-danger.apinc.org
>>>
>>
>>
>>
>> -------------------------------------------------------
>> SF email is sponsored by - The IT Product Guide
>> Read honest & candid reviews on hundreds of IT Products from real=20
>> users.
>> Discover which products truly live up to the hype. Start reading now.
>> http://ads.osdn.com/?ad_ide95&alloc_id=14396&op=3Dclick
>> _______________________________________________
>> Gmod-schema mailing list
>> Gmo...@li...
>> https://lists.sourceforge.net/lists/listinfo/gmod-schema
>>
>>
Linda Sperling
Centre de G=E9n=E9tique Mol=E9culaire
CNRS
Avenue de la Terrasse
91198 Gif-sur-Yvette CEDEX
FRANCE

spe...@cg...
+33 (0)1 69 82 32 09 	(telephone)
+33 (0)1 69 82 31 50	(fax)
http://paramecium.cgm.cnrs-gif.fr

SAUVONS LA RECHERCHE
http://recherche-en-danger.apinc.org