From: Arnaud K. <ax...@sa...> - 2003-05-08 09:30:10
Hi Steve,
What about genes that have synonyms but don't yet have an approved primary
name?
Below is the complete list of gene-name qualifiers we are using. It
includes not only the primary name and its synonyms, which, as far as I
understand, are inferred from the function of the gene, but also the
systematic names assigned by the sequencing centres.
Arnaud
Additional qualifiers to be used in place of /gene for GeneDB
purposes
* /systematic_id - final systematic name, used once the chromosome is
finished or a stable sequence is submitted; will be the title of the
gene page in the absence of a standard name. Could be /locus_tag, the
EMBL equivalent (to be discussed??)
* /temporary_systematic_id - temporary systematic name used during
projects where the sequence is unfinished, i.e. a temporary name
for the shotgun sequences.
* /previous_systematic_id - for systematic names no longer in use.
* /synonym - used for other gene names still in use and to be
displayed on the gene page
* /obsolete_name - redundant gene names that can be queried but are
not visible on the gene page, e.g. errors
* /primary_name - for the published or agreed unique, user-friendly gene
name, following the convention set out for kinetoplastids; will be
the title of the gene page. NB. this is an EMBL-compliant qualifier,
so it should be used "to give full gene name, but use /gene to
give gene symbol".
* /reserved_name - pre-publication names that will, presumably,
become the standard_name
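
The display rules in the list above can be sketched in a few lines of Python. This is only an illustration, not GeneDB code: the class name, method names and example identifiers are hypothetical. Two points are assumptions that the list does not state: falling back to the temporary systematic id when nothing else is available, and treating previous systematic ids and reserved names as queryable.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class GeneNames:
    """Hypothetical container for the name qualifiers of one gene."""
    systematic_id: Optional[str] = None
    temporary_systematic_id: Optional[str] = None
    previous_systematic_ids: List[str] = field(default_factory=list)
    primary_name: Optional[str] = None
    reserved_name: Optional[str] = None
    synonyms: List[str] = field(default_factory=list)
    obsolete_names: List[str] = field(default_factory=list)

    def page_title(self) -> Optional[str]:
        # /primary_name is the title; /systematic_id is used in its absence.
        # Falling back to the temporary id is an assumption.
        return (self.primary_name
                or self.systematic_id
                or self.temporary_systematic_id)

    def visible_names(self) -> List[str]:
        # The title plus the synonyms are shown on the gene page.
        title = self.page_title()
        names = [title] if title else []
        return names + [s for s in self.synonyms if s != title]

    def searchable_names(self) -> Set[str]:
        # Obsolete names can be queried but are not displayed.
        # Including previous ids and the reserved name is an assumption.
        names = set(self.visible_names())
        names.update(self.obsolete_names)
        names.update(self.previous_systematic_ids)
        for single in (self.systematic_id,
                       self.temporary_systematic_id,
                       self.reserved_name):
            if single:
                names.add(single)
        return names


# A gene with a synonym but no approved primary name yet (made-up ids).
gene = GeneNames(systematic_id="SYS0001",
                 synonyms=["abc1"],
                 obsolete_names=["abx1"])
print(gene.page_title())
print(gene.visible_names())
```

Under these assumptions, a gene that has synonyms but no primary name is simply titled by its systematic id, and its synonyms remain visible on the page.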
Steve Fischer wrote:
> folks-
>
> right now in GUS, we have a bunch of tables and attributes that relate
> to gene symbols, names and aliases:
>
> Dots::Gene.name
> Dots::Gene.gene_symbol
> Dots::GeneAlias
> Sres::DbRef.gene_symbol (this is pretty clearly a hack. DbRef is
> intended to store references to external database entries. it is
> hackish to encode in the schema that we assume that such entries are
> gene records. they could easily be proteins or journals, whatever)
>
> This schema is being used by the DoTS project to hold both automated
> assignments of gene_symbol (Sres::DbRef) and manual assignments. The
> problem for the DoTS project is that these disparate ways of making
> assignments are not managed as a coherent whole. The manual and
> automated assignments are not queried together.
> I am thinking that we should consider a different approach, one
> modeled on how we store GO assignments. It seems that Gene symbols
> and GO terms are very similar. they are both amenable to controlled
> vocabs, and are both assigned by automated and manual operations.
> This pattern may apply to other types of annotation as well.
>
>
> 1. introduce a GeneName table:
> GeneName.gene_name_id
> GeneName.name --- the full name
> GeneName.symbol -- the symbol
>
> 2. introduce a GeneSynonym table:
> GeneSynonym.gene_name_id -- the GeneName it is a synonym for
> GeneSynonym.name -- the full name of the synonym
> GeneSynonym.symbol -- the symbol
>
> these tables are treated as controlled vocabularies, downloaded from
> sites such as HUGO and MGI.
>
>
> 3. introduce a GeneNameAssociation table -- a mapping between Gene and
> GeneName (better name for this??)
> GeneNameAssociation.gene_id
> GeneNameAssociation.gene_name_id
> GeneNameAssociation.review_status_id
> GeneNameAssociation.is_not
> probably adopt here an instance and evidence mechanism similar to go
> association.
>
> note that this implies an m-m (many-to-many) relationship between gene and gene name.
> while this might not be true in the ideal sense, it may well be true
> for tentative data, which is what we often have. so, this model
> accepts that unfortunate fact, and does the best to preserve as much
> info as we can.
>
>
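
To make the proposal above concrete, here is a minimal sketch of the three tables using Python's built-in sqlite3 module. It is not the GUS schema: the column types, the surrogate key on GeneSynonym, the composite primary key, the stand-in Gene table, the review_status_id codes (0 = automated, 1 = manually reviewed) and all example rows are assumptions made for illustration.

```python
import sqlite3

# Table and column names follow the proposal; everything else is assumed.
SCHEMA = """
CREATE TABLE Gene (
    gene_id INTEGER PRIMARY KEY          -- stand-in for Dots::Gene
);
CREATE TABLE GeneName (
    gene_name_id INTEGER PRIMARY KEY,
    name   TEXT,                         -- the full name
    symbol TEXT NOT NULL                 -- the symbol
);
CREATE TABLE GeneSynonym (
    gene_synonym_id INTEGER PRIMARY KEY, -- assumed surrogate key
    gene_name_id INTEGER NOT NULL REFERENCES GeneName(gene_name_id),
    name   TEXT,
    symbol TEXT
);
CREATE TABLE GeneNameAssociation (
    gene_id          INTEGER NOT NULL REFERENCES Gene(gene_id),
    gene_name_id     INTEGER NOT NULL REFERENCES GeneName(gene_name_id),
    review_status_id INTEGER NOT NULL,   -- assumed: 0 automated, 1 manual
    is_not           INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (gene_id, gene_name_id)
);
"""


def build_demo():
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO Gene (gene_id) VALUES (?)", [(1,), (2,)])
    conn.executemany(
        "INSERT INTO GeneName VALUES (?, ?, ?)",
        [(10, "hypothetical kinase one", "HYP1"),
         (11, "hypothetical kinase two", "HYP2")],
    )
    conn.execute(
        "INSERT INTO GeneSynonym (gene_name_id, name, symbol) "
        "VALUES (10, 'old kinase name', 'OLDK')"
    )
    # Gene 1 carries two candidate names (tentative data), and the name
    # HYP1 is attached to two genes: a many-to-many relationship.
    conn.executemany(
        "INSERT INTO GeneNameAssociation VALUES (?, ?, ?, ?)",
        [(1, 10, 1, 0), (1, 11, 0, 0), (2, 10, 0, 0)],
    )
    return conn


def symbols_for_gene(conn, gene_id):
    """Return manual and automated assignments together for one gene."""
    rows = conn.execute(
        "SELECT n.symbol, a.review_status_id "
        "FROM GeneNameAssociation a "
        "JOIN GeneName n ON n.gene_name_id = a.gene_name_id "
        "WHERE a.gene_id = ? AND a.is_not = 0 "
        "ORDER BY n.symbol",
        (gene_id,),
    )
    return rows.fetchall()


conn = build_demo()
print(symbols_for_gene(conn, 1))
```

Because both kinds of assignment live in the one association table, a single query returns them together, and review_status_id lets a caller tell them apart or filter on them.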