From: Arnaud K. <ax...@sa...> - 2004-10-08 16:32:00
Hi,

While loading our whole set of data into the Feature-Land/Central Dogma part of GUS, we have run into some issues with the items below. I would like to know what other groups using GUS think about them.

* Ontologies:
  => Parsing the OBO format. As the OBO format has become the standard for ontologies, is there any plan to update the GO loading plugin so that it can parse this format?
  => GO and the Sequence Ontology are in different tables. Could we generalize the schema for handling ontologies by designing a single set of tables that would be used for loading any ontology? In future we will need extra ontologies. I'm thinking of the Evidence Code Ontology, for example, but this should apply to any ontology that is part of the OBO project (http://obo.sourceforge.net).

* Gene naming nomenclature:
  There was a thread a while ago about redesigning the gene naming nomenclature in GUS, but I don't think anything has been implemented yet. This is important for us and is not yet fully covered by GUS. These are our requirements:
  * /systematic_id - final systematic name, used once the chromosome is finished or a stable sequence is submitted; will be the title for the gene page in the absence of a primary_name.
  * /temporary_systematic_id - temporary systematic name used during projects where the sequence is unfinished, i.e. a temporary name for the shotgun sequences.
  * /previous_systematic_id - for systematic names *no longer in use*. Can be seen on GeneDB pages and queried. When a systematic_id is assigned, the temporary_systematic_id can be assigned to this qualifier.
  * /synonym - used for other gene names still in use and to be displayed on the gene page.
  * /obsolete_name - redundant gene names that can be queried but *are not visible on the gene page*, e.g. errors.
  * /primary_name - for the published or agreed unique *user friendly gene name*, following the convention set out for kinetoplastids; will be the title for the gene page. NB. This is *equivalent to* the EMBL qualifier *standard_name*, so it should be used "to give full gene name, but use /gene to give gene symbol".
  * /reserved_name - pre-publication names that will, presumably, become the primary_name.

* Curation and comments:
  At the moment we can't store our curation and note qualifiers in GUS. There is a DoTS.Comments table, but how can we attach comments to feature objects?

* Sres.BibliographicReference:
  How can bibliographic references be associated with sequence and feature objects?

* Core.Algorithm table:
  Is this table only for registering plugins? Can we use it as a controlled vocabulary table for storing software entries, or is it not meant for this? We'd like to store software entries, for example a GeneFeature predicted by geneid v1.2 or a SignalPeptide feature generated by SignalP v3.0.

* Similarity data:
  Can it be used for loading any similarity data, e.g. Fasta, BLAST, exonerate, Blat etc.? I think it is possible, but how can we store which software has been used (this relates to the previous item)?

* Curation of homology data:
  Is it possible to specify that an orthologous or paralogous group has been manually curated (by adding a review_status_id attribute?).

cheers
Arnaud
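
For illustration, the gene naming requirements listed above could be modelled roughly as follows. This is only a minimal Python sketch, not GUS code or schema: the class `GeneNames`, its field and method names, and the example identifiers are all hypothetical. Two behaviours are assumptions rather than stated requirements: falling back to the temporary systematic id as page title when nothing else is set, and keeping reserved names off the gene page while leaving them queryable.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class GeneNames:
    """Hypothetical container for the naming qualifiers in the message."""
    systematic_id: Optional[str] = None
    temporary_systematic_id: Optional[str] = None
    previous_systematic_ids: List[str] = field(default_factory=list)
    synonyms: List[str] = field(default_factory=list)
    obsolete_names: List[str] = field(default_factory=list)
    primary_name: Optional[str] = None
    reserved_name: Optional[str] = None

    def page_title(self) -> Optional[str]:
        # primary_name is the gene page title; systematic_id is used in its
        # absence. The final fallback to the temporary id is an assumption.
        return (self.primary_name
                or self.systematic_id
                or self.temporary_systematic_id)

    def assign_systematic_id(self, new_id: str) -> None:
        # Once the final systematic id is assigned, the temporary id is
        # kept as a previous systematic id.
        if self.temporary_systematic_id is not None:
            self.previous_systematic_ids.append(self.temporary_systematic_id)
            self.temporary_systematic_id = None
        self.systematic_id = new_id

    def displayed_names(self) -> List[str]:
        # Names shown on the gene page: current names, synonyms and previous
        # systematic ids. Obsolete names are never shown; hiding
        # reserved_name is an assumption.
        current = [self.primary_name, self.systematic_id,
                   self.temporary_systematic_id]
        names = [n for n in current if n]
        names.extend(self.synonyms)
        names.extend(self.previous_systematic_ids)
        return names

    def queryable_names(self) -> List[str]:
        # Everything displayed, plus obsolete names (queryable but hidden).
        # Making reserved_name queryable is an assumption.
        names = self.displayed_names() + list(self.obsolete_names)
        if self.reserved_name:
            names.append(self.reserved_name)
        return names


# Made-up identifiers, purely for illustration.
gene = GeneNames(temporary_systematic_id="tmp0001",
                 synonyms=["abc1"], obsolete_names=["abx1"])
gene.assign_systematic_id("Xx01.0010")
print(gene.page_title())
print(gene.displayed_names())
print(gene.queryable_names())
```

Keeping the display and query rules in separate methods mirrors the distinction the qualifiers draw between names that are visible on the gene page and names that can only be searched for.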