From: Chris S. <sto...@pc...> - 2007-02-21 23:09:36
Hi,

We are loading a new set of OrthoMCL-generated ortholog groups into an instance of GUS and found that the current SequenceGroup parent table has several specifically-named attributes (i.e., no generic attributes) that we can use for our purposes without renaming. The attributes we want to capture are quality metrics for an ortholog group. For example, we'd like to capture the average e-value rather than the min and max currently provided in SequenceGroup and all of its views, including OrthologGroup.

I'd like to make the following proposal, get feedback, and then put it in the GUS schema bug tracker.

1) Create an OrthoMCL view. This would be specific to our GUS instances but could be used as a template for renaming attributes in OrthologGroup. On the right I've indicated with <- the parent attribute that might be used where the name differs.

   Column                    Nulls?  Type
   SEQUENCE_GROUP_ID         NO      NUMBER
   SUBCLASS_VIEW             NO      STRING(100)
   NAME                              STRING(500)
   DESCRIPTION                       STRING(2000)
   NUMBER_OF_MEMBERS         NO      NUMBER
   NUMBER_OF_TAXA                    NUMBER
   AVE_PERCENT_IDENTITY              FLOAT         <- MIN_MATCH_IDENTITY
   AVE_PERCENT_COVERAGE              FLOAT         <- MIN_PERCENT_MATCH
   AVE_E_VALUE                       FLOAT         <- MIN_PVALUE_MANT
   AVE_DOMAIN_SHARING                FLOAT         <- MAX_SCORE
   BLAST_MATCH_RATIO                 FLOAT         <- MIN_SCORE
   SEQ_GROUP_EXPERIMENT_ID           FOREIGN KEY
   plus the usual housekeeping columns

In this proposal, the OrthoMCLGroup attributes would have the same types as the corresponding parent attributes but different semantics from what the parent-class names imply. Queries can easily avoid that mismatch by going through the subclass view.

2) Rename the SequenceGroup attributes to be more generic. Suggested changes are:

   MAX_MATCH_IDENTITY -> GROUP_IDENTITY_METRIC1
   MIN_MATCH_IDENTITY -> GROUP_IDENTITY_METRIC2
   MAX_PERCENT_MATCH  -> GROUP_MATCH_METRIC1
   MIN_PERCENT_MATCH  -> GROUP_MATCH_METRIC2
   MAX_PVALUE_MANT    -> GROUP_PVALUE_MANT
   MAX_PVALUE_EXP     -> GROUP_PVALUE_EXP
   MAX_PVALUE_MANT    -> GROUP_EVALUE_MANT
   MAX_PVALUE_EXP     -> GROUP_EVALUE_EXP
   MAX_SCORE          -> GROUP_QUALITY_METRIC1
   MIN_SCORE          -> GROUP_QUALITY_METRIC2

Thoughts?
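To make option 1 a bit more concrete, here is a rough sketch of how a subclass view can expose the parent columns under OrthoMCL-specific names. It is only an illustration, not actual GUS code: it assumes an in-memory SQLite database as a stand-in for the real GUS database, uses only a subset of the columns listed above, and the sample rows and values are made up.

```python
import sqlite3

# Assumption: SQLite stands in for the real GUS database; the table below
# carries only a subset of the SequenceGroup columns from the proposal.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us read result columns by name

conn.executescript("""
CREATE TABLE SequenceGroup (
    sequence_group_id   INTEGER PRIMARY KEY,
    subclass_view       TEXT NOT NULL,
    name                TEXT,
    number_of_members   INTEGER NOT NULL,
    min_match_identity  REAL,
    min_percent_match   REAL,
    min_pvalue_mant     REAL,
    max_score           REAL,
    min_score           REAL
);

-- The view renames parent columns and filters on SUBCLASS_VIEW, so
-- queries against it never see rows belonging to other subclasses.
CREATE VIEW OrthoMCLGroup AS
SELECT sequence_group_id,
       subclass_view,
       name,
       number_of_members,
       min_match_identity AS ave_percent_identity,
       min_percent_match  AS ave_percent_coverage,
       min_pvalue_mant    AS ave_e_value,
       max_score          AS ave_domain_sharing,
       min_score          AS blast_match_ratio
FROM SequenceGroup
WHERE subclass_view = 'OrthoMCLGroup';
""")

# Made-up sample rows: one OrthoMCL group and one row from another subclass.
conn.execute(
    "INSERT INTO SequenceGroup VALUES "
    "(1, 'OrthoMCLGroup', 'group_1', 12, 87.5, 92.0, 3.2, 0.4, 0.75)"
)
conn.execute(
    "INSERT INTO SequenceGroup VALUES "
    "(2, 'OrthologGroup', 'other_group', 5, 40.0, 55.0, 1.0, 200.0, 50.0)"
)

rows = conn.execute("SELECT * FROM OrthoMCLGroup").fetchall()
for row in rows:
    print(dict(row))
```

A query through the view sees only the renamed columns (for example ave_percent_identity), which is how the differing semantics stay separate from the parent-table names.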
Chris

Chris Stoeckert, Ph.D.
Research Professor, Dept. of Genetics
1415 Blockley Hall, Center for Bioinformatics
423 Guardian Dr., University of Pennsylvania
Philadelphia, PA 19104
Ph: 215-573-4409
FAX: 215-573-3111
http://www.cbil.upenn.edu