From: Brian B. <br...@pc...> - 2007-02-07 22:02:51
I would like to raise for discussion an issue that we have been faced with here at Penn in the last couple of weeks. It turns out that in GUS3.5 all named attributes of imp tables are now included in the superclass view. Following is the aafeature view:

column                        nulls?  type
AA_Feature_ID                 no      NUMBER(10,0)
AA_SEQUENCE_ID                no      DoTS::AASequence (NUMBER(10,0))
Feature_NAME_ID                       DoTS::FeatureName (NUMBER(10,0))
PARENT_ID                             DoTS::AAFeature (NUMBER(10,0))
NA_Feature_ID                         DoTS::NAFeature (NUMBER(10,0))
SUBCLASS_VIEW                         STRING(30)
SEQUENCE_ONTOLOGY_ID                  SRes::SequenceOntology (NUMBER(10,0))
DESCRIPTION                           STRING(4000)
PFAM_ENTRY_ID                         DoTS::PfamEntry (NUMBER(10,0))
MOTIF_AA_SEQUENCE_ID                  DoTS::AASequence (NUMBER(10,0))
REPEAT_TYPE_ID                        DoTS::RepeatType (NUMBER(10,0))
EXTERNAL_DATABASE_RELEASE_ID          SRes::ExternalDatabaseRelease (NUMBER(10,0))
SOURCE_ID                             STRING(50)
PREDICTION_ALGORITHM_ID               Core::Algorithm (NUMBER(5,0))
IS_PREDICTED                  no      NUMBER(1,0)
REVIEW_STATUS_ID                      SRes::ReviewStatus (NUMBER(12,0))
MODIFICATION_DATE             no      DATE

My understanding of the intent of superclass views is that they have attributes that are common to all subclasses in a typical class hierarchy. From a biological perspective, this would be attributes that are common to all (or at least nearly all) features. Because all named attributes are now in this view, there are a number of attributes that are not relevant for all aafeatures (or even the majority of aafeatures).

This is because GUS also has a requirement that all fields used in foreign key references must be named attributes. This is for clarity and makes obvious sense. This means, however, that now these foreign key attributes are included in the superclass view (and inherited by all the subclasses). The above table has three such attributes that I think would be much better served only in the specific views that need them such as DoTS::RepeatRegionAAFeature for repeat_type_id.

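As a quick check on the aafeature listing above, the foreign-key columns can be grouped by the table they reference, to see which tables are referenced more than once from the view. The following is only a minimal Python sketch and not part of GUS: the function name and the hard-coded list are hypothetical, with the column/table pairs copied by hand from the listing.

```python
from collections import defaultdict

# Foreign-key columns of the aafeature superclass view, copied by hand
# from the listing above as (column name, referenced table) pairs.
AAFEATURE_FKS = [
    ("AA_SEQUENCE_ID", "DoTS::AASequence"),
    ("Feature_NAME_ID", "DoTS::FeatureName"),
    ("PARENT_ID", "DoTS::AAFeature"),
    ("NA_Feature_ID", "DoTS::NAFeature"),
    ("SEQUENCE_ONTOLOGY_ID", "SRes::SequenceOntology"),
    ("PFAM_ENTRY_ID", "DoTS::PfamEntry"),
    ("MOTIF_AA_SEQUENCE_ID", "DoTS::AASequence"),
    ("REPEAT_TYPE_ID", "DoTS::RepeatType"),
    ("EXTERNAL_DATABASE_RELEASE_ID", "SRes::ExternalDatabaseRelease"),
    ("PREDICTION_ALGORITHM_ID", "Core::Algorithm"),
    ("REVIEW_STATUS_ID", "SRes::ReviewStatus"),
]


def tables_with_multiple_references(fks):
    """Return {table: [columns]} for tables referenced by more than one column.

    Hypothetical helper for illustration only; it does not read the GUS schema.
    """
    by_table = defaultdict(list)
    for column, table in fks:
        by_table[table].append(column)
    return {table: cols for table, cols in by_table.items() if len(cols) > 1}


if __name__ == "__main__":
    for table, cols in tables_with_multiple_references(AAFEATURE_FKS).items():
        print(table, "is referenced by", ", ".join(cols))
```

For this listing the only table flagged is DoTS::AASequence, which is referenced by both AA_SEQUENCE_ID and MOTIF_AA_SEQUENCE_ID.
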
I don't see any views where it is obvious to me one would want to use pfam_entry_id or motif_aa_sequence_id.

PFAM_ENTRY_ID                         DoTS::PfamEntry (NUMBER(10,0))
MOTIF_AA_SEQUENCE_ID                  DoTS::AASequence (NUMBER(10,0))
REPEAT_TYPE_ID                        DoTS::RepeatType (NUMBER(10,0))

The motif_aa_sequence_id is particularly problematic as this causes there to be two foreign key references into the aasequence table. The PERL object layer only supports having two foreign key references to a table via manual entries into the special cases file. This works fine for a limited number of tables (views) but is overwhelming when dealing with a large number such as for all the aafeature views.

I would propose that we drop these extra named attributes from the superclass views in the next release of GUS. This will clean up all the views and make them easier to understand. It will also alleviate problems with the PERL object layer related to multiple foreign key references. For those views that contain multiple references such as RepeatRegionAAFeature, we should enter the necessary lines in the special_cases file so that the objects work without coding around the limitation (which is what I suspect persons are doing now).

Comments?

-Brian

Brian P. Brunk, Ph.D.
ApiDB Senior Manager
1424 Blockley Hall
Penn Center For Bioinformatics
University of Pennsylvania
Philadelphia PA 19104-6021
Tel: 215-573-3118
Fax: 215-573-3111