From: Jonathan C. <cra...@sn...> - 2003-01-17 05:41:27
Arnaud -

> > Which DNA/RNA features do you mean (other than those mentioned above)?
>
> The file I sent you should include views on the top of NAFeatureImp
> table. Here the list :

Yes, you're absolutely right; there was a period when I wasn't paying
very close attention to the schema mailing list, and I'm afraid I
misplaced a couple of the files you sent, at least temporarily. I
believe I've now added all the views and tables that you originally
proposed, with some minor modifications to take into account
discussions we've had since then. See the attached text file for a
complete list of the changes I've made this time around.

> Yes we had! So regarding chromosome regions, shall we keep
> TelomereFeature and CentromereFeature ?

No, I think we should use ChromosomeElementFeature instead; I've
created this view based on the ChromosomeElement view you suggested,
but with a couple of additional columns to handle the data currently
in gusdev.TelomereFeature and gusdev.CentromereFeature.

> > At
> > the other extreme, we could continue what we're doing now, i.e. using
> > an ad-hoc classification of features based on the data we actually have
> > available, and just make sure that every feature is tagged with the
> > correct sequence ontology term. Any thoughts?
>
> It makes sense as SO may undergo revisions this year.

OK, as noted in the attachment, I've added sequence_ontology_id to
*all* views of NAFeatureImp and AAFeatureImp.

> >> A controlled vocabulary table with the four attributes you've
> >> mentioned is fine.

Done; it's called ProteinPropertyType, and the schema/contents are
described in the attached list of changes.

> >> As you're going to add a extra attribute sequence_ontology_id to the
> >> NA Features, could you do the same to any AA Features ?

OK, done.

> The way the SignalPeptideFeature is designed make difficult the
> annotation of localization signal features.
> We can leave
> SignalPeptideFeature as it is as it fits with SignalP software
> prediction and in the future create a new feature LocalizationSignalFeature.

OK, based on our discussion today the only change I've made to
SignalPeptideFeature is to add the sequence_ontology_id, which can be
used to reference the different localization ontology terms that you
mentioned. A column has been added to SequenceOntology to let us store
multiple ontologies (and versions thereof) in the same table.
Experimental evidence, references, and annotator's comments can be
linked to SignalPeptideFeature (or a future LocalizationSignalFeature
view) using DoTS.Evidence.

> >> I reckon they could be merged.

(This comment was in reference to incorporating TM domain features
into the DomainFeature view.) I've added a "number_of_domains" column
to DomainFeature to permit this. We will *not* have a separate view
specifically for TM domain features.

> > I also realized belatedly that I could have left the Interaction table
> > unchanged, rather than introducing specific references to RowSet. This
> > would have allowed us to represent either singleton effectors/targets or
> > set-valued effectors/targets, without having to always join through
> > RowSet
> > in the singleton case. On the other hand, if we do associate some
> > additional information with the RowSets, then the current representation
> > is correct.
>
> It depends if we want to represent many-to-many relationship between
> interaction and members of this interaction. Without the RowSet table,
> we can't assign a set of several effectors/targets, right ? Unless we
> consider that this set of effectors are being part of a complex and act
> as the whole.

It's true that without the RowSet table we can't assign a set of several
effectors or targets.
What I was trying to say was that I replaced the following columns in
DoTS.Interaction--

    effector_table_id
    effector_row_id

(or something to that effect) using instead a single column that
references a RowSet:

    effector_row_set_id

However, I could have left the Interaction table unchanged, and used
the effector_table_id and effector_row_id to reference entries in the
RowSet table (in the case where there are multiple effectors.) With
this approach one would have the choice of either using or not using
the RowSet table on a case-by-case basis. I don't think it's too
important which way we do this; on the one hand you save a join when
you only need to reference a single effector/target (using the
table_id/row_id approach) but on the other hand with the row_set_id
approach you can write uniform code and also have an enforceable
referential integrity constraint. So barring any strong objection,
I'll leave the table as it is now (i.e., with explicit references to
RowSet, meaning that you always have to have a RowSet even when the
effector or target is a single object.)

> A case we came across here for Tbrucei is nested repeat regions (at the
> DNA level). Each repeat region has coordinates and is annotated with a
> unique repeat unit type. This repeat region can be within a bigger
> repeat region annotated with a different repeat unit type.
> ... which is in other words your suggestion with parent_id as an extra
> attribute ...

I haven't added the parent_id yet, but I'll do so.

> Regarding transposon repeat types, if we have a TransposableElement
> feature and its type is given as an attribute, a repeat feature will
> just be useful to locate the LTRs within a given a transposable element.
> Can we keep this functionality ? Then the feature will be simple, just a
> repeat_type, and a parent_id atributes.

Are you saying that we still need the two tables/features, one for
RepeatFeature, the other for RepeatRegionFeature?
Could you give me a specific example of how you would envision using
these tables (and also these tables in conjunction with the
TransposableElement view, under the assumption that they're all
equipped with parent_ids)?

> Let's leave the design as it is for now. Curators are not going to
> curate interactions data in the short term. We shall come back later
> with more precise ideas/use cases about them.

Sounds good.

Let me know if there's anything I've missed. I'll try to generate
updated SQL scripts tomorrow, and also update the schema browser so
that everyone can review the changes one last time.

Cheers,
Jonathan