From: Arnaud K. <ax...@sa...> - 2002-10-08 13:00:36
Hi,

I've attached the SQL statements for new views/tables in GUS3, as well as updates of existing views/tables. They cover a new sequence object and new DNA, RNA and protein features that we would like to use. Some of them have been designed to follow the Sequence Ontology classification (see below).

Here is a summary of the new and updated views and tables:

Updated views/tables:
  * NAFeatureImp - modified:
      * name from varchar2(30) to varchar2(50)
  * RestrictionFragmentFeature - added:
      * type_of_cut (sticky or blunt)
  * SignalPeptideFeature - added:
      * targetting
  * GeneSynonym - added:
      * is_obsolete

New views/tables:

A new sequence object on top of NASequenceImp:
  * GenomicSequence

New views on top of NAFeatureImp:
  * ChromosomeElement
      NB: (centromere, telomere => SO)
  * InflectionPointFeature
  * RepeatFeature
      NB: repeat type => SO
  * RepeatRegionFeature
  * ReplicationFeature
  * TransposableElementFeature
      NB: SO would give the type
  * RNARegulatory
  * RNASecondaryStructure
  * SpliceSiteFeature

New views on top of AAFeatureImp:
  * AASecondaryStructure
  * AATertiaryStructure
  * DomainFeature
  * PeptideProperty
  * PostTranslationModification
  * TransmembraneDomainFeature

A new table:
  * PeptidePropertyType

Note that the design takes into account the use of the Sequence Ontology (SO) to refine the types of some of the features, e.g. to differentiate the different types of transposable elements, of repeats or of chromosome elements (centromere, telomere ...).

e.g. Transposable element annotations:

The different types of transposable elements would be given by specific SO terms. Bear in mind that prokaryotic transposable elements are not covered by SO, but we are working on adding prokaryote-specific SO terms.
Here is the current SO tree for transposable elements:

Transposable Element
  ---> Non Retrotransposon
      ---> TIR Element
          ----> Terminal Inverted Repeat
      ---> Foldback Element
  ---> Retrotransposon
      ---> LTR Retrotransposon
          ----> Long Terminal Repeat
      ---> non LTR Retrotransposon
          ----> LINE Element
          ----> SINE Element

LTRs, as well as genes, that are part of a transposable element would be features attached to a TransposableElementFeature. These genes would have the following SO term: transposable element gene, SO0000111. LTRs will be treated as RepeatFeature entries, annotated with the appropriate Sequence Ontology terms.

Let me know if you have any comments.

cheers
Arnaud

PS: Sequence Ontology URL => http://www.geneontology.org/gobo/sequence.ontology/sequence.ontology
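As an illustration of how SO terms could refine a generic feature type, the tree in the message above can be held as a nested mapping and queried for a term together with everything beneath it. This is only a sketch: the nesting is inferred from the arrow lengths in the message, the names `SO_TREE`, `find_subtree` and `subtypes` are hypothetical, and the sketch does not distinguish is-a from part-of relationships (for example, a Long Terminal Repeat is presumably a part of an LTR Retrotransposon rather than a kind of one).

```python
# Transposable-element terms from the message, as a nested dict.
# Assumption: the parent/child nesting is inferred from the arrows in the email.
SO_TREE = {
    "Transposable Element": {
        "Non Retrotransposon": {
            "TIR Element": {"Terminal Inverted Repeat": {}},
            "Foldback Element": {},
        },
        "Retrotransposon": {
            "LTR Retrotransposon": {"Long Terminal Repeat": {}},
            "non LTR Retrotransposon": {
                "LINE Element": {},
                "SINE Element": {},
            },
        },
    }
}


def find_subtree(term, tree):
    """Return the dict of children stored under `term`, or None if absent."""
    for name, children in tree.items():
        if name == term:
            return children
        found = find_subtree(term, children)
        if found is not None:
            return found
    return None


def subtypes(term, tree=SO_TREE):
    """Return `term` followed by every term nested beneath it, depth first."""
    subtree = find_subtree(term, tree)
    if subtree is None:
        raise KeyError(term)
    result = [term]
    for child in subtree:
        result.extend(subtypes(child, subtree))
    return result


if __name__ == "__main__":
    for name in subtypes("Retrotransposon"):
        print(name)
```

A query such as "all features typed as Retrotransposon or anything below it" could then match against the list returned by `subtypes("Retrotransposon")` instead of needing one view per element type.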
From: Chris S. <sto...@SN...> - 2002-10-09 21:02:30
Hi Arnaud,

Many thanks for compiling this. Just wanted to let you know that both Jonathan Crabtree and I plan to go through them to see if there are any proposals that require further discussion. Everyone else is encouraged to do so as well.

Cheers,
Chris
From: Chris S. <sto...@SN...> - 2002-10-16 22:45:49
Hi Arnaud,

I finally went through your list. These will certainly enrich GUS! Some questions/issues though.

First, a general request for documentation of the tables and attributes to explain what they are to be used for. We have a plug-in that takes a file in the format:

    TableName\t\tdescription
    TableName\tAttributeName\tDescription

In particular, I am curious as to what InflectionPointFeature and ReplicationFeature are.

For the NAFeature views you propose, are you using "source_id" to point to SRes:SequenceOntology? If so, why not call the attribute "so_id"? Similarly, for GenomicSequence as a view of NASequence, is this what "source_id" is for?

The AAFeature views have "name" attributes, and I wonder whether we should have a table in SRes for controlled vocabulary terms for protein features that we can point to (as with the Sequence Ontology). This would avoid the uncontrolled use of "name." I notice that PeptideProperty has been given a controlled vocabulary table, PeptidePropertyType, in this regard. Rather than have a table for each, we could centralize them. Any choices for the resource to use for these names? SWISS-PROT?

Cheers,
Chris

On Tuesday, October 8, 2002, at 09:00 AM, Arnaud Kerhornou wrote:

> From: Arnaud Kerhornou <ax...@sa...>
> Date: Tue Oct 8, 2002 9:00:32 AM US/Eastern
> To: gusdev-gusdev <gus...@li...>, gen...@li...
> Subject: [Gusdev-gusdev] DNA, RNA and Protein GUS Features + PeptidePropertyType Table
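The documentation file format described in the message above (a table name, an attribute name that is empty for table-level entries, and a description, separated by tabs) could be read with something like the following. This is a minimal sketch, not the actual GUS plug-in: the function name `parse_schema_docs` is hypothetical, the rule that an empty second field marks a table-level description is an assumption read off the two example lines, and the sample descriptions are illustrative only.

```python
def parse_schema_docs(lines):
    """Split tab-delimited documentation lines into two dicts.

    Returns (tables, attributes), where `tables` maps a table name to its
    description and `attributes` maps (table, attribute) to its description.
    Assumption: an empty second field means the line documents the table itself.
    """
    tables = {}
    attributes = {}
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("\t", 2)
        if len(fields) != 3:
            raise ValueError(f"line {number}: expected 3 tab-separated fields")
        table, attribute, description = (field.strip() for field in fields)
        if attribute:
            attributes[(table, attribute)] = description
        else:
            tables[table] = description
    return tables, attributes


if __name__ == "__main__":
    # Illustrative content only; the descriptions are not from the real doc file.
    sample = [
        "ReplicationFeature\t\tOrigins of replication\n",
        "RestrictionFragmentFeature\ttype_of_cut\tsticky or blunt\n",
    ]
    tables, attributes = parse_schema_docs(sample)
    print(tables)
    print(attributes)
```

Raising on a malformed line, rather than skipping it, makes a syntax problem in the file visible before anything is loaded.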
From: Arnaud K. <ax...@sa...> - 2002-10-18 13:33:13
Hi Chris,

Chris Stoeckert wrote:

> Hi Arnaud,
> I finally went through your list. These will certainly enrich GUS!
> Some questions/issues though.
> First a general request for documentation of the tables and attributes
> to explain what they are to be used for. We have a plug-in that takes
> a file in the format:
>
> TableName\t\tdescription
> TableName\tAttributeName\tDescription

Sorry for the lack of documentation; I'm going to prepare a doc file.

> In particular, I am curious as to what InflectionPointFeature and
> ReplicationFeature are.

In Leishmania, and more generally in any organism with polycistronic transcription, the inflection point represents the start of transcription. There are some studies trying to find out whether or not it corresponds to a conserved sequence. If so, it might be interesting for curators to annotate them.

ReplicationFeature represents origins of replication. The name ReplicationFeature is generic, but each feature will be given a more specific SO term.

> For the NAFeature views you propose, are you using "source_id" to
> point to SRes:SequenceOntology? If so, why not call the attribute "so_id"?
> Similarly, for GenomicSequence as a view of NASequence, is this what
> "source_id" is for?

The source_id is not related to the Sequence Ontology. The main point of my proposal is to replace the controlled vocabularies that specify the type of a feature with SO. To do so, I think we need a many-to-many relationship between feature views and SO. Could this be done by using the DoTS::GOTermAssociation table <http://www.cbil.upenn.edu/cgi-bin/GUS30/schemaBrowser.pl?db=GUS30&table=DoTS::GOTermAssociation&path=DoTS::GOTermAssociation>, or by cloning it?

As I realise this is an important point for the GUS design, please let me know if you agree or if you want to propose something else.

I added a source_id to the sequence and NAFeature views because I can see that all feature objects have this attribute. What is this attribute for in GUS?

> The AAFeature views have "name" attributes and I wonder whether we
> should have a table in SRes for controlled vocabulary terms for
> protein features that we can point to (as with sequence ontology).
> This would avoid the uncontrolled use of "name." I notice that
> PeptideProperty has been given a controlled vocabulary table
> PeptidePropertyType in this regard. Rather than have a table for each,
> we could centralize them. Any choices for the resource to use for
> these names? SWISS-PROT?

SO doesn't cover protein features yet, but should eventually. In the meantime, it makes sense to have a controlled vocabulary, although I'm not aware of an existing one. Shall I replace the PeptidePropertyType table with a more generic one, AAFeatureName?
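The many-to-many relationship between features and SO terms raised in this message could be sketched with a plain association table, as below. This is a hedged illustration using SQLite from Python, not the GUS schema: the table names `na_feature`, `so_term` and `feature_so_term`, their columns, and the example feature names are all hypothetical, and nothing here reflects the actual columns of DoTS::GOTermAssociation. The only accession used is SO0000111, which is the one given in the thread; other accessions are left NULL rather than guessed.

```python
import sqlite3

# Hypothetical three-table layout: features, SO terms, and a link table.
SCHEMA = """
CREATE TABLE na_feature (
    na_feature_id INTEGER PRIMARY KEY,
    name          TEXT NOT NULL
);
CREATE TABLE so_term (
    so_term_id INTEGER PRIMARY KEY,
    so_id      TEXT,              -- NULL where the thread gives no accession
    term_name  TEXT NOT NULL
);
CREATE TABLE feature_so_term (
    na_feature_id INTEGER NOT NULL REFERENCES na_feature(na_feature_id),
    so_term_id    INTEGER NOT NULL REFERENCES so_term(so_term_id),
    PRIMARY KEY (na_feature_id, so_term_id)
);
"""


def build_demo():
    """Create an in-memory database with a few illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO so_term (so_term_id, so_id, term_name) VALUES (?, ?, ?)",
        [
            (1, "SO0000111", "transposable element gene"),
            (2, None, "Long Terminal Repeat"),
            (3, None, "LTR Retrotransposon"),
            (4, None, "Retrotransposon"),
        ],
    )
    conn.executemany(
        "INSERT INTO na_feature (na_feature_id, name) VALUES (?, ?)",
        [
            (10, "example_element"),
            (11, "example_gene"),
            (12, "example_ltr_5prime"),
            (13, "example_ltr_3prime"),
        ],
    )
    # One feature may carry several terms, and one term may type several features.
    conn.executemany(
        "INSERT INTO feature_so_term (na_feature_id, so_term_id) VALUES (?, ?)",
        [(10, 3), (10, 4), (11, 1), (12, 2), (13, 2)],
    )
    return conn


def terms_for(conn, feature_name):
    """Return the SO term names linked to a feature, sorted by name."""
    rows = conn.execute(
        """
        SELECT t.term_name
        FROM so_term t
        JOIN feature_so_term a ON a.so_term_id = t.so_term_id
        JOIN na_feature f ON f.na_feature_id = a.na_feature_id
        WHERE f.name = ?
        ORDER BY t.term_name
        """,
        (feature_name,),
    ).fetchall()
    return [row[0] for row in rows]


def features_for(conn, term_name):
    """Return the names of features linked to an SO term, sorted by name."""
    rows = conn.execute(
        """
        SELECT f.name
        FROM na_feature f
        JOIN feature_so_term a ON a.na_feature_id = f.na_feature_id
        JOIN so_term t ON t.so_term_id = a.so_term_id
        WHERE t.term_name = ?
        ORDER BY f.name
        """,
        (term_name,),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_demo()
    print(terms_for(conn, "example_element"))
    print(features_for(conn, "Long Terminal Repeat"))
```

The composite primary key on the link table prevents the same term from being attached to the same feature twice, while leaving both directions of the relationship open-ended.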
From: Arnaud K. <ax...@sa...> - 2002-10-21 13:50:28
Attachments:
gus.draft_proposal.doc
Chris,

Please find attached the documentation file. Let me know if the syntax is not correct.

This doc file gives information about the proposal I sent. It also includes some comments following the feedback you sent. The proposal now needs to incorporate your feedback, in particular regarding the controlled vocabulary tables.

cheers
Arnaud