From: Chris S. <sto...@SN...> - 2002-10-02 12:00:28
Hi Arnaud,

Yes, please make the proposal in the form of CREATE TABLE statements.

BTW, some of these properties are included in mass spec data that we got
for PlasmoDB, so we may want to use or call the view PeptideProperty
rather than ProteinProperty, since the latter can be construed as
referring to the entire protein. Either way, the place to start is the SQL.

Also, thanks for the FlyBase info. Am just starting to take a serious look
at it - looks pretty interesting.

Cheers,
Chris

On Wednesday, October 2, 2002, at 06:12 AM, Arnaud Kerhornou wrote:

> Chris
>
> I realise I made some proposals about new protein features, but I've
> never formalised them as SQL statements.
> Shall I do that to prepare their incorporation into GUS?
>
> cheers
> Arnaud
>
> Arnaud Kerhornou wrote:
>
>> Hi everyone
>>
>> I would like to propose a new table, ProteinProperty, and new views on
>> top of the AAFeatureImp table for protein features such as domains.
>>
>> * Protein properties:
>>
>> There are 4 protein properties:
>>   * Isoelectric point (1),
>>   * Molecular mass (2),
>>   * Charge (3),
>>   * Average residue mass (4).
>>
>> The first 3 may have several values, as they can be characterized
>> experimentally.
>>
>> From a design point of view, we can have a single ProteinProperty
>> table or a view for each property (a ProteinPropertyImp table and
>> three views: IsoElectricPointProperty, MolecularMassProperty and
>> ChargeProperty).
>> The number of properties may not change in the future, so it may be
>> simpler to create a single ProteinProperty table.
>>
>> Specification => A property would behave like a feature, i.e.:
>>   * it is attached to a sequence, except that it doesn't have a
>>     location within it,
>>   * it can be supported by evidence such as an experiment, published
>>     or from a personal communication,
>>   * it can have external db refs.
>>
>> ProteinProperty table:
>>   * protein_property_id : number
>>   * property_name : varchar2(50)
>>   * property_value : number (5)
>>   * & stuff common to any GUS table: modification_date ...
>>
>> The 4th property, average residue mass, could be an extra attribute
>> in the proteinSequence or TranslatedAASequence view.
>>
>> ******************
>> * Protein Features:
>>
>> Features attached to a protein sequence.
>>
>> The new feature objects are:
>> (1) Signal Peptide Feature:
>> It's already a view in GUS, but we will store curated data, such as
>> targeting information.
>>
>> (2) Domains:
>> It can be:
>>   * a Leucine Zipper domain,
>>   * a coiled-coil domain,
>>   * a Pfam, Smart or Prosite domain.
>>
>> DomainFeature view:
>>   * aa_feature_id : number (10),
>>   * aa_sequence_id : number (10),
>>   * name : varchar2 (50),
>>   * description : varchar2 (100),
>>   * score : number (4),
>>   * e_value : number (10),
>>   + external database link entries and a location object.
>>
>> (3) Transmembrane domain feature:
>> Question: the PlasmoDB web site shows hydrophobicity graphics; where
>> are they stored in GUS?
>>
>> (4) Post-translational modification feature:
>>   * type : varchar2 (50) (e.g. glycosylation, phosphorylation ...)
>>   * modified_by : use of the Interaction table?
>>   * Coordinates of the phosphorylation site in an AALocation object.
>>
>> (5) Repeat Features, which should follow the same design as at the
>> DNA level:
>>   * RepeatRegionFeature as a set of RepeatUnitFeatures,
>>   * RepeatUnitFeature, with the consensus sequence, name and size,
>>   * RepeatType table.
>>
>> Another question: what about 2D structures (beta-sheet and
>> alpha-helix) in GUS?
>>
>> Let me know if you have any comments. I'll send another email for
>> extra features at the DNA/RNA level.
>>
>> Cheers
>> Arnaud
>>
>
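
As a starting point for the requested CREATE TABLE form, here is a minimal sketch of the single-table option, with one of the per-property views layered on top. It is only an illustration, not the agreed GUS schema. The column list follows the ProteinProperty description quoted above; the aa_sequence_id column, the 'isoelectric point' name string and the sample rows are assumptions added so the example can run. Python's built-in sqlite3 module is used purely as a stand-in to check that the statements parse and behave as expected; it accepts the Oracle-style type names but does not enforce them.

```python
import sqlite3

# Oracle-style DDL for the single-table option described in the proposal.
# ASSUMPTION: aa_sequence_id is not in the proposed column list; it is added
# here because the proposal says a property is attached to a sequence.
DDL = """
CREATE TABLE ProteinProperty (
    protein_property_id NUMBER(10)   NOT NULL PRIMARY KEY,
    aa_sequence_id      NUMBER(10)   NOT NULL,
    property_name       VARCHAR2(50) NOT NULL,
    property_value      NUMBER(5),
    modification_date   DATE
);

-- One of the per-property views from the alternative design.
-- ASSUMPTION: the literal 'isoelectric point' is a hypothetical name value.
CREATE VIEW IsoElectricPointProperty AS
SELECT protein_property_id, aa_sequence_id, property_value
FROM ProteinProperty
WHERE property_name = 'isoelectric point';
"""


def build_schema():
    """Create the sketch schema in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    return conn


conn = build_schema()

# Illustrative rows only (made-up values): one sequence with two
# isoelectric point measurements and one molecular mass.
conn.executemany(
    "INSERT INTO ProteinProperty VALUES (?, ?, ?, ?, ?)",
    [
        (1, 100, "isoelectric point", 6.5, "2002-10-02"),
        (2, 100, "isoelectric point", 6.8, "2002-10-02"),
        (3, 100, "molecular mass", 52310, "2002-10-02"),
    ],
)

# The view exposes only the isoelectric point rows.
rows = conn.execute(
    "SELECT property_value FROM IsoElectricPointProperty "
    "ORDER BY protein_property_id"
).fetchall()
print(rows)
```

One design point worth checking before this goes into Oracle: there, NUMBER(5) has a scale of 0, so fractional values such as an isoelectric point would be rounded to integers. A declaration with an explicit scale may be needed for property_value.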