From: Chris S. <sto...@pc...> - 2005-08-04 22:18:04
It doesn't work as a feature, since features are tied to one sequence.
Depending on how much you want to load, you can just go with Aaron's
solution: load the base information into DBRef and associate sequences
with it through AASequenceDBRef. If you need to store more than what
fits in DBRef, you might propose creating a new table along the lines
of DoTS.PfamEntry or DoTS.NRDBEntry.

Cheers,
Chris

On Aug 4, 2005, at 3:13 PM, Aaron J. Mackey wrote:

> A Prosite entry is a (regular expression or profile-based) sequence
> motif. So it's none of the above.
>
> I believe we are going to handle Prosite (and other
> InterProScan-related datasets) simply as DbRefs, and not try to store
> their actual definitions (thus linking out to expasy, etc. as necessary).
>
> -Aaron
>
> On Aug 4, 2005, at 2:56 PM, Chris Stoeckert wrote:
>
>> Hi Sanjeev,
>> I guess the first thing to decide is if this entry represents a
>> sequence, a feature, or an annotation. Do you (or anyone else)
>> have strong opinions on this? Can you send an example entry?
>> Thanks,
>> Chris
>>
>> On Aug 4, 2005, at 12:35 PM, Kumar, Sanjeev (Contr) wrote:
>>
>>> Hi,
>>> Now let us figure out which GUS table to use to store
>>> PrositeDB master data.
>>> Can anyone help me with that, please?
>>> It contains the following types of information:
>>>   ID   Identification (Begins each entry; 1 per entry)
>>>   AC   Accession number (1 per entry)
>>>   DT   Date (1 per entry)
>>>   DE   Short description (1 per entry)
>>>   PA   Pattern (>1 per entry)
>>>   MA   Matrix/profile (>1 per entry)
>>>   RU   Rule (>1 per entry)
>>>   NR   Numerical results (>1 per entry)
>>>   CC   Comments (>=1 per entry)
>>>   DR   Cross-references to Swiss-Prot (>1 per entry)
>>>   3D   Cross-references to PDB (>1 per entry)
>>>   DO   Pointer to the documentation file (1 per entry)
>>>
>>> Any help on this will be appreciated.
>>>
>>> Thanks
>>> Sanjeev
>>>
>>> -----Original Message-----
>>> From: gus...@li...
>>> [mailto:gus...@li...]On Behalf Of
>>> gus...@li...
>>> Sent: Tuesday, August 02, 2005 11:09 PM
>>> To: gus...@li...
>>> Subject: Gusdev-gusdev digest, Vol 1 #637 - 1 msg
>>>
>>> Today's Topics:
>>>
>>> 1. RE: Loading Prosite DB (Kumar, Sanjeev (Contr))
>>>
>>> --__--__--
>>>
>>> Message: 1
>>> Subject: RE: [GUSDEV] Loading Prosite DB
>>> Date: Tue, 2 Aug 2005 17:04:05 -0400
>>> From: "Kumar, Sanjeev \(Contr\)" <San...@ng...>
>>> To: "Aaron J. Mackey" <am...@pc...>
>>> Cc: "Jian Lu" <jl...@vb...>,
>>> <gus...@li...>
>>>
>>> So, the plugin which you are writing will not be taking care of the
>>> detailed Prosite data, right?
>>> If so, then I will write a separate plugin to load the detailed
>>> Prosite master data.
>>>
>>> Thanks
>>> Sanjeev
>>>
>>> -----Original Message-----
>>> From: Aaron J. Mackey [mailto:am...@pc...]
>>> Sent: Tuesday, August 02, 2005 5:01 PM
>>> To: Kumar, Sanjeev (Contr)
>>> Cc: Jian Lu; gus...@li...
>>> Subject: Re: [GUSDEV] Loading Prosite DB
>>>
>>> Yes, InterProScan only provides domain analysis results, not the
>>> actual domain/pattern/motif databases themselves.
>>>
>>> -Aaron
>>>
>>> On Aug 2, 2005, at 4:54 PM, Kumar, Sanjeev (Contr) wrote:
>>>
>>>> Hi Aaron/Jian,
>>>> The InterProScan data has only the Prosite ID and description in it.
>>>> But to load other information for a Prosite ID, we need to load
>>>> the Prosite data, which comes in a different format than InterPro.
>>>> That is what I found. Do you copy?
>>>>
>>>> Thanks
>>>> Sanjeev
>>>>
>>>> -----Original Message-----
>>>> From: Aaron J. Mackey [mailto:am...@pc...]
>>>> Sent: Tuesday, August 02, 2005 4:43 PM
>>>> To: Jian Lu
>>>> Cc: Kumar, Sanjeev (Contr)
>>>> Subject: Re: [GUSDEV] Loading Prosite DB
>>>>
>>>> From http://www.ebi.ac.uk/interpro/README1.html
>>>>
>>>> PROSITE patterns.
>>>> Some biologically significant amino acid patterns can be summarised
>>>> in the form of regular expressions.
>>>>
>>>> ScanRegExp (by Wol...@eb...), Ppsearch (Fuchs, R. 1994).
>>>>
>>>> PROSITE profile.
>>>> There are a number of protein families as well as functional or
>>>> structural domains that cannot be detected using patterns due to
>>>> their extreme sequence divergence; the use of techniques based on
>>>> weight matrices (also known as profiles) allows the detection of
>>>> such domains.
>>>>
>>>> pfscan from the Pftools package (by Phi...@is...).
>>>>
>>>> PRINTS.
>>>> The PRINTS database houses a collection of protein family
>>>> fingerprints. These are groups of motifs that together are
>>>> diagnostically more potent than single motifs by making use of the
>>>> biological context inherent in a multiple-motif method.
>>>>
>>>> FingerPRINTScan (Scordis, P. et al. 1999).
>>>>
>>>> PFAM.
>>>> Pfam is a database of protein domain families. Pfam contains
>>>> curated multiple sequence alignments for each family and
>>>> corresponding profile hidden Markov models (HMMs).
>>>>
>>>> hmmpfam from the HMMER2.1 package (by Sean Eddy,
>>>> ed...@ge..., http://hmmer.wustl.edu),
>>>> DeCypher™ (TimeLogic) implementation of HMM search.
>>>>
>>>> PRODOM.
>>>> ProDom families are built by an automated process based on a
>>>> recursive use of PSI-BLAST homology searches.
>>>>
>>>> BlastProDom.pl (by Florence Servant, fse...@to...)
>>>> - a filter on top of the Blast package (Altschul, S. F. et al. 1997).
>>>>
>>>> SMART.
>>>> SMART domains are extensively annotated with respect to phyletic
>>>> distributions, functional class, tertiary structures and
>>>> functionally important residues. SMART alignments are optimised
>>>> manually and following construction of corresponding hidden Markov
>>>> models (HMMs).
>>>>
>>>> hmmpfam from the HMMER2.1 package.
>>>>
>>>> TIGRFAMs.
>>>> TIGRFAMs are a collection of protein families featuring curated
>>>> multiple sequence alignments, Hidden Markov Models (HMMs) and
>>>> associated information designed to support the automated functional
>>>> identification of proteins by sequence homology. Classification by
>>>> equivalog family (see below), where achievable, complements
>>>> classification by orthologs, superfamily, domain or motif. It
>>>> provides the information best suited for automatic assignment of
>>>> specific functions to proteins from large scale genome sequencing
>>>> projects.
>>>>
>>>> hmmpfam from the HMMER2.1 package.
>>>>
>>>> Optionally, predictions for coiled-coil, signal peptide cleavage
>>>> sites (SignalP v2) and TM helices (TMHMM v2) are supported.
>>>>
>>>> On Aug 2, 2005, at 4:32 PM, Jian Lu wrote:
>>>>
>>>>> I don't think so. Here is the data sheet from InterProScan.
>>>>>
>>>>> Kumar, Sanjeev (Contr) wrote:
>>>>>
>>>>>> Hi Aaron/Jian,
>>>>>> What types of data are we talking about in the InterProScan plugin?
>>>>>> Does it include Prosite data?
>>>>>> Thanks
>>>>>> Sanjeev
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Jian Lu [mailto:jl...@vb...]
>>>>>> Sent: Tuesday, August 02, 2005 2:02 PM
>>>>>> To: Aaron J. Mackey
>>>>>> Cc: Kumar, Sanjeev (Contr); gus...@li...
>>>>>> Subject: Re: [GUSDEV] Loading Prosite DB
>>>>>>
>>>>>> Aaron,
>>>>>>
>>>>>> We are also working on InterProScan and other analysis tools. But
>>>>>> we haven't got a plugin yet. If your plugin is ready, I would
>>>>>> like to play with it.
>>>>>> Here is the view that we created for InterProScan.
>>>>>> Please comment on it. Thanks.
>>>>>>
>>>>>> --
>>>>>> -- VIEW DOTS.INTERPROSCAN
>>>>>> -- used to store outputs from InterProScan
>>>>>> -- June 29, 2005
>>>>>>
>>>>>
>>>>> <InterProScan_OUTPUT.pdf>
>>>>>
>>>>
>>>> --
>>>> Aaron J. Mackey, Ph.D.
>>>> Project Manager, ApiDB Bioinformatics Resource Center
>>>> Penn Genomics Institute, University of Pennsylvania
>>>> email: am...@pc...
>>>> office: 215-898-1205 (Goddard) / 215-746-7018 (PCBI)
>>>> fax: 215-746-6697
>>>> postal: Penn Genomics Institute
>>>> Goddard Labs 212
>>>> 415 S. University Avenue
>>>> Philadelphia, PA 19104-6017
>>>>
>>>> _______________________________________________
>>>> Gusdev-gusdev mailing list
>>>> Gus...@li...
>>>> https://lists.sourceforge.net/lists/listinfo/gusdev-gusdev
>>>>
>>>
>>> --__--__--
>>>
>>> _______________________________________________
>>> Gusdev-gusdev mailing list
>>> Gus...@li...
>>> https://lists.sourceforge.net/lists/listinfo/gusdev-gusdev
>>>
>>> End of Gusdev-gusdev Digest
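
For reference, the line codes listed in the thread (ID, AC, DT, DE, PA, ...) could be read roughly as follows before deciding which fields go into DBRef and which would need a new table. This is only a minimal Python sketch, not part of GUS or of any existing plugin. It assumes the usual PROSITE flat-file layout (a two-character line code followed by the data, with `//` closing each entry). The names `parse_prosite_entries` and `base_info`, the dictionary keys, and the abbreviated sample entry are made up for illustration and do not correspond to GUS columns.

```python
from collections import defaultdict


def parse_prosite_entries(lines):
    """Yield one dict per entry, mapping a line code to its data strings.

    Assumption: each line starts with a two-character code (ID, AC, ...)
    and an entry ends at a line starting with '//'.
    """
    entry = defaultdict(list)
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("//"):
            # End of entry: emit what was collected and start afresh.
            if entry:
                yield dict(entry)
            entry = defaultdict(list)
            continue
        if len(line) < 2 or not line.strip():
            continue  # skip blank or malformed lines
        code, data = line[:2], line[2:].strip()
        entry[code].append(data)  # repeated codes (PA, CC, DR, ...) accumulate
    if entry:
        yield dict(entry)


def base_info(entry):
    """Pick the small subset that a DBRef-style record could hold.

    The keys here are hypothetical, chosen only for this example.
    """
    return {
        "accession": entry["AC"][0].rstrip(";"),
        "name": entry["ID"][0].split(";")[0],
        "description": entry["DE"][0],
    }


# Abbreviated sample entry, for illustration only.
SAMPLE = """\
ID   ASN_GLYCOSYLATION; PATTERN.
AC   PS00001;
DE   N-glycosylation site.
PA   N-{P}-[ST]-{P}.
DO   PDOC00001;
//
"""

entries = list(parse_prosite_entries(SAMPLE.splitlines()))
print(base_info(entries[0]))
```

Keeping every line code as a list means the multi-valued fields (PA, MA, RU, NR, CC, DR, 3D) are retained if a dedicated table is proposed later, while `base_info` covers the DBRef-only route.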