From: Kumar, S. (Contr) <San...@ng...> - 2005-08-02 20:55:09
Hi Aaron/Jian,

The InterProScan data has only the Prosite ID and description in it. But to load other information for a Prosite ID, we need to load the Prosite data, which comes in a different format than InterPro. That is what I found. Do you copy?

Thanks
Sanjeev

-----Original Message-----
From: Aaron J. Mackey [mailto:am...@pc...]
Sent: Tuesday, August 02, 2005 4:43 PM
To: Jian Lu
Cc: Kumar, Sanjeev (Contr)
Subject: Re: [GUSDEV] Loading Prosite DB

From http://www.ebi.ac.uk/interpro/README1.html

PROSITE patterns.
Some biologically significant amino acid patterns can be summarised in the form of regular expressions.
ScanRegExp (by Wol...@eb...), Ppsearch (Fuchs, R. 1994).

PROSITE profile.
There are a number of protein families as well as functional or structural domains that cannot be detected using patterns due to their extreme sequence divergence; the use of techniques based on weight matrices (also known as profiles) allows the detection of such domains.
pfscan from the Pftools package (by Phi...@is...).

PRINTS.
The PRINTS database houses a collection of protein family fingerprints. These are groups of motifs that together are diagnostically more potent than single motifs by making use of the biological context inherent in a multiple-motif method.
FingerPRINTScan (Scordis, P. et al. 1999).

PFAM.
Pfam is a database of protein domain families. Pfam contains curated multiple sequence alignments for each family and corresponding profile hidden Markov models (HMMs).
hmmpfam from the HMMER2.1 package (by Sean Eddy, ed...@ge..., http://hmmer.wustl.edu), DeCypher™ (TimeLogic) implementation of HMM search.

PRODOM.
ProDom families are built by an automated process based on a recursive use of PSI-BLAST homology searches.
BlastProDom.pl (by Florence Servant, fse...@to...) - a filter on top of the Blast package (Altschul, S. F. et al. 1997).

SMART.
SMART domains are extensively annotated with respect to phyletic distributions, functional class, tertiary structures and functionally important residues. SMART alignments are optimised manually and following construction of corresponding hidden Markov models (HMMs).
hmmpfam from the HMMER2.1 package.

TIGRFAMs.
TIGRFAMs are a collection of protein families featuring curated multiple sequence alignments, Hidden Markov Models (HMMs) and associated information designed to support the automated functional identification of proteins by sequence homology. Classification by equivalog family (see below), where achievable, complements classification by orthologs, superfamily, domain or motif. It provides the information best suited for automatic assignment of specific functions to proteins from large scale genome sequencing projects.
hmmpfam from the HMMER2.1 package.

Optionally, predictions for coiled-coil, signal peptide cleavage sites (SignalP v2) and TM helices (TMHMM v2) are supported.

On Aug 2, 2005, at 4:32 PM, Jian Lu wrote:

> I don't think so. Here is the data sheet from InterProScan.
>
> Kumar, Sanjeev (Contr) wrote:
>
>> Hi Aaron/Jian,
>> What all types of data we are talking in InterProScan plugin?
>> Does it include Prosite data?
>> Thanks
>> Sanjeev
>>
>> -----Original Message-----
>> From: Jian Lu [mailto:jl...@vb...]
>> Sent: Tuesday, August 02, 2005 2:02 PM
>> To: Aaron J. Mackey
>> Cc: Kumar, Sanjeev (Contr); gus...@li...
>> Subject: Re: [GUSDEV] Loading Prosite DB
>>
>>
>> Aaron,
>>
>> We are also working on InterProScan and other analysis tools. But we haven't got a plugin yet. If your plugin is ready, I would like to play it. Here is the view that we created for InterProScan. Please comment it. Thanks.
>>
>> --
>> -- VIEW DOTS.INTERPROSCAN
>> -- used to store outputs from InterProScan
>> -- June 29, 2005
>>
>>
>
>
> <InterProScan_OUTPUT.pdf>
>

--
Aaron J. Mackey, Ph.D.
Project Manager, ApiDB Bioinformatics Resource Center
Penn Genomics Institute, University of Pennsylvania
email:  am...@pc...
office: 215-898-1205 (Goddard) / 215-746-7018 (PCBI)
fax:    215-746-6697
postal: Penn Genomics Institute
        Goddard Labs 212
        415 S. University Avenue
        Philadelphia, PA 19104-6017
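
[Archive note] The README excerpt quoted in this message says that PROSITE patterns can be summarised as regular expressions. As a rough illustration of what that means, here is a minimal Python sketch that rewrites a pattern in PROSITE syntax as a Python regex. It is not part of GUS, InterProScan, or ScanRegExp; the function name prosite_to_regex is hypothetical, and the sketch assumes only the common syntax elements (x, [..], {..}, (n) / (n,m), leading < and trailing >). Anchors placed inside square brackets are not handled.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern string into a Python regex string.

    Hypothetical helper for illustration only; covers the common syntax:
    x = any residue, [..] = allowed set, {..} = forbidden set,
    (n) or (n,m) = repetition, < and > = N- and C-terminal anchors.
    """
    # Drop surrounding whitespace and the trailing period that ends a pattern.
    pattern = pattern.strip().rstrip(".")
    parts = []
    for element in pattern.split("-"):
        # {AM} (anything but A or M) becomes the negated class [^AM].
        element = element.replace("{", "[^").replace("}", "]")
        # x(2,4) becomes x{2,4}; done after the step above to avoid clashes.
        element = element.replace("(", "{").replace(")", "}")
        # Lowercase x stands for any residue.
        element = element.replace("x", ".")
        # Sequence-terminus anchors.
        element = element.replace("<", "^").replace(">", "$")
        parts.append(element)
    return "".join(parts)


# Illustrative pattern written in PROSITE syntax.
regex = prosite_to_regex("C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H.")

# Made-up sequence built to contain one occurrence of the pattern.
sequence = "AAC" + "AA" + "C" + "AAA" + "L" + "A" * 8 + "H" + "AAA" + "H" + "AA"
match = re.search(regex, sequence)
```

This only covers the pattern string itself. The additional per-entry information Sanjeev mentions (beyond ID and description) would still have to come from the Prosite data files, which this sketch does not parse.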