From: Kumar, S. (Contr) <San...@ng...> - 2005-08-04 19:02:24
Hi Chris,

Here is the sample example. It doesn't have any sequence.

ID   ASX_HYDROXYL; PATTERN.
AC   PS00010;
DT   APR-1990 (CREATED); APR-1990 (DATA UPDATE); APR-2005 (INFO UPDATE).
DE   Aspartic acid and asparagine hydroxylation site.
PA   C-x-[DN]-x(4)-[FY]-x-C-x-C.
NR   /RELEASE=46.6,180652;
NR   /TOTAL=1358(269); /POSITIVE=1342(260); /UNKNOWN=0(0); /FALSE_POS=16(9);
NR   /FALSE_NEG=0; /PARTIAL=0;
CC   /TAXO-RANGE=??E??; /MAX-REPEAT=43;
CC   /SITE=3,hydroxylation;
CC   /VERSION=1;
DR   P31696, AGRN_CHICK , T; Q90404, AGRN_DISOM , T; P13497, BMP1_HUMAN , T;
DR   P98063, BMP1_MOUSE , T; P98070, BMP1_XENLA , T; P98069, BMPH_STRPU , T;
DR   Q9NPY3, C1QR1_HUMAN, T; O89103, C1QR1_MOUSE, T; Q9ET61, C1QR1_RAT , T;
DR   P00736, C1R_HUMAN , T; Q8CG16, C1R_MOUSE , T; Q5R1W3, C1R_PANTR , T;
DR   P09871, C1S_HUMAN , T; P15156, CASP_MESAU , T; Q8SQA4, CD97_BOVIN , T;
DR   P48960, CD97_HUMAN , T; Q9Z0M6, CD97_MOUSE , T; Q9NYQ6, CELR1_HUMAN, T;
DR   O35161, CELR1_MOUSE, T; Q9HCU4, CELR2_HUMAN, T; Q9R0M0, CELR2_MOUSE, T;
DR   Q9QYP2, CELR2_RAT , T; Q9NYQ7, CELR3_HUMAN, T; Q91ZI0, CELR3_MOUSE, T;
DR   O88278, CELR3_RAT , T; Q86T13, CN027_HUMAN, T; Q8VCP9, CN027_MOUSE, T;
DR   P10040, CRB_DROME , T; P82279, CRUM1_HUMAN, T; P81282, CSPG2_BOVIN, T;
DR   Q90953, CSPG2_CHICK, T; P13611, CSPG2_HUMAN, T; Q28858, CSPG2_MACNE, T;
DR   Q62059, CSPG2_MOUSE, T; Q9ERB4, CSPG2_RAT , T; O14594, CSPG3_HUMAN, T;
DR   P55066, CSPG3_MOUSE, T; Q5IS41, CSPG3_PANTR, T; P55067, CSPG3_RAT , T;
DR   P80370, DLK_HUMAN , T; Q09163, DLK_MOUSE , T; O00548, DLL1_HUMAN , T;
DR   Q61483, DLL1_MOUSE , T; P97677, DLL1_RAT , T; Q9NR61, DLL4_HUMAN , T;
DR   Q9JI71, DLL4_MOUSE , T; P10041, DL_DROME , T; O43854, EDIL3_HUMAN, T;
DR   O35474, EDIL3_MOUSE, T; O75095, EGFL3_HUMAN, T; O88281, EGFL3_RAT , T;
DR   Q7Z7M0, EGFL4_HUMAN, T; P60882, EGFL4_MOUSE, T; Q9QYP0, EGFL4_RAT , T;
DR   Q9UHF1, EGFL7_HUMAN, T; Q6AZ60, EGFL7_RAT , T; Q99944, EGFL8_HUMAN, T;
DR   Q6GUQ1, EGFL8_MOUSE, T; Q6MG84, EGFL8_RAT , T; Q6UY11, EGFL9_HUMAN, T;
DR   Q8K1E3, EGFL9_MOUSE, T; Q9BEA0, EGF_CANFA , T; Q95ND4, EGF_FELCA , T;
DR   P01133, EGF_HUMAN , T; P01132, EGF_MOUSE , T; Q00968, EGF_PIG , T;
DR   P07522, EGF_RAT , T; Q9HBW9, ELTD1_HUMAN, T; Q923X1, ELTD1_MOUSE, T;
DR   Q9ESC1, ELTD1_RAT , T; Q14246, EMR1_HUMAN , T; Q61549, EMR1_MOUSE , T;
DR   Q9UHX3, EMR2_HUMAN , T; Q9BY15, EMR3_HUMAN , T; Q86SQ3, EMR4_HUMAN , T;
DR   Q91ZE5, EMR4_MOUSE , T; P00743, FA10_BOVIN , T; P25155, FA10_CHICK , T;
DR   P83370, FA10_HOPST , T; P00742, FA10_HUMAN , T; O88947, FA10_MOUSE , T;
DR   O19045, FA10_RABIT , T; Q63207, FA10_RAT , T; P81428, FA10_TROCA , T;
DR   P22457, FA7_BOVIN , T; P08709, FA7_HUMAN , T; P70375, FA7_MOUSE , T;
DR   P98139, FA7_RABIT , T; Q8K3U6, FA7_RAT , T; P00741, FA9_BOVIN , T;
DR   P19540, FA9_CANFA , T; Q804X6, FA9_CHICK , T; Q6SA95, FA9_FELCA , T;
DR   P00740, FA9_HUMAN , T; P16294, FA9_MOUSE , T; Q95ND7, FA9_PANTR , T;
DR   P16293, FA9_PIG , T; Q9VW71, FAT2_DROME , T; Q14517, FATH_HUMAN , T;
DR   O42182, FBLN1_BRARE, T; O77469, FBLN1_CAEEL, T; Q8MJJ9, FBLN1_CERAE, T;
DR   O73775, FBLN1_CHICK, T; P23142, FBLN1_HUMAN, T; Q08879, FBLN1_MOUSE, T;
DR   P98095, FBLN2_HUMAN, T; P37889, FBLN2_MOUSE, T; Q12805, FBLN3_HUMAN, T;
DR   Q7YQD7, FBLN3_MACFA, T; Q8BPB5, FBLN3_MOUSE, T; O35568, FBLN3_RAT , T;
DR   O55058, FBLN4_CRIGR, T; O95967, FBLN4_HUMAN, T; Q9WVJ9, FBLN4_MOUSE, T;
DR   Q9UBX5, FBLN5_HUMAN, T; Q9WVH9, FBLN5_MOUSE, T; Q9WVH8, FBLN5_RAT , T;
DR   P98133, FBN1_BOVIN , T; P35555, FBN1_HUMAN , T; Q61554, FBN1_MOUSE , T;
DR   Q9TV36, FBN1_PIG , T; P35556, FBN2_HUMAN , T; Q61555, FBN2_MOUSE , T;
DR   Q75N90, FBN3_HUMAN , T; P10079, FBP1_STRPU , T; P49013, FBP3_STRPU , T;
DR   Q25464, FP2_MYTGA , T; Q14393, GAS6_HUMAN , T; Q61592, GAS6_MOUSE , T;
DR   Q63772, GAS6_RAT , T; P13508, GLP1_CAEEL , T; Q90Y57, JAG1A_BRARE, T;
DR   Q90Y54, JAG1B_BRARE, T; P78504, JAG1_HUMAN , T; Q9QXX0, JAG1_MOUSE , T;
DR   Q63722, JAG1_RAT , T; Q9Y219, JAG2_HUMAN , T; Q9QYE5, JAG2_MOUSE , T;
DR   P97607, JAG2_RAT , T; Q99087, LDLR1_XENLA, T; Q99088, LDLR2_XENLA, T;
DR   P35950, LDLR_CRIGR , T; P01130, LDLR_HUMAN , T; P35951, LDLR_MOUSE , T;
DR   Q28832, LDLR_PIG , T; P20063, LDLR_RABIT , T; P35952, LDLR_RAT , T;
DR   P14585, LIN12_CAEEL, T; Q9NZR2, LRP1B_HUMAN, T; Q9JI18, LRP1B_MOUSE, T;
DR   P98157, LRP1_CHICK , T; Q07954, LRP1_HUMAN , T; P98164, LRP2_HUMAN , T;
DR   P98158, LRP2_RAT , T; O75096, LRP4_HUMAN , T; Q8VI56, LRP4_MOUSE , T;
DR   Q9QYP1, LRP4_RAT , T; Q98931, LRP8_CHICK , T; Q14114, LRP8_HUMAN , T;
DR   Q924X6, LRP8_MOUSE , T; Q04833, LRP_CAEEL , T; Q14766, LTB1L_HUMAN, T;
DR   Q8CG19, LTB1L_MOUSE, T; P22064, LTB1S_HUMAN, T; Q8CG18, LTB1S_MOUSE, T;
DR   Q00918, LTBP1_RAT , T; Q28019, LTBP2_BOVIN, T; Q14767, LTBP2_HUMAN, T;
DR   O08999, LTBP2_MOUSE, T; O35806, LTBP2_RAT , T; Q9NS15, LTBP3_HUMAN, T;
DR   Q61810, LTBP3_MOUSE, T; Q8K4G1, LTBP4_MOUSE, T; P48740, MASP1_HUMAN, T;
DR   P98064, MASP1_MOUSE, T; O00187, MASP2_HUMAN, T; O00339, MATN2_HUMAN, T;
DR   O08746, MATN2_MOUSE, T; O95460, MATN4_HUMAN, T; O89029, MATN4_MOUSE, T;
DR   P34576, MUA3_CAEEL , T; Q20176, NAS39_CAEEL, T; Q92832, NELL1_HUMAN, T;
DR   Q62919, NELL1_RAT , T; Q99435, NELL2_HUMAN, T; Q61220, NELL2_MOUSE, T;
DR   Q62918, NELL2_RAT , T; Q90827, NEL_CHICK , T; Q14112, NID2_HUMAN , T;
DR   O88322, NID2_MOUSE , T; P14543, NIDO_HUMAN , T; P10493, NIDO_MOUSE , T;
DR   P08460, NIDO_RAT , T; P46530, NOTC1_BRARE, T; P46531, NOTC1_HUMAN, T;
DR   Q01705, NOTC1_MOUSE, T; Q07008, NOTC1_RAT , T; Q04721, NOTC2_HUMAN, T;
DR   O35516, NOTC2_MOUSE, T; Q9QW30, NOTC2_RAT , T; Q9UM47, NOTC3_HUMAN, T;
DR   Q61982, NOTC3_MOUSE, T; Q9R172, NOTC3_RAT , T; Q99466, NOTC4_HUMAN, T;
DR   P31695, NOTC4_MOUSE, T; P07207, NOTCH_DROME, T; P21783, NOTCH_XENLA, T;
DR   Q28146, NRX1A_BOVIN, T; Q9DDD0, NRX1A_CHICK, T; Q9ULB1, NRX1A_HUMAN, T;
DR   Q63372, NRX1A_RAT , T; Q9Y4C0, NRX3A_HUMAN, T; Q07310, NRX3A_RAT , T;
DR   Q8HYB7, PERT_CANFA , T; P07202, PERT_HUMAN , T; P35419, PERT_MOUSE , T;
DR   P14650, PERT_RAT , T; P13608, PGCA_BOVIN , T; Q28343, PGCA_CANFA , T;
DR   P07898, PGCA_CHICK , T; P00745, PROC_BOVIN , T; Q28278, PROC_CANFA , T;
DR   P04070, PROC_HUMAN , T; P33587, PROC_MOUSE , T; Q9GLP2, PROC_PIG , T;
DR   Q28661, PROC_RABIT , T; P31394, PROC_RAT , T; P07224, PROS_BOVIN , T;
DR   P07225, PROS_HUMAN , T; Q28520, PROS_MACMU , T; Q08761, PROS_MOUSE , T;
DR   P98118, PROS_RABIT , T; P53813, PROS_RAT , T; P00744, PROZ_BOVIN , T;
DR   P22891, PROZ_HUMAN , T; Q9CQW3, PROZ_MOUSE , T; P18168, SERR_DROME , T;
DR   O94813, SLIT2_HUMAN, T; P24014, SLIT_DROME , T; Q07929, SP63_STRPU , T;
DR   Q26627, SUREJ_STRPU, T; P25723, TLD_DROME , T; O57460, TLL1_BRARE , T;
DR   P06579, TRBM_BOVIN , T; Q5W7P8, TRBM_CANFA , T; P07204, TRBM_HUMAN , T;
DR   P15306, TRBM_MOUSE , T; Q71U07, TRBM_SAISC , T; P48733, UROM_BOVIN , T;
DR   Q862Z3, UROM_CANFA , T; P07911, UROM_HUMAN , T; Q91X17, UROM_MOUSE , T;
DR   P27590, UROM_RAT , T; P98165, VLDLR_CHICK, T; P98155, VLDLR_HUMAN, T;
DR   P98156, VLDLR_MOUSE, T; P35953, VLDLR_RABIT, T; P98166, VLDLR_RAT , T;
DR   P41950, YLK2_CAEEL , T; P98163, YL_DROME , T;
DR   O15943, CADN_DROME , F; Q13201, MMRN1_HUMAN, F; O75093, SLIT1_HUMAN, F;
DR   Q80TR4, SLIT1_MOUSE, F; O88279, SLIT1_RAT , F; Q9R1B9, SLIT2_MOUSE, F;
DR   O75094, SLIT3_HUMAN, F; Q9WVB4, SLIT3_MOUSE, F; O88280, SLIT3_RAT , F;
3D   1APO; 1APQ; 1AUT; 1BF9; 1CCF; 1DAN; 1DVA; 1DX5; 1EDM; 1EMN; 1EMO; 1F7E;
3D   1F7M; 1FAK; 1FF7; 1FFM; 1GL4; 1HJ7; 1HZ8; 1I0U; 1IXA; 1J9C; 1LMJ; 1N7D;
3D   1NL8; 1NZI; 1O5D; 1PFX; 1QFK; 1SZB; 1TOZ; 1UZJ; 1UZP; 1UZQ; 1W0Y; 1WHE;
3D   1WHF; 1XFE; 1XKA; 1XKB;
DO   PDOC00010;
//

Thanks
Sanjeev

-----Original Message-----
From: Chris Stoeckert [mailto:sto...@pc...]
Sent: Thursday, August 04, 2005 2:56 PM
To: Kumar, Sanjeev (Contr)
Cc: gus...@li...
Subject: Re: [GUSDEV] RE: Gusdev-gusdev digest, Vol 1 #637 - 1 msg

Hi Sanjeev,

I guess the first thing to decide is if this entry represents a
sequence, a feature, or an annotation.
Do you (or anyone else) have strong opinions on this? Can you send an
example entry?

Thanks,
Chris

On Aug 4, 2005, at 12:35 PM, Kumar, Sanjeev (Contr) wrote:

> Hi,
> Now let us figure out which GUS table to use to store the PrositeDB
> master data.
> Can anyone help me with that, please?
> It contains the following types of information:
> ID  Identification (Begins each entry; 1 per entry)
> AC  Accession number (1 per entry)
> DT  Date (1 per entry)
> DE  Short description (1 per entry)
> PA  Pattern (>1 per entry)
> MA  Matrix/profile (>1 per entry)
> RU  Rule (>1 per entry)
> NR  Numerical results (>1 per entry)
> CC  Comments (>=1 per entry)
> DR  Cross-references to Swiss-Prot (>1 per entry)
> 3D  Cross-references to PDB (>1 per entry)
> DO  Pointer to the documentation file (1 per entry)
>
> Any help on this will be appreciated.
>
> Thanks
> Sanjeev
>
> -----Original Message-----
> From: gus...@li...
> [mailto:gus...@li...]On Behalf Of
> gus...@li...
> Sent: Tuesday, August 02, 2005 11:09 PM
> To: gus...@li...
> Subject: Gusdev-gusdev digest, Vol 1 #637 - 1 msg
>
> Send Gusdev-gusdev mailing list submissions to
>     gus...@li...
>
> To subscribe or unsubscribe via the World Wide Web, visit
>     https://lists.sourceforge.net/lists/listinfo/gusdev-gusdev
> or, via email, send a message with subject or body 'help' to
>     gus...@li...
>
> You can reach the person managing the list at
>     gus...@li...
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Gusdev-gusdev digest..."
>
> Today's Topics:
>
>    1. RE: Loading Prosite DB (Kumar, Sanjeev (Contr))
>
> Message: 1
> Subject: RE: [GUSDEV] Loading Prosite DB
> Date: Tue, 2 Aug 2005 17:04:05 -0400
> From: "Kumar, Sanjeev (Contr)" <San...@ng...>
> To: "Aaron J. Mackey" <am...@pc...>
> Cc: "Jian Lu" <jl...@vb...>,
>     <gus...@li...>
>
> So the plugin you are writing will not be taking care of the detailed
> Prosite data, right?
> If yes, then I will write a separate plugin to load the detailed Prosite
> master data.
>
> Thanks
> Sanjeev
>
> -----Original Message-----
> From: Aaron J. Mackey [mailto:am...@pc...]
> Sent: Tuesday, August 02, 2005 5:01 PM
> To: Kumar, Sanjeev (Contr)
> Cc: Jian Lu; gus...@li...
> Subject: Re: [GUSDEV] Loading Prosite DB
>
> Yes, InterProScan only provides domain analysis results, not the
> actual domain/pattern/motif databases themselves.
>
> -Aaron
>
> On Aug 2, 2005, at 4:54 PM, Kumar, Sanjeev (Contr) wrote:
>
>> Hi Aaron/Jian,
>> The InterProScan data has only the Prosite ID and description in it.
>> But to load other information for a Prosite ID, we need to load the
>> Prosite data, which comes in a different format than InterPro.
>> That is what I found. Do you copy?
>>
>> Thanks
>> Sanjeev
>>
>> -----Original Message-----
>> From: Aaron J. Mackey [mailto:am...@pc...]
>> Sent: Tuesday, August 02, 2005 4:43 PM
>> To: Jian Lu
>> Cc: Kumar, Sanjeev (Contr)
>> Subject: Re: [GUSDEV] Loading Prosite DB
>>
>> From http://www.ebi.ac.uk/interpro/README1.html
>>
>> PROSITE patterns.
>> Some biologically significant amino acid patterns can be summarised
>> in the form of regular expressions.
>>
>> ScanRegExp (by Wol...@eb...), Ppsearch (Fuchs, R. 1994).
>>
>> PROSITE profile.
>> There are a number of protein families as well as functional or
>> structural domains that cannot be detected using patterns due to
>> their extreme sequence divergence; the use of techniques based on
>> weight matrices (also known as profiles) allows the detection of such
>> domains.
>>
>> pfscan from the Pftools package (by Phi...@is...).
>>
>> PRINTS.
>> The PRINTS database houses a collection of protein family
>> fingerprints. These are groups of motifs that together are
>> diagnostically more potent than single motifs by making use of the
>> biological context inherent in a multiple-motif method.
>>
>> FingerPRINTScan (Scordis, P. et al. 1999).
>>
>> PFAM.
>> Pfam is a database of protein domain families. Pfam contains curated
>> multiple sequence alignments for each family and corresponding
>> profile hidden Markov models (HMMs).
>>
>> hmmpfam from the HMMER2.1 package (by Sean Eddy,
>> ed...@ge..., http://hmmer.wustl.edu),
>> DeCypher™ (TimeLogic) implementation of HMM search.
>>
>> PRODOM.
>> ProDom families are built by an automated process based on a
>> recursive use of PSI-BLAST homology searches.
>>
>> BlastProDom.pl (by Florence Servant, fse...@to...)
>> - a filter on top of the Blast package (Altschul, S. F. et al. 1997).
>>
>> SMART.
>> SMART domains are extensively annotated with respect to phyletic
>> distributions, functional class, tertiary structures and functionally
>> important residues. SMART alignments are optimised manually and
>> following construction of corresponding hidden Markov models (HMMs).
>>
>> hmmpfam from the HMMER2.1 package.
>>
>> TIGRFAMs.
>> TIGRFAMs are a collection of protein families featuring curated
>> multiple sequence alignments, Hidden Markov Models (HMMs) and
>> associated information designed to support the automated functional
>> identification of proteins by sequence homology. Classification by
>> equivalog family (see below), where achievable, complements
>> classification by orthologs, superfamily, domain or motif. It
>> provides the information best suited for automatic assignment of
>> specific functions to proteins from large scale genome sequencing
>> projects
>>
>> hmmpfam from the HMMER2.1 package.
>>
>> Optionally, predictions for coiled-coil, signal peptide cleavage
>> sites (SignalP v2) and TM helices (TMHMM v2) are supported.
>>
>> On Aug 2, 2005, at 4:32 PM, Jian Lu wrote:
>>
>>> I don't think so. Here is the data sheet from InterProScan.
>>>
>>> Kumar, Sanjeev (Contr) wrote:
>>>
>>>> Hi Aaron/Jian,
>>>> What types of data are we talking about in the InterProScan plugin?
>>>> Does it include Prosite data?
>>>> Thanks
>>>> Sanjeev
>>>>
>>>> -----Original Message-----
>>>> From: Jian Lu [mailto:jl...@vb...]
>>>> Sent: Tuesday, August 02, 2005 2:02 PM
>>>> To: Aaron J. Mackey
>>>> Cc: Kumar, Sanjeev (Contr); gus...@li...
>>>> Subject: Re: [GUSDEV] Loading Prosite DB
>>>>
>>>> Aaron,
>>>>
>>>> We are also working on InterProScan and other analysis tools, but
>>>> we haven't got a plugin yet. If your plugin is ready, I would like
>>>> to play with it. Here is the view that we created for InterProScan.
>>>> Please comment on it. Thanks.
>>>>
>>>> --
>>>> -- VIEW DOTS.INTERPROSCAN
>>>> -- used to store outputs from InterProScan
>>>> -- June 29, 2005
>>>>
>>>
>>> <InterProScan_OUTPUT.pdf>
>>>
>>
>> --
>> Aaron J. Mackey, Ph.D.
>> Project Manager, ApiDB Bioinformatics Resource Center
>> Penn Genomics Institute, University of Pennsylvania
>> email: am...@pc...
>> office: 215-898-1205 (Goddard) / 215-746-7018 (PCBI)
>> fax: 215-746-6697
>> postal: Penn Genomics Institute
>>         Goddard Labs 212
>>         415 S. University Avenue
>>         Philadelphia, PA 19104-6017
>>
>> _______________________________________________
>> Gusdev-gusdev mailing list
>> Gus...@li...
>> https://lists.sourceforge.net/lists/listinfo/gusdev-gusdev
>
> End of Gusdev-gusdev Digest
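
A minimal Python sketch of the mechanical steps this thread touches on: splitting a PROSITE entry into its two-character line codes (ID, AC, PA, DR, ...), pulling the Swiss-Prot cross-references out of the DR lines, and turning the PA pattern into an ordinary regular expression. This is not GUS plugin code and says nothing about which GUS table the data belongs in. The names `parse_entry`, `cross_references` and `prosite_to_regex`, the trimmed sample entry, and the toy sequence are all assumptions made up for illustration, and only the common pattern elements (x, [..], {..}, repeat counts, < and > anchors) are handled.

```python
import re

# Trimmed copy of the PS00010 entry from this thread (DR/3D lines shortened).
SAMPLE = """\
ID   ASX_HYDROXYL; PATTERN.
AC   PS00010;
DE   Aspartic acid and asparagine hydroxylation site.
PA   C-x-[DN]-x(4)-[FY]-x-C-x-C.
DR   P31696, AGRN_CHICK , T; Q90404, AGRN_DISOM , T;
DR   O15943, CADN_DROME , F;
3D   1APO; 1APQ;
DO   PDOC00010;
//
"""


def parse_entry(text):
    """Group the lines of one entry by line code, stopping at the '//' terminator."""
    fields = {}
    for line in text.splitlines():
        if line.startswith("//"):
            break
        code, _, value = line.partition(" ")
        if len(code) == 2:  # skips blank or malformed lines
            fields.setdefault(code, []).append(value.strip())
    return fields


def cross_references(fields):
    """Return (accession, entry name, flag) tuples from the DR lines."""
    refs = []
    for line in fields.get("DR", []):
        for item in line.split(";"):
            if item.strip():
                accession, name, flag = (part.strip() for part in item.split(","))
                refs.append((accession, name, flag))
    return refs


# One pattern element: optional '<', a residue / x / [set] / {excluded set},
# an optional (n) or (n,m) repeat, and an optional '>'.
ELEMENT = re.compile(
    r"^(?P<start><)?"
    r"(?P<core>x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})"
    r"(?:\((?P<lo>\d+)(?:,(?P<hi>\d+))?\))?"
    r"(?P<end>>)?$"
)


def prosite_to_regex(pattern):
    """Translate a PROSITE PA pattern into Python regular-expression syntax."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        m = ELEMENT.match(element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core = m["core"]
        if core == "x":
            core = "."  # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"  # excluded residues
        if m["lo"]:
            repeat = m["lo"] + ("," + m["hi"] if m["hi"] else "")
            core += "{" + repeat + "}"
        parts.append(("^" if m["start"] else "") + core + ("$" if m["end"] else ""))
    return "".join(parts)


if __name__ == "__main__":
    entry = parse_entry(SAMPLE)
    regex = prosite_to_regex("".join(entry["PA"]))
    print(entry["AC"][0], regex)
    print(cross_references(entry))
    # Toy sequence invented for this example, not a real protein.
    print(re.search(regex, "AACADAAAAFACACAA"))
```

Keeping the fields as a plain dict of lists leaves open whether each line code ends up as a sequence, feature, or annotation row; the sketch only shows that the flat file separates cleanly along those codes.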