From: Thomas O. <ot...@fi...> - 2004-03-12 17:20:33
Hi,

I think I will give KEGG a try... now I have an idea of the relations and the first steps. The moment I am done, I will provide a new namespace, a ga module for uploading, and the link to the data.

Cheers,
Thomas

Terry Clark wrote:

>Angel,
>
>This sounds good. I am not working on the KEGG tables
>at the moment so will leave that and OMIM to Thomas, but
>"I'll be back..." (spoken with thick Austrian accent).
>
>I would be tempted also to ask for
>
> 1. test data for the tables and a plugin to load them,
>    if not to query them (maybe that's the "data migration scripts")
>
> 2. an entity-relationship or UML-like diagram
>    as part of the documentation you mention
>
>The one problem I see is that after someone goes through
>all the trouble of creating new stuff, the GUS core group
>tells them, hey, look over there... Maybe some preliminary
>review, but gusdev could be that.
>
>Terry
>
>On Fri, Mar 12, 2004 at 11:20:44AM -0500, Angel Pizarro wrote:
>
>>Thomas and Terry,
>>I think that both OMIM and KEGG are good candidates for integration into
>>GUS. Here is my advice on how to proceed and get you guys running as
>>quickly as possible:
>>
>>1) Create new schema spaces OMIM and KEGG. Enter this data into
>>Core.DatabaseInfo.
>>2) Make the tables GUS compliant and enter the Core.TableInfo information.
>>3) Update the timestamp of $PROJECT_HOME/GUS/Model/schema/VERSION by:
>>%> touch $PROJECT_HOME/GUS/Model/schema/VERSION
>>4) Rebuild the objects by:
>>
>>%> build GUS/Model install -append
>>
>>This should create new objects for these schema spaces for use in plugins.
>>
>>Document your schema and send it to the gusdev list once you have the
>>bugs worked out, and THEN we can look to see if existing GUS tables
>>already fill that role, or if these are new tables that should be folded
>>into GUS proper, or if these new tables actually do a better job than
>>the current GUS tables that hold this data.
>>
>>I think this is a good model for further GUS development, and we can
>>release these types of development efforts as "contributed" modules of
>>code. The contributed modules can be upgraded to "official" after they
>>pass a review process and provide data migration scripts.
>>
>>Any comments?
>>Angel
>>
>>Thomas Otto wrote:
>>
>>>ftp://ftp.genome.ad.jp/pub/
>>>ftp://ftp.genome.ad.jp/pub/kegg/tarfiles
>>>
>>>From the second directory I downloaded and parsed the tar files. These
>>>are supposed to be the files as the KEGG engine uses them.
>>>
>>>Thomas
>>>
>>>Terry Clark wrote:
>>>
>>>>Thomas,
>>>>Could you send a pointer to the KEGG FTP URL you are
>>>>accessing and the names (or types) of files that you are using? I
>>>>would appreciate this to see how
>>>>others are using KEGG.
>>>>
>>>>OMIM is orthogonal; I just described it and the
>>>>parts of KEGG I addressed to explain the material
>>>>on my web site.
>>>>
>>>>Terry
>>>>
>>>>On Fri, Mar 12, 2004 at 11:08:29AM -0300, Thomas Otto wrote:
>>>>
>>>>>Terry,
>>>>>
>>>>>First, I parse the information from the files on the FTP server. I
>>>>>do not use the XML stuff.
>>>>>
>>>>>True, there is a lot of stuff in KEGG, but I want to represent it in
>>>>>GUS, so...
>>>>>
>>>>>... I think a link to OMIM makes sense, also to Motif/PROSITE. For
>>>>>me it is important to have all the relations without redundancy.
>>>>>
>>>>>Give me some time; I will see which parts might help me.
>>>>>
>>>>>Cheers,
>>>>>Thomas
>>>>>
>>>>>Terry Clark wrote:
>>>>>
>>>>>>Thomas,
>>>>>>
>>>>>>A few months ago I worked out a preliminary set of tables
>>>>>>for *some* KEGG pathway data.
>>>>>>(By the way, those reactions are not complete in some of
>>>>>>the XML files.) As I expect you know, there are two representations
>>>>>>in the XML: one with proteins as nodes; the other with reaction
>>>>>>products as nodes and proteins as edges.
>>>>>>I focused on explicitly
>>>>>>representing the latter (the <reaction> tag part of the XML) but
>>>>>>have toyed with the idea of doing a comprehensive KEGG representation
>>>>>>in GUS unless someone else does it first :-)
>>>>>>
>>>>>>I also have a plugin to load the tables. All I am discussing has
>>>>>>been implemented as a site-specific/local extension to GUS. I put
>>>>>>the database files (not the plugin - we can talk
>>>>>>about that later if you are interested) on the web page
>>>>>>
>>>>>>http://flora.uchicago.edu/gus/keggschemadraft/
>>>>>>
>>>>>>One file you might check first at the URL is *** kegg-tables.sql ***
>>>>>>
>>>>>>I am not satisfied with some of the design, and there may be some
>>>>>>uninspired things there - caveat emptor. I was waiting for KEGG to
>>>>>>fix some problems in the reaction tags,
>>>>>>and put together this operational db prototype in the meantime.
>>>>>>(For example, check the reaction tags in the XML for phenylalanine
>>>>>>metabolism, which were still missing as of three weeks ago.)
>>>>>>
>>>>>>Anyway, the files I point to above might be useful to start with
>>>>>>rather than starting from scratch.
>>>>>>
>>>>>>I also hacked together some table(s) for EnzymeCommission
>>>>>>numbers; there might be something in GUS for that, but at
>>>>>>the time what I did seemed faster for my needs than otherwise.
>>>>>>The OMIM parts there are not related to KEGG but part of a project
>>>>>>I am working on.
>>>>>>
>>>>>>Terry
>>>>>>
>>>>>>On Thu, Mar 11, 2004 at 04:58:48PM -0300, Thomas Otto wrote:
>>>>>>
>>>>>>>I think I will put all the relations in one table... sort of
>>>>>>>EnzymeRelations...
>>>>>>>
>>>>>>>Thomas
>>>>>>>Thomas Otto wrote:
>>>>>>>
>>>>>>>>Okay, an example:
>>>>>>>>Several reactions are associated with every enzyme, i.e. Enzyme a
>>>>>>>>has reaction 1, ..., reaction n.
>>>>>>>>(1) One option would be to save the name of the reaction in the table
>>>>>>>>EnzymeClassAttribute.
>>>>>>>>(2) The other would be to save nothing of this in
>>>>>>>>EnzymeClassAttribute, but in a table (to be created)
>>>>>>>>EnzymeReactionRelation.
>>>>>>>>
>>>>>>>>This will mean a lot of new tables, because just for the enzymes
>>>>>>>>there are pathways, reactions, compounds, substrates, products...
>>>>>>>>
>>>>>>>>But indeed, I also prefer #2, which is less redundant.
>>>>>>>>
>>>>>>>>Thanks,
>>>>>>>>Thomas
>>>>>>>>
>>>>>>>>Angel Pizarro wrote:
>>>>>>>>
>>>>>>>>>In the absence of example data or a definite schema,
>>>>>>>>>definitely #2 as the data model of choice. You can always
>>>>>>>>>implement a materialized view to speed up/ease queries.
>>>>>>>>>
>>>>>>>>>Cheers
>>>>>>>>>Angel
>>>>>>>>>
>>>>>>>>>Thomas Otto wrote:
>>>>>>>>>
>>>>>>>>>>That's what I was thinking about. So I want to know which
>>>>>>>>>>representation of the data you prefer (1 or 2; it is more a
>>>>>>>>>>general question).
>>>>>>>>>>
>>>>>>>>>>Thomas
>>>>>>>>>>
>>>>>>>>>>Angel Pizarro wrote:
>>>>>>>>>>
>>>>>>>>>>>OK, I don't see where you are getting EnzymeCompoundRelation &
>>>>>>>>>>>CompoundClassAttribute from. Our current version of GUS does
>>>>>>>>>>>not have these tables. The SRes.EnzymeClass table is also
>>>>>>>>>>>looking very wacky to me. Propose a nice structure for this
>>>>>>>>>>>and we can put it in the next release of GUS.
>>>>>>>>>>>
>>>>>>>>>>>Angel
>>>>>>>>>>>
>>>>>>>>>>>Thomas Otto wrote:
>>>>>>>>>>>
>>>>>>>>>>>>Hello,
>>>>>>>>>>>>
>>>>>>>>>>>>I am uploading some KEGG data into the GUS system...
>>>>>>>>>>>>
>>>>>>>>>>>>So I looked at which tables exist and which we
>>>>>>>>>>>>need, e.g. for compounds and reactions.
>>>>>>>>>>>>Now I am not sure what
>>>>>>>>>>>>the convention is for relating the data.
>>>>>>>>>>>>
>>>>>>>>>>>>Example:
>>>>>>>>>>>>The enzymes are related to the compounds.
>>>>>>>>>>>>(1) In EnzymeClassAttribute I write the names of the compounds
>>>>>>>>>>>>related to each enzyme. In CompoundClassAttribute I also write
>>>>>>>>>>>>all EC numbers related to it.
>>>>>>>>>>>>(2) The other possibility would be to use a third table,
>>>>>>>>>>>>EnzymeCompoundRelation, where I put the primary keys of the
>>>>>>>>>>>>related rows.
>>>>>>>>>>>>
>>>>>>>>>>>>(1) would be faster and easier to query, but redundant.
>>>>>>>>>>>>(2) would be cleaner.
>>>>>>>>>>>>
>>>>>>>>>>>>Which do you think I should use?
>>>>>>>>>>>>
>>>>>>>>>>>>Thanks,
>>>>>>>>>>>>Thomas
>>>>>>>>>>>>
>>>>>>>>>>>>-------------------------------------------------------
>>>>>>>>>>>>This SF.Net email is sponsored by: IBM Linux Tutorials
>>>>>>>>>>>>Free Linux tutorial presented by Daniel Robbins, President and CEO of
>>>>>>>>>>>>GenToo technologies. Learn everything from fundamentals to system
>>>>>>>>>>>>administration.http://ads.osdn.com/?ad_id=1470&alloc_id=3638&op=click
>>>>>>>>>>>>
>>>>>>>>>>>>_______________________________________________
>>>>>>>>>>>>Gusdev-gusdev mailing list
>>>>>>>>>>>>Gus...@li...
>>>>>>>>>>>>https://lists.sourceforge.net/lists/listinfo/gusdev-gusdev
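
To make option (2) from the original question in this thread concrete, here is a minimal sketch of a relation (junction) table linking enzymes and compounds by primary key, plus a view that gives the convenient "names next to EC numbers" reading of option (1) without storing anything twice. Everything in it is an assumption made for illustration: the table and column names are simplified and hypothetical, not the real GUS or SRes definitions; the sample rows are only there to have something to query; and SQLite is used solely so the sketch runs standalone. SQLite offers ordinary views only, so a plain view stands in for the materialized view suggested above.

```python
import sqlite3

# Hypothetical, simplified tables. These are NOT the real GUS definitions.
SCHEMA = """
CREATE TABLE enzyme_class (
    enzyme_class_id INTEGER PRIMARY KEY,
    ec_number       TEXT NOT NULL UNIQUE
);
CREATE TABLE compound (
    compound_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE
);
-- Option (2): one row per enzyme/compound pair, holding only primary keys.
CREATE TABLE enzyme_compound_relation (
    enzyme_class_id INTEGER NOT NULL REFERENCES enzyme_class(enzyme_class_id),
    compound_id     INTEGER NOT NULL REFERENCES compound(compound_id),
    PRIMARY KEY (enzyme_class_id, compound_id)
);
-- Plain view standing in for a materialized view (assumption, see lead-in).
CREATE VIEW enzyme_compound_names AS
SELECT e.ec_number AS ec_number, c.name AS compound_name
FROM enzyme_compound_relation r
JOIN enzyme_class e ON e.enzyme_class_id = r.enzyme_class_id
JOIN compound c ON c.compound_id = r.compound_id;
"""


def build_demo_db():
    """Create an in-memory database with a few illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO enzyme_class VALUES (?, ?)",
        [(1, "1.1.1.1"), (2, "2.7.1.1")],
    )
    conn.executemany(
        "INSERT INTO compound VALUES (?, ?)",
        [(1, "ethanol"), (2, "D-glucose"), (3, "ATP")],
    )
    conn.executemany(
        "INSERT INTO enzyme_compound_relation VALUES (?, ?)",
        [(1, 1), (2, 2), (2, 3)],
    )
    conn.commit()
    return conn


def compounds_for(conn, ec_number):
    """Return the compound names related to one EC number, sorted by name."""
    rows = conn.execute(
        "SELECT compound_name FROM enzyme_compound_names "
        "WHERE ec_number = ? ORDER BY compound_name",
        (ec_number,),
    ).fetchall()
    return [name for (name,) in rows]


def enzymes_for(conn, compound_name):
    """Return the EC numbers related to one compound, sorted."""
    rows = conn.execute(
        "SELECT ec_number FROM enzyme_compound_names "
        "WHERE compound_name = ? ORDER BY ec_number",
        (compound_name,),
    ).fetchall()
    return [ec for (ec,) in rows]
```

Each enzyme/compound pair is stored exactly once, and both directions of the lookup (compounds for an enzyme, enzymes for a compound) read from the same rows. A reaction, pathway, or substrate/product relation could follow the same pattern with its own relation table.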