From: <tw...@cs...> - 2004-03-12 17:04:53
Angel,

This sounds good. I am not working on the KEGG tables at the moment, so I
will leave that and OMIM to Thomas, but "I'll be back..." (spoken with a
thick Austrian accent).

I would also be tempted to ask for:

1. test data for the tables and a plugin to load them, if not to query
   them (maybe that's the "data migration scripts")
2. an entity-relationship or UML-like diagram as part of the
   documentation you mention

The one problem I see is that after someone goes through all the trouble
of creating new stuff, the GUS core group tells them, "hey, look over
there..." Maybe some preliminary review would help, but gusdev could be that.

Terry

On Fri, Mar 12, 2004 at 11:20:44AM -0500, Angel Pizarro wrote:
> Thomas and Terry,
> I think that both OMIM and KEGG are good candidates for integration into
> GUS. Here is my advice on how to proceed and get you guys running as
> quickly as possible:
>
> 1) Create new schema spaces OMIM and KEGG. Enter this data into
>    Core.DatabaseInfo.
> 2) Make the tables GUS compliant and enter the Core.TableInfo information.
> 3) Update the timestamp of $PROJECT_HOME/GUS/Model/schema/VERSION by:
>    %> touch $PROJECT_HOME/GUS/Model/schema/VERSION
> 4) Rebuild the objects by:
>
>    %> build GUS/Model install -append
>
> This should create new objects for these schema spaces for use in plugins.
>
> Document your schema and send it to the gusdev list once you have the
> bugs worked out, and THEN we can look to see if existing GUS tables
> already fill that role, or if these are new tables that should be folded
> into GUS proper, or if these new tables actually do a better job than
> the current GUS tables that hold this data.
> I think this is a good model for further GUS development, and we can
> release these types of development efforts as "contributed" modules of
> code. The contributed modules can be upgraded to "official" after they
> pass a review process and provide data migration scripts.
>
> Any comments?
> Angel
>
> Thomas Otto wrote:
>
> >ftp://ftp.genome.ad.jp/pub/
> >ftp://ftp.genome.ad.jp/pub/kegg/tarfiles
> >From the second directory I downloaded and parsed the tar files. These
> >are supposed to be the files as the kegg-engine uses them.
> >
> >Thomas
> >
> >Terry Clark wrote:
> >
> >>Thomas,
> >>Could you send a pointer to the KEGG FTP URL you are
> >>accessing and the names (or types) of files that you are using? I
> >>would appreciate this to see how
> >>others are using KEGG.
> >>
> >>OMIM is orthogonal; I just described it and the
> >>parts of KEGG I addressed to explain the material
> >>on my web site.
> >>
> >>Terry
> >>
> >>On Fri, Mar 12, 2004 at 11:08:29AM -0300, Thomas Otto wrote:
> >>
> >>>Terry,
> >>>
> >>>First, I parse the information from the files on the FTP server. I
> >>>do not use the XML stuff.
> >>>
> >>>True, there is a lot of stuff in KEGG, but I want to represent it in
> >>>GUS, so...
> >>>
> >>>... I think a link to OMIM makes sense, also to Motif/prosite. For
> >>>me it is important to have all the relations without redundancy.
> >>>
> >>>Give me some time; I will see which parts might help me.
> >>>
> >>>Cheers,
> >>>Thomas
> >>>
> >>>Terry Clark wrote:
> >>>
> >>>>Thomas,
> >>>>
> >>>>A few months ago I worked out a preliminary set of tables
> >>>>for *some* KEGG pathway data.
> >>>>(By the way, those reactions are not complete in some of
> >>>>the XML files.) As I expect you know, there are two representations
> >>>>in the XML: one with proteins as nodes; the other with reaction
> >>>>products as nodes and proteins as edges. I focused on explicitly
> >>>>representing the latter (the <reaction> tag part of the XML) but
> >>>>have toyed with the idea of doing a comprehensive KEGG representation
> >>>>in GUS unless someone else does it first :-)
> >>>>
> >>>>I also have a plugin to load the tables. All I am discussing has
> >>>>been implemented as a site-specific/local extension to GUS.
> >>>>I put the database files (not the plugin - we can talk
> >>>>about that later if you are interested) on the web page
> >>>>
> >>>>http://flora.uchicago.edu/gus/keggschemadraft/
> >>>>
> >>>>One file you might check first at the URL is *** kegg-tables.sql ***
> >>>>
> >>>>I am not satisfied with some of the design, and there may be some
> >>>>uninspired things there - caveat emptor. I was waiting for KEGG to
> >>>>fix some problems in the reaction tags,
> >>>>and put together this operational db prototype in the meantime.
> >>>>(For example, check the missing - as of three weeks ago still -
> >>>>reaction tags in the XML for phenylalanine metabolism.)
> >>>>
> >>>>Anyway, the files I point to above might be useful to start with
> >>>>rather than starting from scratch.
> >>>>
> >>>>I also hacked together some table(s) for EnzymeCommission
> >>>>numbers; there might be something in GUS for that, but at
> >>>>the time what I did seemed faster for my needs than otherwise.
> >>>>The OMIM parts there are not related to KEGG but part of a project
> >>>>I am working on.
> >>>>
> >>>>Terry
> >>>>
> >>>>On Thu, Mar 11, 2004 at 04:58:48PM -0300, Thomas Otto wrote:
> >>>>
> >>>>>I think I will put all the relations in one table... sort of
> >>>>>EnzymeRelations...
> >>>>>
> >>>>>Thomas
> >>>>>Thomas Otto wrote:
> >>>>>
> >>>>>>Okay, an example:
> >>>>>>Several reactions are associated with every enzyme, i.e. enzyme a
> >>>>>>has reaction 1, ..., reaction n.
> >>>>>>(1) would be to save the names of the reactions in the table
> >>>>>>EnzymeClassAttribute.
> >>>>>>(2) would be to save nothing of this in
> >>>>>>EnzymeClassAttribute, but in a table (to be created),
> >>>>>>EnzymeReactionRelation.
> >>>>>>
> >>>>>>This will mean a lot of new tables, because just for the enzymes
> >>>>>>there are pathways, reactions, compounds, substrates, products...
> >>>>>>
> >>>>>>But indeed, I also prefer #2, which is less redundant.
> >>>>>>
> >>>>>>Thanks,
> >>>>>>Thomas
> >>>>>>
> >>>>>>Angel Pizarro wrote:
> >>>>>>
> >>>>>>>In the absence of example data or a definite schema,
> >>>>>>>definitely #2 as the data model of choice. You can always
> >>>>>>>implement a materialized view to speed up/ease queries.
> >>>>>>>
> >>>>>>>Cheers
> >>>>>>>Angel
> >>>>>>>
> >>>>>>>Thomas Otto wrote:
> >>>>>>>
> >>>>>>>>That's what I was thinking about. So I want to know which
> >>>>>>>>representation of the data you prefer (1 or 2; it is more a
> >>>>>>>>general question).
> >>>>>>>>
> >>>>>>>>Thomas
> >>>>>>>>
> >>>>>>>>Angel Pizarro wrote:
> >>>>>>>>
> >>>>>>>>>OK, I don't see where you are getting EnzymeCompoundRelation &
> >>>>>>>>>CompoundClassAttribute from. Our current version of GUS does
> >>>>>>>>>not have these tables. The SRes.EnzymeClass table is also
> >>>>>>>>>looking very wacky to me. Propose a nice structure for this
> >>>>>>>>>and we can put it in the next release of GUS.
> >>>>>>>>>
> >>>>>>>>>Angel
> >>>>>>>>>
> >>>>>>>>>Thomas Otto wrote:
> >>>>>>>>>
> >>>>>>>>>>Hello,
> >>>>>>>>>>
> >>>>>>>>>>I am uploading some KEGG data into the GUS system...
> >>>>>>>>>>
> >>>>>>>>>>So I looked at which tables exist and which we
> >>>>>>>>>>need, e.g. for compounds and reactions. Now I am not sure what
> >>>>>>>>>>the convention is for relating the data.
> >>>>>>>>>>
> >>>>>>>>>>Example:
> >>>>>>>>>>The enzymes are related to the compounds.
> >>>>>>>>>>(1) In EnzymeClassAttribute I write the names of the compounds
> >>>>>>>>>>related to each enzyme. In CompoundClassAttribute I also write
> >>>>>>>>>>all EC numbers related to it.
> >>>>>>>>>>(2) The other possibility would be to use a third table,
> >>>>>>>>>>EnzymeCompoundRelation, where I put the primary keys of the
> >>>>>>>>>>relations.
> >>>>>>>>>>
> >>>>>>>>>>(1) would be faster and easier to query, but redundant.
> >>>>>>>>>>(2) would be cleaner.
> >>>>>>>>>>
> >>>>>>>>>>What do you think I should use?
> >>>>>>>>>>
> >>>>>>>>>>Thanks,
> >>>>>>>>>>Thomas
> >>>>>>>>>>
> >>>>>>>>>>-------------------------------------------------------
> >>>>>>>>>>This SF.Net email is sponsored by: IBM Linux Tutorials
> >>>>>>>>>>Free Linux tutorial presented by Daniel Robbins, President
> >>>>>>>>>>and CEO of
> >>>>>>>>>>GenToo technologies. Learn everything from fundamentals to
> >>>>>>>>>>system
> >>>>>>>>>>administration.http://ads.osdn.com/?ad_id=1470&alloc_id=3638&op=click
> >>>>>>>>>>
> >>>>>>>>>>_______________________________________________
> >>>>>>>>>>Gusdev-gusdev mailing list
> >>>>>>>>>>Gus...@li...
> >>>>>>>>>>https://lists.sourceforge.net/lists/listinfo/gusdev-gusdev
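
Option (2) from the thread, a relation table holding only primary keys,
can be sketched as follows. This is a minimal sketch, assuming a plain
SQLite database driven from Python rather than a GUS instance. Only the
name EnzymeCompoundRelation comes from the thread; the other tables, the
column names (enzyme_id, compound_id, ec_number, name), the view
EnzymeCompoundNames, and the helper functions are hypothetical and are not
taken from the GUS schema. SQLite has no materialized views, so an ordinary
view stands in for the materialized view suggested above. The three sample
rows are illustrative only.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database illustrating option (2)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Hypothetical tables; NOT the actual GUS schema.
        CREATE TABLE Enzyme (
            enzyme_id INTEGER PRIMARY KEY,
            ec_number TEXT NOT NULL UNIQUE
        );
        CREATE TABLE Compound (
            compound_id INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE
        );
        -- Option (2): one row per enzyme/compound pair, keys only,
        -- so no compound name or EC number is stored twice.
        CREATE TABLE EnzymeCompoundRelation (
            enzyme_id INTEGER NOT NULL REFERENCES Enzyme(enzyme_id),
            compound_id INTEGER NOT NULL REFERENCES Compound(compound_id),
            PRIMARY KEY (enzyme_id, compound_id)
        );
        -- Plain view standing in for a materialized view (assumption).
        CREATE VIEW EnzymeCompoundNames AS
            SELECT e.ec_number, c.name AS compound_name
            FROM EnzymeCompoundRelation r
            JOIN Enzyme e ON e.enzyme_id = r.enzyme_id
            JOIN Compound c ON c.compound_id = r.compound_id;
    """)
    # Illustrative sample rows only.
    conn.executemany("INSERT INTO Enzyme VALUES (?, ?)",
                     [(1, "1.1.1.1"), (2, "2.7.1.1")])
    conn.executemany("INSERT INTO Compound VALUES (?, ?)",
                     [(1, "ethanol"), (2, "ATP"), (3, "D-glucose")])
    conn.executemany("INSERT INTO EnzymeCompoundRelation VALUES (?, ?)",
                     [(1, 1), (2, 2), (2, 3)])
    conn.commit()
    return conn


def compounds_for(conn, ec_number):
    """Return the compound names related to one EC number, sorted."""
    rows = conn.execute(
        "SELECT compound_name FROM EnzymeCompoundNames "
        "WHERE ec_number = ? ORDER BY compound_name", (ec_number,))
    return [name for (name,) in rows]


def enzymes_for(conn, compound_name):
    """Return the EC numbers related to one compound, sorted."""
    rows = conn.execute(
        "SELECT ec_number FROM EnzymeCompoundNames "
        "WHERE compound_name = ? ORDER BY ec_number", (compound_name,))
    return [ec for (ec,) in rows]


if __name__ == "__main__":
    db = build_demo_db()
    print(compounds_for(db, "2.7.1.1"))
    print(enzymes_for(db, "ethanol"))
```

Both lookup directions go through the same relation table, which is the
non-redundant property option (2) is after; the view only hides the joins
and can be dropped or replaced without touching the stored data.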