From: Jinal J. <jjh...@vb...> - 2004-08-25 21:48:07
|
Hi,

As I understand it, the "gbparser" loads the unique genes obtained from the GenBank file into the dots.nagene table. Our group is planning to do annotation (and reannotation) of several genomes, so we might have multiple instances of the same gene, and the reviewers might add comments, etc. The information this requires is not available in dots.nagene but in dots.gene (which makes sense). I do understand the handling of the central dogma (thanks to the excellent explanation on the wiki), but what I don't understand is how to handle the transition between each phase. How should one load the genes obtained from the GenBank file (i.e. in dots.nagene) into dots.gene (and dots.geneinstance)? Is it a matter of writing our own script/plugin (I am fine with that), or am I missing a very basic step of the pipeline?

I am asking because our project is at a stage where I have to make a decision (which might be very inefficient/wrong from the design perspective but FAST as far as time is concerned): whether to change the GUS schema and leave the data in the dots.nafeatureimp and dots.nagene tables, OR to use the dots.gene, dots.geneinstance, etc. tables and proceed logically.

Please help!

On another note, I remember Steve mentioning, in one of the mails, plans to arrange a GUS Users Meeting at PCBI. IMHO it would be great to have that, as I know of at least 3 to 4 more groups (at Virginia Bioinformatics Institute) that are planning to use GUS, and everyone is waiting for a GUS Users Meeting like this where we can clear up our doubts and learn to use GUS more efficiently in our projects. I am sure other groups will be interested in joining too. Any thoughts/comments on the plan?
From: Steve F. <sfi...@pc...> - 2004-08-25 22:19:07
|
To my knowledge we haven't done this yet. We are just now starting the planning phases of how to completely use the central dogma to capture genes, gene instances, RNAs, etc. So far we have only been using it partially.

Do others have opinions?

steve

_______________________________________________
Gusdev-gusdev mailing list
Gus...@li...
https://lists.sourceforge.net/lists/listinfo/gusdev-gusdev
From: Y. T. G. <yg...@pc...> - 2004-08-26 15:40:35
|
> To my knowledge we haven't done this yet. We are just now starting
> the planning phases of how to completely use the central dogma to
> capture genes, gene instances, RNAs, etc. So far we have only been
> using it partially.
>
> Others have opinions?

I think it is correct that GUS does not yet have any plugin that populates the central dogma gene tables, and that annotators make use of some of these tables.

However, in our Allgenes project (see allgenes.org), we have pipelines to build DoTS transcripts (DTs) from ESTs, and DoTS Genes (DGs) from DTs by aligning them to the genome. In the latter pipeline, I have plugins (not part of GUS yet) that do the following:

1) make a DoTS.GeneInstanceCategory entry (I made the very first non-placeholder entry, so I am pretty sure we have not used this table much before)
2) make DoTS.GeneInstance entries for each of the DGs created by my pipeline, and give them the gene_instance_category_id I got above
3) associate each DoTS.GeneInstance with a corresponding DoTS.Gene entry (for us, we already have a prior set of DoTS.Gene entries made by an orthogonal algorithm, so I only had to map my gene instances to them instead of creating DoTS.Gene entries anew)

I also created full sets of DoTS.GeneFeature, DoTS.RnaFeature, and DoTS.ExonFeature entries, but I will omit the details here since it is pretty clear from the wiki page how to do this.

-Thomas
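The three steps Thomas lists could be sketched roughly as follows. This is only an illustration, not his plugin and not the GUS plugin API: it uses an in-memory SQLite database as a stand-in, the table layouts are heavily simplified, and the `gene_id` and `na_feature_id` columns on `gene_instance`, the helper names `make_db` and `load_gene_instances`, and the sample ids are all assumptions made for the example. Only the table names and `gene_instance_category_id` come from the thread.

```python
import sqlite3

# Simplified stand-ins for DoTS.Gene, DoTS.GeneInstanceCategory and
# DoTS.GeneInstance. Column layouts are assumptions, not the real schema.
SCHEMA = """
CREATE TABLE gene (
    gene_id INTEGER PRIMARY KEY,
    name    TEXT
);
CREATE TABLE gene_instance_category (
    gene_instance_category_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE gene_instance (
    gene_instance_id INTEGER PRIMARY KEY,
    gene_instance_category_id INTEGER NOT NULL
        REFERENCES gene_instance_category (gene_instance_category_id),
    na_feature_id INTEGER NOT NULL,
    gene_id INTEGER NOT NULL REFERENCES gene (gene_id)
);
"""


def make_db():
    """Create an in-memory database with the simplified tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def load_gene_instances(conn, category_name, feature_to_gene):
    """Load gene instances for already-existing genes.

    feature_to_gene maps na_feature_id -> gene_id (hypothetical input).
    Returns the id of the category row that was created.
    """
    cur = conn.cursor()
    # Step 1: make one category entry for this batch of instances.
    cur.execute(
        "INSERT INTO gene_instance_category (name) VALUES (?)",
        (category_name,),
    )
    category_id = cur.lastrowid
    # Steps 2 and 3: one instance per feature, tagged with the category
    # and mapped to a gene row that already exists.
    cur.executemany(
        "INSERT INTO gene_instance "
        "(gene_instance_category_id, na_feature_id, gene_id) "
        "VALUES (?, ?, ?)",
        [(category_id, na_id, gene_id)
         for na_id, gene_id in feature_to_gene.items()],
    )
    conn.commit()
    return category_id


if __name__ == "__main__":
    conn = make_db()
    conn.executemany(
        "INSERT INTO gene (gene_id, name) VALUES (?, ?)",
        [(1, "geneA"), (2, "geneB")],
    )
    load_gene_instances(conn, "genome alignment", {101: 1, 102: 1, 103: 2})
    query = ("SELECT gene_id, COUNT(*) FROM gene_instance "
             "GROUP BY gene_id ORDER BY gene_id")
    for row in conn.execute(query):
        print(row)
```

Mapping instances onto existing gene rows, rather than creating genes anew, mirrors the situation Thomas describes; a loader for freshly parsed genomes would presumably need to create the gene rows first.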
From: Jinal J. <jjh...@vb...> - 2004-08-26 16:12:49
|
Thanks Thomas,

Your reply was really helpful. I was thinking along the same lines. Here is what I am planning to do. I already have one of my genomes annotated (and submitted to GenBank), and I was thinking of generating gbk files for the other genomes and loading them into GUS using gbparser. These are the steps I am planning to take:

1) Parse the GenBank file, which will fill most of my na tables
2) Fill the geneinstance and gene tables, and connect geneinstance to nafeatureimp using na_feature_id
3) Fill the aafeatureimp table and connect it to nafeatureimp using na_feature_id
4) Fill the RNA table (just ids) and connect it to the gene table
5) Fill the protein_instance table and connect it to aafeatureimp through aa_feature_id
6) Fill the protein table and establish its relation with RNA using rna_id, and of course with protein_instance

One thing that isn't clear is how I will fill in the rnainstance table. How can this information be obtained from NCBI GenBank? I understand that I can ignore it for a while and just generate fake rna_ids to establish the connection between gene_ids and protein_ids (though that would be suboptimal).

Any suggestions/comments?
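As a rough illustration of steps 2 to 6 in Jinal's plan, the key chain from a parsed feature through gene, RNA and protein might look like the sketch below. It assumes step 1 (gbparser) has already produced an `na_feature_id`, uses an in-memory SQLite stand-in with heavily simplified tables, and creates one new gene per feature. The `gene_id` and `protein_id` foreign keys on the instance tables, and the helpers `make_db`, `link_feature` and `proteins_for_gene`, are assumptions for the example rather than the actual GUS schema or API.

```python
import sqlite3

# Simplified stand-ins for the tables named in the plan. Only the key
# columns are modelled; the layouts are assumptions, not the GUS schema.
SCHEMA = """
CREATE TABLE gene (gene_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE gene_instance (
    gene_instance_id INTEGER PRIMARY KEY,
    gene_id INTEGER, na_feature_id INTEGER);
CREATE TABLE aa_feature (
    aa_feature_id INTEGER PRIMARY KEY, na_feature_id INTEGER);
CREATE TABLE rna (rna_id INTEGER PRIMARY KEY, gene_id INTEGER);
CREATE TABLE protein (protein_id INTEGER PRIMARY KEY, rna_id INTEGER);
CREATE TABLE protein_instance (
    protein_instance_id INTEGER PRIMARY KEY,
    protein_id INTEGER, aa_feature_id INTEGER);
"""


def make_db():
    """Create an in-memory database with the simplified tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def _insert(cur, sql, params):
    """Run one INSERT and return the new row id."""
    cur.execute(sql, params)
    return cur.lastrowid


def link_feature(conn, gene_name, na_feature_id):
    """Build the gene -> RNA -> protein chain for one parsed feature."""
    cur = conn.cursor()
    # Step 2: gene plus a gene instance pointing at the na feature.
    gene_id = _insert(cur, "INSERT INTO gene (name) VALUES (?)",
                      (gene_name,))
    _insert(cur,
            "INSERT INTO gene_instance (gene_id, na_feature_id) "
            "VALUES (?, ?)", (gene_id, na_feature_id))
    # Step 3: aa feature linked back to the na feature.
    aa_feature_id = _insert(
        cur, "INSERT INTO aa_feature (na_feature_id) VALUES (?)",
        (na_feature_id,))
    # Step 4: placeholder RNA row (ids only) linked to the gene.
    rna_id = _insert(cur, "INSERT INTO rna (gene_id) VALUES (?)",
                     (gene_id,))
    # Step 6 runs before step 5 here because, in this sketch, the
    # protein instance stores the protein's id.
    protein_id = _insert(cur, "INSERT INTO protein (rna_id) VALUES (?)",
                         (rna_id,))
    # Step 5: protein instance linked to the aa feature.
    _insert(cur,
            "INSERT INTO protein_instance (protein_id, aa_feature_id) "
            "VALUES (?, ?)", (protein_id, aa_feature_id))
    conn.commit()
    return gene_id, rna_id, protein_id


def proteins_for_gene(conn, gene_name):
    """Follow gene -> rna -> protein and return the protein ids."""
    rows = conn.execute(
        "SELECT p.protein_id FROM gene g "
        "JOIN rna r ON r.gene_id = g.gene_id "
        "JOIN protein p ON p.rna_id = r.rna_id "
        "WHERE g.name = ? ORDER BY p.protein_id", (gene_name,)).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = make_db()
    link_feature(conn, "geneA", 101)
    link_feature(conn, "geneB", 102)
    print(proteins_for_gene(conn, "geneB"))
```

For reannotation, where several instances share one gene, a real loader would presumably look up an existing gene row before inserting a new one; that lookup is left out here to keep the chain easy to follow.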
From: Y. T. G. <yg...@pc...> - 2004-08-26 17:10:53
|
Hi Jinal,

Your plan sounds reasonable to me. As for the question regarding RnaInstance, it is parallel to GeneInstance. You just need to make an RnaInstanceCategory entry (we have 0: unknown, 1: mRNA, 2: assembly <of ESTs>), then make RnaInstance entries associated with this category and with RNAs. Also make an RnaFeature, using info such as RNA name and number of exons, to attach to your RnaInstance.

-Thomas
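The RnaInstance advice can be sketched in the same spirit. The three category values (0: unknown, 1: mRNA, 2: assembly) are the ones Thomas mentions; everything else is an assumption for illustration: the in-memory SQLite stand-in, the simplified column layouts (including `name` and `number_of_exons` on `rna_feature`), and the helper names `make_db` and `add_rna_instance`.

```python
import sqlite3

# Category values as listed in the thread.
RNA_INSTANCE_CATEGORIES = {0: "unknown", 1: "mRNA", 2: "assembly"}

# Simplified stand-ins for the RNA-side tables. Column names are
# assumptions, not the real GUS schema.
SCHEMA = """
CREATE TABLE rna (rna_id INTEGER PRIMARY KEY, gene_id INTEGER);
CREATE TABLE rna_instance_category (
    rna_instance_category_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE);
CREATE TABLE rna_feature (
    na_feature_id INTEGER PRIMARY KEY,
    name TEXT,
    number_of_exons INTEGER);
CREATE TABLE rna_instance (
    rna_instance_id INTEGER PRIMARY KEY,
    rna_instance_category_id INTEGER NOT NULL,
    rna_id INTEGER NOT NULL,
    na_feature_id INTEGER NOT NULL);
"""


def make_db():
    """Create the simplified tables and seed the category rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO rna_instance_category "
        "(rna_instance_category_id, name) VALUES (?, ?)",
        sorted(RNA_INSTANCE_CATEGORIES.items()),
    )
    conn.commit()
    return conn


def add_rna_instance(conn, rna_id, category_name, rna_name, number_of_exons):
    """Create an RNA feature and an instance tying it to an RNA row."""
    cur = conn.cursor()
    row = cur.execute(
        "SELECT rna_instance_category_id FROM rna_instance_category "
        "WHERE name = ?", (category_name,)).fetchone()
    if row is None:
        raise ValueError(f"unknown RNA instance category: {category_name}")
    # The feature carries the descriptive info (name, exon count).
    cur.execute(
        "INSERT INTO rna_feature (name, number_of_exons) VALUES (?, ?)",
        (rna_name, number_of_exons))
    na_feature_id = cur.lastrowid
    # The instance links category, RNA and feature together.
    cur.execute(
        "INSERT INTO rna_instance "
        "(rna_instance_category_id, rna_id, na_feature_id) "
        "VALUES (?, ?, ?)", (row[0], rna_id, na_feature_id))
    rna_instance_id = cur.lastrowid
    conn.commit()
    return rna_instance_id


if __name__ == "__main__":
    conn = make_db()
    conn.execute("INSERT INTO rna (rna_id, gene_id) VALUES (1, 1)")
    print(add_rna_instance(conn, 1, "mRNA", "geneA-mRNA", 3))
```

With a structure like this, the placeholder rna_ids from the earlier plan would not need to change later: RNA instances can be attached to the existing RNA rows once the feature information is available.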
From: Chris S. <sto...@pc...> - 2004-08-26 22:15:14
|
Hi Jinal,

Thanks for bringing this up. I have started looking into this and hope to have some plans to report soon.

Cheers,
Chris

On Aug 25, 2004, at 5:47 PM, Jinal Jhaveri wrote:

> On the other note, I remember Steve mentioning about plans to arrange a GUS
> Users Meeting in PCBI, in one of the mails. IMHO It would be great if we can
> have that as I know atleast of 3 to 4 more groups (at Virginia Bioinformatics
> Institute) are planning to use GUS and everyone is waiting for a GUS User
> Meeting like this where we can clear our doubts and become more efficient to
> handle it for our projects. I am sure other groups too will be interested in
> joining this. Any thoughts/comments on the plan ?

Chris Stoeckert, Ph.D.
Research Associate Professor, Dept. of Genetics
1415 Blockley Hall, Center for Bioinformatics
423 Guardian Dr., University of Pennsylvania
Philadelphia, PA 19104
Ph: 215-573-4409 FAX: 215-573-3111