From: Deborah P. <pi...@pc...> - 2005-07-08 17:15:58
LoadNRDB.pm is not the appropriate plugin to use to load BLAST results
into GUS. It is specifically designed to load defline and sequence
information from NCBI's non-redundant protein database files. It uses
the gitax file to get taxon information for each of the gi numbers in
the defline so as to attach a taxon_id. Therefore, LoadTaxon would have
to be run with NCBI's taxonomy files prior to running LoadNRDB.

There is at least one plugin that can be used to load BLAST results,
InsertBlastSimilarities.pm, which is one of the GUS/Supported plugins.
Take a look at that and see if it serves your purposes.

Debbie

Michael Saffitz wrote:

>Hi Debbie,
>
>Can you help with this? I think we should start with an overview of what
>LoadNRDB is doing-- specifically how it handles the gitax file. We can then
>figure out if Fabricio's issue is in the data or code.
>
>Please include gusdev in your reply-- I think it will be useful information.
>
>--Mike
>
>------ Forwarded Message
>
>>From: Fabrício <fab...@de...>
>>Date: Thu, 9 Jun 2005 16:38:44 -0300
>>To: <gus...@li...>
>>Subject: [GUSDEV] Load a Blast result using LoadNRDB
>>
>>Hello all,
>>
>>We're trying to insert a Blast result into Gus Schema using LoadNRDB
>>plug-in. Our blast result was filtered and is in a Fasta format according to
>>the plug-in requirement. The problem is the load process takes many times
>>and our postgres server process crashes after some hours, thus the plug-in
>>never concludes. I would like to know why this plug-in is so slow, if the
>>file we want to store is not so big. I think that it's due to the
>>-gitax=gi_taxid_prot.dmp parameter that has a large file.
>>
>>Does anyone could help us to understand this?
>>
>>Thanks a lot,
>>
>>Fabrício.
>
>------ End of Forwarded Message
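To make the gi-to-taxon step described above concrete, here is a minimal Python sketch. It is not LoadNRDB's code (the plugin is Perl and its internals are not shown in this thread). It assumes that gi_taxid_prot.dmp holds two whitespace-separated columns (gi, then NCBI taxid) and that NR-style deflines carry tokens of the form gi|NUMBER|. The function names and the sample data are hypothetical. It may help when checking whether a slow load comes from the data: only the gi numbers present in the FASTA file are kept, so the large mapping file is streamed once instead of being held in memory.

```python
import io
import re

# Assumption: NR-style deflines carry one or more "gi|NUMBER|" tokens.
GI_PATTERN = re.compile(r"gi\|(\d+)\|")


def gis_in_fasta(handle):
    """Collect every gi number that appears on a defline."""
    gis = set()
    for line in handle:
        if line.startswith(">"):
            gis.update(int(g) for g in GI_PATTERN.findall(line))
    return gis


def load_taxids(gitax_handle, wanted):
    """Stream a gi/taxid file and keep only the gi numbers we need."""
    mapping = {}
    for line in gitax_handle:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        gi, taxid = int(fields[0]), int(fields[1])
        if gi in wanted:
            mapping[gi] = taxid
    return mapping


# Hypothetical sample data; a real run would open the files on disk.
fasta = io.StringIO(
    ">gi|111|ref|NP_000001.1| protein A\x01gi|222|gb|AAA00002.1| protein A\n"
    "MKV\n"
    ">gi|333|sp|P00003| protein B\n"
    "MLL\n"
)
gitax = io.StringIO("111\t9606\n222\t10090\n333\t562\n444\t7227\n")

wanted = gis_in_fasta(fasta)
gi_to_taxid = load_taxids(gitax, wanted)
print(gi_to_taxid)
```

Each taxid found this way would still have to be resolved to a GUS taxon_id, which is why LoadTaxon has to be run first.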
From: Deborah P. <pi...@pc...> - 2005-07-08 17:25:28
I reread your e-mail and am guessing that your file is a file of
sequences (query or subject?) with information on the defline
pertaining to a BLAST analysis. Perhaps you can send an excerpt of your
file and a bit more explanation. Perhaps we can make some better
suggestions.

Debbie

_______________________________________________
Gusdev-gusdev mailing list
Gus...@li...
https://lists.sourceforge.net/lists/listinfo/gusdev-gusdev
From: Deborah P. <pi...@pc...> - 2005-07-08 17:41:57
And finally I see your attachment, and it looks like it is in the right
format to be loaded with LoadNRDB. However, if the defline only has a
single source_id, I think I would use LoadFastaSequences.pm to load your
sequences into dots.AASequence or one of its views. This plugin is
generic and loads your choice of sequence tables, with source_id, etc.
defined using regex. I think you will find that more satisfactory.

Debbie
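As an illustration of what "source_id defined using regex" can look like, here is a small Python sketch. It is not the LoadFastaSequences plugin and does not use its argument names, which are not given in this thread. The function name, the record layout, and the example regex are all hypothetical. The idea is that the user supplies a regular expression whose first capture group picks the source_id out of each defline.

```python
import re


def parse_fasta(lines, source_id_regex):
    """Split FASTA lines into records, taking source_id from capture group 1."""
    pattern = re.compile(source_id_regex)
    records = []
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            match = pattern.search(line)
            if match is None:
                raise ValueError(f"no source_id found in defline: {line!r}")
            current = {
                "source_id": match.group(1),
                "defline": line[1:],
                "sequence": "",
            }
            records.append(current)
        elif current is not None:
            # Sequence lines are concatenated until the next defline.
            current["sequence"] += line
    return records


# Hypothetical input: one identifier per defline, followed by a description.
example = [
    ">AB12345 hypothetical protein",
    "MKVLA",
    "GTRS",
    ">XY67890 kinase",
    "MLLP",
]

# Hypothetical regex: everything after '>' up to the first whitespace.
records = parse_fasta(example, r"^>(\S+)")
for record in records:
    print(record["source_id"], len(record["sequence"]))
```

Raising an error on a defline that does not match is a deliberate choice in this sketch: a regex that silently fails would otherwise produce rows with no source_id.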