From: Steve F. <sfi...@pc...> - 2005-02-11 18:48:37

see below

Alberto Davila wrote:

>We are doing this for Garsa (another system) .. basically we have a
>bioperl parser (Bio::SearchIO) that reads the Blast results file and
>extracts all the needed info (to the "Blast_Hit" table)... and also loads
>into a given table (eg: External_DB) all the sequences (in fasta format)
>presenting similarity with the queries... at the end we have "Blast_Hit"
>and "External_DB" populated with the same script.

wow, great. could you make a gus plugin from that?

>Regarding Interpro and Glimmer, the main problem is to know in which
>tables we should load the parsed results ?

describe the info you want to store.

steve

>Alberto
>
>On Fri, 2005-02-11 at 13:21 -0500, Y. Thomas Gan wrote:
>
>>I was going to give the same answer steve gave for interpro and gene
>>finding results.
>>
>>For loading sequences into GUS, the dilemma with option 2 is: how do you
>>know which sequence to load when you load (which is before you actually
>>have the similarity result)? One solution would be to initially load
>>complete dataset(s) but delete those without similarity after loading
>>similarity results.
>>
>>-Thomas
>>
>>On Fri, 11 Feb 2005, Steve Fischer wrote:
>>
>>>alberto-
>>>
>>>we've never loaded interpro, so there isn't a plugin.
>>>i believe plasmodb has loaded glimmer results, though i'm not sure. i have
>>>asked a plasmodb developer to answer that question.
>>>
>>>steve
>>>
>>>Alberto Davila wrote:
>>>
>>>>Hey Steve, Thomas,
>>>>
>>>>Thanks a lot for the tips, really helpful.. now, a few more questions:
>>>>
>>>>>ok. NR = NRDB
>>>>>
>>>>>the way we have used gus with similarities is that both the query and
>>>>>subject are loaded into gus. As thomas explained, the similarity table
>>>>>captures similarity between sequences that are in gus.
>>>>>
>>>>>our approach has always been to just load (warehouse) the entire subject
>>>>>database (NR, EST) that we are blasting against.
>>>>>
>>>>>the current plugins and blastSimilarity are set up for this.
>>>>>
>>>>>obviously, this takes a lot of disk space. two major efficiencies that we
>>>>>don't currently have plugins for would be:
>>>>>  1. to only store in gus a *reference* to the external sequence (ie,
>>>>>     don't store the actgs).
>>>>>  2. only store in gus the sequences that actually have similarities
>>>>
>>>>Option 2 sounds better for us, since we will be blasting against several
>>>>databases (> 10GB databases)
>>>>
>>>>What about the plugins to load Interpro and "gene finder" (glimmer, etc)
>>>>results ? Is there any at all ?
>>>>
>>>>Cheers, Alberto
>>>>
>>>>>steve
>>>>>
>>>>>Alberto Davila wrote:
>>>>>
>>>>>>All the blastable databases I mentioned are standard databases from NCBI
>>>>>>(ftp://ftp.ncbi.nlm.nih.gov/blast/db/blastdb.txt):
>>>>>>
>>>>>>NT = nucleotides
>>>>>>
>>>>>>~30000 entries from genbank (genbank format) are loaded into GUS now.
>>>>>>
>>>>>>Not sure about your "NRDB", I know NR from NCBI that is a collection of
>>>>>>amino acid entries, could it be the same ?
>>>>>>
>>>>>>Alberto
>>>>>>
>>>>>>On Fri, 2005-02-11 at 10:43 -0500, Steve Fischer wrote:
>>>>>>
>>>>>>>(what is NT?)
>>>>>>>
>>>>>>>which of these (genbank, your fasta, NRDB, NT, EST) have you loaded
>>>>>>>into gus?
>>>>>>>
>>>>>>>steve
>>>>>>>
>>>>>>>Alberto Davila wrote:
>>>>>>>
>>>>>>>>Query:
>>>>>>>>
>>>>>>>>Either sequences from genbank (genbank format) or sequences
>>>>>>>>generated in the lab (fasta format)
>>>>>>>>
>>>>>>>>Blastable databases (all are formatted databases from NCBI):
>>>>>>>>
>>>>>>>>NR
>>>>>>>>NT
>>>>>>>>EST
>>>>>>>>
>>>>>>>>Alberto
>>>>>>>>
>>>>>>>>On Fri, 2005-02-11 at 10:34 -0500, Steve Fischer wrote:
>>>>>>>>
>>>>>>>>>for the blast, what are the query sequences and what are the
>>>>>>>>>blastable databases?
>>>>>>>>>
>>>>>>>>>steve
>>>>>>>>>
>>>>>>>>>Alberto Davila wrote:
>>>>>>>>>
>>>>>>>>>>Basically we will use sequences (loaded into GUS with the
>>>>>>>>>>GBParser) for NCBI Blast (Blastx, Blastp and TBlastX), the same
>>>>>>>>>>sequences will be also used for Interpro analyses. Results of
>>>>>>>>>>both (Blast and Interpro) will be loaded into GUS. We will parse
>>>>>>>>>>specific things from the Blast results, I would say:
>>>>>>>>>>
>>>>>>>>>>`Gi`
>>>>>>>>>>`Accession`
>>>>>>>>>>`Description`
>>>>>>>>>>`E_value`
>>>>>>>>>>`Score`
>>>>>>>>>>`Length`
>>>>>>>>>>`Frame_Query`
>>>>>>>>>>`Frame_Hit`
>>>>>>>>>>`Identical`
>>>>>>>>>>`Hsp_Frac_Identical`
>>>>>>>>>>`Conserved`
>>>>>>>>>>`Hsp_Frac_Conserved`
>>>>>>>>>>`Query_Start`
>>>>>>>>>>`Query_End`
>>>>>>>>>>`Hit_Start`
>>>>>>>>>>`Hit_End`
>>>>>>>>>>`Hsp_Align`
>>>>>>>>>>`database_letters`
>>>>>>>>>>`database_entries`
>>>>>>>>>>
>>>>>>>>>>We already have a Bioperl parser for that (specific for another
>>>>>>>>>>system: GARSA) that could be adapted to GUS, problem being we are
>>>>>>>>>>not sure what tables should be used to store those data in GUS.
>>>>>>>>>>
>>>>>>>>>>Cheers, Alberto
>>>>>>>>>>
>>>>>>>>>>On Fri, 2005-02-11 at 10:06 -0500, Steve Fischer wrote:
>>>>>>>>>>
>>>>>>>>>>>what are you planning on blasting?
>>>>>>>>>>>
>>>>>>>>>>>steve
>>>>>>>>>>>
>>>>>>>>>>>Alberto Davila wrote:
>>>>>>>>>>>
>>>>>>>>>>>>Hi Steve,
>>>>>>>>>>>>
>>>>>>>>>>>>On Fri, 2005-02-11 at 08:56 -0500, Steve Fischer wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>>poliana-
>>>>>>>>>>>>>
>>>>>>>>>>>>>oops, the usage statement for LoadBlastSimFast is out of date.
>>>>>>>>>>>>>it should instruct you to use the blastSimilarity command.
>>>>>>>>>>>>>
>>>>>>>>>>>>>LoadBlastSimFast makes a big assumption, that the subject and
>>>>>>>>>>>>>query sequences are in GUS, and their def. lines have GUS
>>>>>>>>>>>>>primary keys.
>>>>>>>>>>>>>
>>>>>>>>>>>>>Are your sequences already loaded into GUS?
>>>>>>>>>>>>
>>>>>>>>>>>>They are not, would there be any howto/tips for that plugin ?
>>>>>>>>>>>>We will certainly need a plugin to load "Interpro" and "ORF
>>>>>>>>>>>>finding" results into GUS... If they are not available, then
>>>>>>>>>>>>maybe we will have to write them ...
>>>>>>>>>>>>
>>>>>>>>>>>>Cheers, Alberto
>>>>>>>>>>>>
>>>>>>>>>>>>>steve
>>>>>>>>>>>>>
>>>>>>>>>>>>>Poliana Mateus wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>>Hello all,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>Where can I find the script parseBlastFilesForSimilarity.pl??
>>>>>>>>>>>>>>I'm trying to run LoadBlastSimFast...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>Poliana

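The caveat quoted in this thread, that LoadBlastSimFast expects the def. lines of query and subject sequences to carry GUS primary keys, can be checked before a load. The following is a minimal sketch only: it assumes the primary key is the first whitespace-delimited token of the def. line and is purely numeric, which the thread does not specify, and `deflines_missing_primary_key` is a hypothetical helper, not part of GUS.

```python
def deflines_missing_primary_key(fasta_lines):
    """Return the def. lines whose first token is not an integer id.

    Assumption (not confirmed by the thread): a usable def. line looks
    like '>12345 optional description', with the GUS primary key first.
    """
    bad = []
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # sequence residues, not a def. line
        tokens = line[1:].split()
        if not tokens or not tokens[0].isdigit():
            bad.append(line.rstrip("\n"))
    return bad


# Hypothetical input: one sequence dumped from GUS, one taken straight from NCBI.
sample = [
    ">12345 dumped from GUS",
    "ACGTACGT",
    ">gi|555|gb|X1.1| not loaded into GUS",
    "ACGTACGT",
]
bad = deflines_missing_primary_key(sample)
print(bad)
```

A non-empty result would suggest the file was not produced from sequences already stored in GUS, which is the situation Alberto describes.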
From: Alberto D. <da...@io...> - 2005-02-11 18:45:05

We are doing this for Garsa (another system) .. basically we have a
bioperl parser (Bio::SearchIO) that reads the Blast results file and
extracts all the needed info (to the "Blast_Hit" table)... and also loads
into a given table (eg: External_DB) all the sequences (in fasta format)
presenting similarity with the queries... at the end we have "Blast_Hit"
and "External_DB" populated with the same script.

Regarding Interpro and Glimmer, the main problem is to know in which
tables we should load the parsed results ?

Alberto

On Fri, 2005-02-11 at 13:21 -0500, Y. Thomas Gan wrote:
> I was going to give the same answer steve gave for interpro and gene
> finding results.
>
> For loading sequences into GUS, the dilemma with option 2 is: how do you
> know which sequence to load when you load (which is before you actually
> have the similarity result)? One solution would be to initially load
> complete dataset(s) but delete those without similarity after loading
> similarity results.
>
> -Thomas

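The single-pass approach Alberto describes (parse the BLAST results, record the hits, and collect the subject sequences that showed similarity) can be sketched as follows. This is not the GARSA parser: it assumes tabular BLAST output (the 12-column format written by `-m 8` in legacy blastall or `-outfmt 6` in BLAST+) instead of the full report that Bio::SearchIO reads, and the field names in `COLUMNS` are illustrative choices rather than GUS or GARSA column names.

```python
import csv
import io

# Standard column order of 12-column tabular BLAST output.
COLUMNS = [
    "query_id", "hit_id", "pct_identity", "length", "mismatches", "gap_opens",
    "query_start", "query_end", "hit_start", "hit_end", "e_value", "score",
]
INT_FIELDS = ("length", "mismatches", "gap_opens",
              "query_start", "query_end", "hit_start", "hit_end")
FLOAT_FIELDS = ("pct_identity", "e_value", "score")


def parse_tabular_blast(handle):
    """Turn tabular BLAST lines into a list of dicts, one per HSP."""
    hits = []
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        record = dict(zip(COLUMNS, row))
        for key in INT_FIELDS:
            record[key] = int(record[key])
        for key in FLOAT_FIELDS:
            record[key] = float(record[key])
        hits.append(record)
    return hits


# Two made-up hits for one query.
sample = (
    "q1\tgi|123\t98.50\t200\t3\t0\t1\t200\t5\t204\t1e-50\t380\n"
    "q1\tgi|456\t75.00\t120\t30\t0\t10\t129\t1\t120\t2e-10\t95.1\n"
)
hits = parse_tabular_blast(io.StringIO(sample))

# The subject ids seen here are the only sequences that would need storing.
subject_ids = sorted({hit["hit_id"] for hit in hits})
print(subject_ids)
```

Collecting `subject_ids` while parsing is what lets one script populate both a hit table and a table of matching sequences.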
From: Steve F. <sfi...@pc...> - 2005-02-11 18:35:29

I think the way to do it would be to make a new plugin
LoadSimilaritySequences.

It would be like LoadBlastSimFast and LoadBlastSimilaritiesPK, in that it
would read the output of blastSimilarity. But, unlike them, the subject
sequences would have source_ids not na_sequence_ids. (the query sequences
would still be stored in gus and extracted using dumpSequencesFromTable)

the plugin would:
 - take as an argument the ExternalDatabase and its Version (eg, NRDB 1.3)
 - call the plugin superclass's getExtDbRelId() to get the
   external_database_release_id.
 - use that id to query to get all (source_id, na_sequence_id) pairs that
   exist for that external database release
 - put that in a hash, with source_id as key
 - take as an argument the file holding the similarities
 - optionally take as an argument a fasta.gz file holding the subject
   database.
 - run through all the similarities in the input file.
 - if a subject sequence is not already in the db (not found in hash), add
   it (optionally including the actgs if the fasta file is provided)
 - then, use that sequence's na_sequence_id to form the Similarity

steve

Y. Thomas Gan wrote:

> I was going to give the same answer steve gave for interpro and gene
> finding results.
>
> For loading sequences into GUS, the dilemma with option 2 is: how do
> you know which sequence to load when you load (which is before you
> actually have the similarity result)? One solution would be to
> initially load complete dataset(s) but delete those without similarity
> after loading similarity results.
>
> -Thomas

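The core loop of the plugin Steve proposes can be sketched in a few lines. This is an illustration under stated assumptions, not GUS code: `load_similarities` and its arguments are hypothetical, plain dicts stand in for the database and for the optional fasta file, and new `na_sequence_id` values are simply counted up from `next_id`.

```python
def load_similarities(similarities, known_ids, subject_fasta=None, next_id=1):
    """Resolve subject source_ids to na_sequence_ids, adding missing subjects.

    similarities  -- iterable of (query_na_sequence_id, subject_source_id, score)
    known_ids     -- dict source_id -> na_sequence_id for sequences already
                     stored for this external database release (updated in place)
    subject_fasta -- optional dict source_id -> residues; when absent, a new
                     subject is recorded without its actgs
    Returns (similarity_rows, added), where added maps each newly created
    na_sequence_id to its residues (or None).
    """
    rows = []
    added = {}
    for query_id, source_id, score in similarities:
        if source_id not in known_ids:
            # Subject not in the db yet: create it, with residues if available.
            residues = subject_fasta.get(source_id) if subject_fasta else None
            known_ids[source_id] = next_id
            added[next_id] = residues
            next_id += 1
        rows.append((query_id, known_ids[source_id], score))
    return rows, added


# Made-up data: one subject already loaded, one seen for the first time.
known = {"gi|111": 10}
sims = [(1, "gi|111", 250.0), (1, "gi|222", 90.5), (2, "gi|222", 80.0)]
rows, added = load_similarities(sims, known, {"gi|222": "MKV"}, next_id=11)
print(rows)
print(added)
```

Because the hash is updated as subjects are added, a subject that appears in several similarities is stored once, which is the disk-space saving the thread is after.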
From: Y. T. G. <yon...@pc...> - 2005-02-11 18:21:55

I was going to give the same answer steve gave for interpro and gene
finding results.

For loading sequences into GUS, the dilemma with option 2 is: how do you
know which sequence to load when you load (which is before you actually
have the similarity result)? One solution would be to initially load
complete dataset(s) but delete those without similarity after loading
similarity results.

-Thomas

On Fri, 11 Feb 2005, Steve Fischer wrote:

> alberto-
>
> we've never loaded interpro, so there isn't a plugin.
> i believe plasmodb has loaded glimmer results, though i'm not sure. i have
> asked a plasmodb developer to answer that question.
>
> steve

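Thomas's alternative (load the complete subject dataset first, then delete the sequences that ended up with no similarity) amounts to a single DELETE once the similarities are in. A minimal sketch, assuming an in-memory SQLite database and two hypothetical tables, `subject_sequence` and `similarity`, which are simplified stand-ins and not the real GUS schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE subject_sequence (
        na_sequence_id INTEGER PRIMARY KEY,
        source_id      TEXT NOT NULL
    );
    CREATE TABLE similarity (
        query_id   INTEGER NOT NULL,
        subject_id INTEGER NOT NULL
    );
""")

# Step 1: load the complete subject dataset (three made-up sequences).
conn.executemany(
    "INSERT INTO subject_sequence VALUES (?, ?)",
    [(1, "subjA"), (2, "subjB"), (3, "subjC")],
)

# Step 2: load the similarity results; subjB has no hit.
conn.executemany("INSERT INTO similarity VALUES (?, ?)", [(100, 1), (101, 3)])

# Step 3: prune every subject sequence that no similarity refers to.
deleted = conn.execute(
    "DELETE FROM subject_sequence "
    "WHERE na_sequence_id NOT IN (SELECT subject_id FROM similarity)"
).rowcount

remaining = [
    row[0]
    for row in conn.execute(
        "SELECT source_id FROM subject_sequence ORDER BY na_sequence_id"
    )
]
print(deleted, remaining)
```

The trade-off is that the full dataset still has to fit in the database temporarily, which matters for the >10GB databases mentioned in the thread.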
From: Steve F. <sfi...@pc...> - 2005-02-11 18:10:16

alberto-

we've never loaded interpro, so there isn't a plugin.

i believe plasmodb has loaded glimmer results, though i'm not sure. i
have asked a plasmodb developer to answer that question.

steve

Alberto Davila wrote:

>Hey Steve, Thomas,
>
>Thanks a lot for the tips, really helpful.. now, a few more questions:
>
>>ok. NR = NRDB
>>
>>the way we have used gus with similarities is that both the query and
>>subject are loaded into gus. As thomas explained, the similarity table
>>captures similarity between sequences that are in gus.
>>
>>our approach has always been to just load (warehouse) the entire subject
>>database (NR, EST) that we are blasting against.
>>
>>the current plugins and blastSimilarity are set up for this.
>>
>>obviously, this takes a lot of disk space. two major efficiencies that
>>we don't currently have plugins for would be:
>>  1. to only store in gus a *reference* to the external sequence (ie,
>>     don't store the actgs).
>>  2. only store in gus the sequences that actually have similarities
>
>Option 2 sounds better for us, since we will be blasting against several
>databases (> 10GB databases)
>
>What about the plugins to load Interpro and "gene finder" (glimmer, etc)
>results ? Is there any at all ?
>
>Cheers, Alberto

From: Alberto D. <da...@io...> - 2005-02-11 17:36:17
|
Hey Steve, Thomas,

Thanks a lot for the tips, really helpful. Now, a few more questions:

> obviously, this takes a lot of disk space. two major efficiencies that
> we don't currently have plugins for would be:
> 1. to only store in gus a *reference* to the external sequence (ie,
> don't store the actgs).
> 2. only store in gus the sequences that actually have similarities

Option 2 sounds better for us, since we will be blasting against several databases (> 10GB databases).

What about plugins to load Interpro and "gene finder" (glimmer, etc.) results? Are there any at all?

Cheers, Alberto
|
From: Y. T. G. <yon...@pc...> - 2005-02-11 15:57:57
|
Hi Alberto:

When doing similarity analyses (BLAST, BLAT) with GUS, here is the process we usually follow:

1: load the query sequence set into GUS if it is not already there (each sequence will get a GUS id); do the same for the target sequence set
2: dump from GUS to get a query fasta file and a target fasta file, using the script $GUS_HOME/bin/dumpSequencesFromTable.pl
3: run blast on a computer cluster using the DistribJob software
4: run the load plugin to load the results into the similarity table (DoTS.BlastSimilarity for BLAST, DoTS.BLATAlignment for BLAT alignments)

In your case, I think you will need to load NR and NT as separate datasets into GUS (EST should already be in GUS).

-Thomas

On Fri, 11 Feb 2005, Alberto Davila wrote:
> ~30000 entries from genbank (genbank format) are loaded into GUS now.
|
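Editor's sketch of the round trip in steps 1, 2 and 4 above, in Python. It assumes that the dumped FASTA deflines carry nothing but the numeric GUS id and that hits arrive as tab-separated `query`, `subject`, `score` lines. Both formats are assumptions made for illustration; the thread does not show the real output of dumpSequencesFromTable.pl or blastSimilarity, and the function names are hypothetical.

```python
# Sketch only: defline and hit formats are assumed, not taken from GUS itself.

def to_fasta(seqs):
    """Write {gus_id: sequence} as FASTA whose defline is the bare GUS id."""
    return "".join(f">{gus_id}\n{seq}\n" for gus_id, seq in sorted(seqs.items()))


def hits_to_rows(tabular_hits, known_ids):
    """Turn 'query<TAB>subject<TAB>score' lines into (query_id, subject_id, score).

    A hit is rejected when either id is not in known_ids, mirroring the rule
    that both sequences must already be loaded before similarities are stored.
    """
    rows = []
    for line in tabular_hits.splitlines():
        if not line.strip():
            continue
        query, subject, score = line.split("\t")
        q_id, s_id = int(query), int(subject)
        if q_id not in known_ids or s_id not in known_ids:
            raise ValueError(f"sequence not loaded: {line!r}")
        rows.append((q_id, s_id, float(score)))
    return rows


# Hypothetical ids standing in for primary keys assigned at load time.
queries = {101: "ATGGCC", 102: "ATGTTT"}
targets = {9001: "ATGGCCAAA"}

fasta = to_fasta(queries)
rows = hits_to_rows("101\t9001\t42.5\n", set(queries) | set(targets))
```

Because the ids in the deflines are the database keys, the loader never has to look a sequence up by name or accession; it only has to check that the key exists.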
From: Steve F. <sfi...@pc...> - 2005-02-11 15:54:54
|
ok. NR = NRDB

the way we have used gus with similarities is that both the query and subject are loaded into gus. As thomas explained, the similarity table captures similarity between sequences that are in gus.

our approach has always been to just load (warehouse) the entire subject database (NR, EST) that we are blasting against.

the current plugins and blastSimilarity are set up for this.

obviously, this takes a lot of disk space. two major efficiencies that we don't currently have plugins for would be:
 1. to only store in gus a *reference* to the external sequence (ie, don't store the actgs).
 2. only store in gus the sequences that actually have similarities

steve

Alberto Davila wrote:
> Not sure about your "NRDB", I know NR from NCBI that is a collection of
> aminoacid entries, could it be the same ?
|
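Editor's sketch of the second efficiency (keep only the subject sequences that actually have similarities), in Python. This is not an existing GUS plugin. It assumes the subject database is available as FASTA text and that the set of subject ids with hits has already been collected from the BLAST results; the function names and sample ids are hypothetical.

```python
def read_fasta(text):
    """Yield (identifier, sequence) pairs from FASTA text."""
    name, chunks = None, []
    for line in text.splitlines():
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # The id is taken to be the first whitespace-delimited defline token.
            parts = line[1:].split()
            name, chunks = (parts[0] if parts else ""), []
        elif line.strip():
            chunks.append(line.strip())
    if name is not None:
        yield name, "".join(chunks)


def subjects_with_hits(fasta_text, hit_ids):
    """Return only the subject sequences whose id appears among the hits."""
    return {name: seq for name, seq in read_fasta(fasta_text) if name in hit_ids}


subject_db = ">sp1 protein one\nMKV\nLLA\n>sp2 protein two\nMSS\n>sp3\nMTT\n"
kept = subjects_with_hits(subject_db, {"sp1", "sp3"})
```

The catch Thomas raises elsewhere in the thread still applies: the hit ids are only known after the BLAST run, so this filter has to run between the search and the load.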
From: Alberto D. <da...@io...> - 2005-02-11 15:49:00
|
All the blastable databases I mentioned are standard databases from NCBI (ftp://ftp.ncbi.nlm.nih.gov/blast/db/blastdb.txt):

NT = nucleotides

~30000 entries from genbank (genbank format) are loaded into GUS now.

Not sure about your "NRDB". I know NR from NCBI, which is a collection of amino acid entries; could it be the same?

Alberto

On Fri, 2005-02-11 at 10:43 -0500, Steve Fischer wrote:
> (what is NT?)
>
> which of these (genbank, your fasta, NRDB, NT, EST) have you loaded into
> gus?
|
From: Steve F. <sfi...@pc...> - 2005-02-11 15:41:33
|
(what is NT?)

which of these (genbank, your fasta, NRDB, NT, EST) have you loaded into gus?

steve
|
From: Alberto D. <da...@io...> - 2005-02-11 15:35:08
|
Query:

Either sequences from genbank (genbank format) or sequences generated in the lab (fasta format)

Blastable databases (all are formatted databases from NCBI):

NR
NT
EST

Alberto

On Fri, 2005-02-11 at 10:34 -0500, Steve Fischer wrote:
> for the blast, what are the query sequences and what are the blastable
> databases?
|
From: Steve F. <sfi...@pc...> - 2005-02-11 15:32:14
|
for the blast, what are the query sequences and what are the blastable databases?

steve
|
From: Alberto D. <da...@io...> - 2005-02-11 15:16:56
|
Basically we will use sequences (loaded into GUS with the GBParser) for NCBI Blast (Blastx, Blastp and TBlastX); the same sequences will also be used for Interpro analyses. Results of both (Blast and Interpro) will be loaded into GUS. We will parse specific fields from the Blast results, I would say:

`Gi`
`Accession`
`Description`
`E_value`
`Score`
`Length`
`Frame_Query`
`Frame_Hit`
`Identical`
`Hsp_Frac_Identical`
`Conserved`
`Hsp_Frac_Conserved`
`Query_Start`
`Query_End`
`Hit_Start`
`Hit_End`
`Hsp_Align`
`database_letters`
`database_entries`

We already have a Bioperl parser for that (specific to another system, GARSA) that could be adapted to GUS; the problem is that we are not sure which tables should be used to store those data in GUS.

Cheers, Alberto

On Fri, 2005-02-11 at 10:06 -0500, Steve Fischer wrote:
> what are you planning on blasting?
|
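Editor's sketch of how a subset of the fields listed above could be held in one record per HSP, in Python. The class name, the `align_length` field and the way the two fractions are derived are assumptions made for illustration; the thread does not say how GARSA or Bioperl compute `Hsp_Frac_Identical` and `Hsp_Frac_Conserved`, and this is not a GUS table definition.

```python
from dataclasses import dataclass


@dataclass
class BlastHsp:
    """One high-scoring pair, with field names borrowed from the list above."""
    accession: str
    e_value: float
    score: float
    identical: int      # number of identical positions
    conserved: int      # number of conserved (positive) positions
    align_length: int   # assumed: length of the alignment, gaps included
    query_start: int
    query_end: int
    hit_start: int
    hit_end: int

    @property
    def frac_identical(self):
        # Assumed definition: identical positions over alignment length.
        return self.identical / self.align_length

    @property
    def frac_conserved(self):
        # Assumed definition: conserved positions over alignment length.
        return self.conserved / self.align_length


# Made-up values, only to show the record in use.
hsp = BlastHsp("P12345", 1e-30, 250.0, 80, 90, 100, 1, 100, 5, 104)
```

Storing the counts and deriving the fractions keeps the record consistent, since the two fractions can never disagree with `identical` and `conserved`.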
From: Y. T. G. <yon...@pc...> - 2005-02-11 15:08:33
|
A couple more comments.

The "big assumption" (which also applies to the LoadBLATAlignments plugin) might seem restrictive and counterintuitive at first (and is thus easily assumed otherwise), and it imposes more work if you just want to experiment with your datasets (which could be heterogeneous). But think about it: this is a necessary safeguard to protect data integrity in GUS. It is dictated by the foreign key constraints (query sequence and subject sequence) on the similarity table. In practice, this forces you to think carefully about your datasets (e.g. how to organize them if you have gene trap tags from half a dozen sources and you want to align them all to the genome).

-Thomas

On Fri, 11 Feb 2005, Steve Fischer wrote:
> LoadBlastSimFast makes a big assumption, that the subject and query sequences
> are in GUS, and their def. lines have GUS primary keys.
|
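Editor's sketch of the foreign key safeguard described above, using Python's built-in `sqlite3` module. The two tables are a deliberately tiny, hypothetical stand-in and not the real GUS schema; the point is only that a similarity row cannot be stored unless both sequences it refers to are already in the database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE sequence (sequence_id INTEGER PRIMARY KEY, residues TEXT)"
)
conn.execute(
    """
    CREATE TABLE similarity (
        similarity_id INTEGER PRIMARY KEY,
        query_id   INTEGER NOT NULL REFERENCES sequence(sequence_id),
        subject_id INTEGER NOT NULL REFERENCES sequence(sequence_id),
        score REAL
    )
    """
)

# Both sequences are loaded first, so this similarity is accepted.
conn.execute("INSERT INTO sequence VALUES (1, 'ATGGCC')")
conn.execute("INSERT INTO sequence VALUES (2, 'ATGGCA')")
conn.execute(
    "INSERT INTO similarity (query_id, subject_id, score) VALUES (1, 2, 42.0)"
)

# Subject 999 was never loaded, so the database refuses the row.
try:
    conn.execute(
        "INSERT INTO similarity (query_id, subject_id, score) VALUES (1, 999, 10.0)"
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

stored = conn.execute("SELECT COUNT(*) FROM similarity").fetchone()[0]
```

This is why the load has to happen in the order Thomas describes: sequences first, similarities afterwards.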
From: Steve F. <sfi...@pc...> - 2005-02-11 15:05:00
|
what are you planning on blasting?

steve

Alberto Davila wrote:
> They are not, there would be any howto/tips for that plugin ? We will
> certainly need a plugin to load "Interpro" and "ORF finding" results
> into GUS...
|
From: Alberto D. <da...@io...> - 2005-02-11 14:52:02
|
Hi Steve,

On Fri, 2005-02-11 at 08:56 -0500, Steve Fischer wrote:
> LoadBlastSimFast makes a big assumption, that the subject and query
> sequences are in GUS, and their def. lines have GUS primary keys.
>
> Are your sequences already loaded into GUS?

They are not. Is there any howto or tips for that plugin? We will certainly need plugins to load "Interpro" and "ORF finding" results into GUS... If they are not available, then maybe we will have to write them ...

Cheers, Alberto
|
From: Steve F. <sfi...@pc...> - 2005-02-11 13:54:29
|
poliana-

oops, the usage statement for LoadBlastSimFast is out of date. it
should instruct you to use the blastSimilarity command.

LoadBlastSimFast makes a big assumption, that the subject and query
sequences are in GUS, and their def. lines have GUS primary keys.

Are your sequences already loaded into GUS?

steve

Poliana Mateus wrote:

>Hello all,
>
>Where can I find the script parseBlastFilesForSimilarity.pl??
>I'm trying to run LoadBlastSimFast...
>
>Poliana
|
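The assumption Steve describes (def. lines carrying GUS primary keys) can be checked before loading. The following is only a sketch under a stated assumption: it supposes the primary key is a plain integer and is the first token on the def. line, which the thread does not confirm. The function name and the sample records are hypothetical.

```python
import re

# Assumption (not confirmed in the thread): the GUS primary key is an integer
# and is the first token after '>' on each def. line.
PK_DEFLINE = re.compile(r"^>(\d+)(\s|$)")


def deflines_without_pk(fasta_lines):
    """Return the def. lines that do not start with an integer key."""
    bad = []
    for line in fasta_lines:
        if line.startswith(">") and not PK_DEFLINE.match(line):
            bad.append(line.rstrip("\n"))
    return bad


# Hypothetical sample: one def. line with a numeric key, one NCBI-style.
sample = [
    ">12345 hypothetical protein\n",
    "MKTAYIAKQR\n",
    ">gi|4567|ref|NP_000001.1| some protein\n",
    "MSTNPKPQRK\n",
]

for defline in deflines_without_pk(sample):
    print("no leading integer key:", defline)
```

If the list comes back non-empty, the sequences were probably not exported from GUS, and the loader's assumption would not hold for them.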
From: Poliana M. <pol...@gm...> - 2005-02-11 12:54:43
|
Hello all,

Where can I find the script parseBlastFilesForSimilarity.pl??
I'm trying to run LoadBlastSimFast...

Poliana
|
From: Poliana M. <pol...@gm...> - 2005-02-11 12:52:06
|
From: Poliana M. <pol...@gm...> - 2005-02-11 12:33:41
|
Hello all,

Where can I find the script parseBlastFilesForSimilarity.pl??
I'm trying to run LoadBlastSimFast...

Poliana
|
From: Steve F. <sfi...@pc...> - 2005-02-10 21:49:28
|
try ga +meta --commit

steve

fab...@de... wrote:

>Hi all,
>
>I really don't know why, whenever I try to run the 'ga +create
>GUS::PluginMgr::GusApplication --commit' command, I always receive the error:
>
>USER ERROR: GUS::PluginMgr::GusApplication has never been registered.
>Please use 'ga +create GUS::PluginMgr::GusApplication --commit'
>
>Use the same command again??? I did, and the error remains!!! :(
>
>I've already run "ga +meta +create" (no problem) and after that I'm trying to
>run the command above!!! Do you know why this is happening??
>
>Thanks a lot!!
>
>Fabricio
>
>-------------------------------------------------
>This mail sent through IMP: http://horde.org/imp/
|
From: <fab...@de...> - 2005-02-10 19:06:56
|
Hi all,

I really don't know why, whenever I try to run the 'ga +create
GUS::PluginMgr::GusApplication --commit' command, I always receive the error:

USER ERROR: GUS::PluginMgr::GusApplication has never been registered.
Please use 'ga +create GUS::PluginMgr::GusApplication --commit'

Use the same command again??? I did, and the error remains!!! :(

I've already run "ga +meta +create" (no problem) and after that I'm trying to
run the command above!!! Do you know why this is happening??

Thanks a lot!!

Fabricio

-------------------------------------------------
This mail sent through IMP: http://horde.org/imp/
|
From: Ed R. <ero...@ug...> - 2005-02-10 17:12:47
|
I'm loading the Sequence ontology table, and I am getting a constraint
violation on a table in SRESVER.SEQUENCEONTOLOGYVER

Does anyone have any idea where this is coming from? I can't find any
triggers or cross schema constraints involving these tables.

Here's the stack trace

DBD::Oracle::st execute failed: ORA-00001: unique constraint (SRESVER.PK_SEQUENCEONTOLOGYVER) violated (DBD ERROR: OCIStmtExecute) at /var/local/GUS/gus_home/lib/perl/GUS/ObjRelP/DbiDbHandle.pm line 147, <GEN0> line 15.

SQL ERROR!! involving

Values: 43 at /var/local/GUS/gus_home/lib/perl/GUS/ObjRelP/DbiDbHandle.pm line 187
    GUS::ObjRelP::DbiDbHandle::death('GUS::ObjRelP::DbiDbHandle=HASH(0x8614aa8)', '^J SQL ERROR!! involving^J ^J Values: 43') called at /var/local/GUS/gus_home/lib/perl/GUS/ObjRelP/DbiDbHandle.pm line 150
    GUS::ObjRelP::DbiDbHandle::sqlExec('GUS::ObjRelP::DbiDbHandle=HASH(0x8614aa8)', 'GUS::ObjRelP::DbiDbHandle::st=HASH(0x88e2670)', 'ARRAY(0x88f8af4)') called at /var/local/GUS/gus_home/lib/perl/GUS/Model/GusRow.pm line 1798
    GUS::Model::GusRow::version('GUS::Model::SRes::SequenceOntology=HASH(0x8841364)') called at /var/local/GUS/gus_home/lib/perl/GUS/Model/GusRow.pm line 1665
    GUS::Model::GusRow::submit('GUS::Model::SRes::SequenceOntology=HASH(0x8841364)') called at /var/local/GUS/gus_home/lib/perl/GUS/Common/Plugin/LoadSODefinitions.pm line 70
    GUS::Common::Plugin::LoadSODefinitions::run('GUS::Common::Plugin::LoadSODefinitions=HASH(0x8530354)', 'HASH(0x88412a4)') called at /var/local/GUS/gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 453
    eval {...} called at /var/local/GUS/gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 450
    GUS::PluginMgr::GusApplication::doMajorMode_RunOrReport('GUS::PluginMgr::GusApplication=HASH(0x80fbb4c)', 'GUS::Common::Plugin::LoadSODefinitions', 1) called at /var/local/GUS/gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 383
    GUS::PluginMgr::GusApplication::doMajorMode_Run('GUS::PluginMgr::GusApplication=HASH(0x80fbb4c)', 'GUS::Common::Plugin::LoadSODefinitions') called at /var/local/GUS/gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 289
    GUS::PluginMgr::GusApplication::doMajorMode('GUS::PluginMgr::GusApplication=HASH(0x80fbb4c)', 'GUS::Common::Plugin::LoadSODefinitions') called at /var/local/GUS/gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 198
    GUS::PluginMgr::GusApplication::parseAndRun('GUS::PluginMgr::GusApplication=HASH(0x80fbb4c)', 'ARRAY(0x81051b0)') called at /var/local/GUS/gus_home/bin/ga line 11

-----------------
Ed Robinson
Center for Tropical and Emerging Global Diseases
University of Georgia, Athens, GA 30602
ero...@ug.../(706)542.1447/254.8883
|
From: <fed...@bi...> - 2005-02-10 14:07:54
|
Ok! As Poliana also suggested, I inserted values into my core.userinfo table;
the problems were because my table was empty! Now if I type 'select' I can see
what I inserted.
I inserted random values, except for login and password; now my questions
are:
1) is it right that my tables were empty? In the documentation I read that only
core.machine is empty;
2) is it a problem that I inserted some values randomly, like row_user_id? If it
is a problem, where can I find the right values?
Thanks everyone!
Federica

> Hi Federica,
>
> It sounds like the row doesn't exist... Did you do the preliminary
> insert?
>
> select count(*) from core.userinfo;
>
> This should be >= 1. If not, refer back to the documentation about
> populating these tables.
>
> --Mike
>
> fed...@bi... wrote:
>> Thanks Poliana,
>> I tried as you suggest; no error appeared but I always got the same
>> reply:
>> '0 rows updated'.
>> I connect to Oracle as 'sys/password as sysdba'.
>> Any other ideas about how to solve the problem?
>> Thanks
>> Federica
>>
>>>Hi Federica,
>>>
>>>Try
>>>
>>>UPDATE core.userinfo SET login='your_login', password='your_password'
>>>WHERE user_id = 1;
>>>
>>>Poliana
>>>
>>>On Wed, 9 Feb 2005 18:32:14 +0100 (CET), fed...@bi...
>>><fed...@bi...> wrote:
>>>
>>>>Hi guys, I'm Federica! Some months ago I had some problems with GUS
>>>>installation; then I had to perform other tasks, and after that I
>>>>decided
>>>>to delete everything and restart all!!
>>>>I have got a new problem. I've never used sql before: I can't change
>>>>the
>>>>login, password and group in the core.userinfo, core.groupinfo,
>>>>core.projectinfo tables! For example when I try to select or update
>>>>some
>>>>rows of core.userinfo table the reply is always: 0 rows
>>>>selected/updated.
>>>>Can anyone help me?
>>>>Thanks, Federica
|
From: Michael S. <msa...@pc...> - 2005-02-10 13:02:29
|
Hi Federica,

It sounds like the row doesn't exist... Did you do the preliminary insert?

select count(*) from core.userinfo;

This should be >= 1. If not, refer back to the documentation about
populating these tables.

--Mike

fed...@bi... wrote:
> Thanks Poliana,
> I tried as you suggest; no error appeared but I always got the same reply:
> '0 rows updated'.
> I connect to Oracle as 'sys/password as sysdba'.
> Any other ideas about how to solve the problem?
> Thanks
> Federica
>
>>Hi Federica,
>>
>>Try
>>
>>UPDATE core.userinfo SET login='your_login', password='your_password'
>>WHERE user_id = 1;
>>
>>Poliana
>>
>>On Wed, 9 Feb 2005 18:32:14 +0100 (CET), fed...@bi...
>><fed...@bi...> wrote:
>>
>>>Hi guys, I'm Federica! Some months ago I had some problems with GUS
>>>installation; then I had to perform other tasks, and after that I
>>>decided
>>>to delete everything and restart all!!
>>>I have got a new problem. I've never used sql before: I can't change the
>>>login, password and group in the core.userinfo, core.groupinfo,
>>>core.projectinfo tables! For example when I try to select or update some
>>>rows of core.userinfo table the reply is always: 0 rows
>>>selected/updated.
>>>Can anyone help me?
>>>Thanks, Federica
|
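Mike's diagnosis (an UPDATE reports 0 rows because the row was never inserted) can be reproduced in a small sketch. This is not the GUS schema or Oracle: it assumes an in-memory SQLite database as a stand-in, and the `userinfo` table below has a hypothetical, much-reduced column set. The helper names `set_login` and `row_count` are also made up for illustration.

```python
import sqlite3

# Stand-in for core.userinfo: in-memory SQLite with a hypothetical,
# reduced set of columns (the real GUS table has more).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE userinfo (user_id INTEGER PRIMARY KEY, login TEXT, password TEXT)"
)


def set_login(conn, user_id, login, password):
    """Run the UPDATE from the thread; return how many rows it changed."""
    cur = conn.execute(
        "UPDATE userinfo SET login = ?, password = ? WHERE user_id = ?",
        (login, password, user_id),
    )
    return cur.rowcount


def row_count(conn):
    """Equivalent of: select count(*) from core.userinfo;"""
    return conn.execute("SELECT count(*) FROM userinfo").fetchone()[0]


# On an empty table the UPDATE succeeds without error but matches nothing.
before = set_login(conn, 1, "your_login", "your_password")

# After the preliminary insert, the same UPDATE finds its row.
conn.execute("INSERT INTO userinfo (user_id, login, password) VALUES (1, 'tmp', 'tmp')")
after = set_login(conn, 1, "your_login", "your_password")

print("rows updated before insert:", before)
print("rows in table:", row_count(conn))
print("rows updated after insert:", after)
```

The point of the sketch is only that "0 rows updated" is not an error: the statement ran, but its WHERE clause matched no existing row, so checking the row count first distinguishes an empty table from a wrong key.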