From: Y. T. G. <yon...@pc...> - 2005-02-11 15:57:57
Hi Alberto:

When doing similarity analyses (BLASTs, BLAT) with GUS, here is the
process we usually follow:

1: load query sequence set into GUS if not already in GUS (will get a
   GUS id for each sequence); same for target sequence set
2: dump from gus to get query fasta file and target fasta file using
   the script $GUS_HOME/bin/dumpSequencesFromTable.pl
3: run blast on computer cluster using the DistribJob software
4: run load plugin to load result into similarity table
   (DoTS.BlastSimilarity for BLAST, DoTS.BLATAlignment for BLAT
   alignments)

In your case, I think you will need to load NR, NT as separate datasets
into GUS (EST should already be in GUS)

-Thomas

On Fri, 11 Feb 2005, Alberto Davila wrote:

> All the blastable databases I mentioned are standard databases from NCBI
> (ftp://ftp.ncbi.nlm.nih.gov/blast/db/blastdb.txt):
>
> NT = nucleotides
>
> ~30000 entries from genbank (genbank format) are loaded into GUS now.
>
> Not sure about your "NRDB", I know NR from NCBI that is a collection of
> aminoacid entries, could it be the same ?
>
> Alberto
>
> On Fri, 2005-02-11 at 10:43 -0500, Steve Fischer wrote:
>> (what is NT?)
>>
>> which of these (genbank, your fasta, NRDB, NT, EST) have you loaded into
>> gus?
>>
>> steve
>>
>> Alberto Davila wrote:
>>
>>> Query:
>>>
>>> Either sequences from genbank (genbank format) or sequences generated in
>>> the lab (fasta format)
>>>
>>> Blastable databases (all are formatted databases from NCBI):
>>>
>>> NR
>>> NT
>>> EST
>>>
>>> Alberto
>>>
>>> On Fri, 2005-02-11 at 10:34 -0500, Steve Fischer wrote:
>>>
>>>> for the blast, what are the query sequences and what are the blastable
>>>> databases?
>>>>
>>>> steve
>>>>
>>>> Alberto Davila wrote:
>>>>
>>>>> Basically we will use sequences (loaded into GUS with the GBParser) for
>>>>> NCBI Blast (Blastx, Blastp and TBlastX), the same sequences will be also
>>>>> used for Interpro analyses. Results of both (Blast and Interpro) will be
>>>>> loaded into GUS.
>>>>> We will parse specific things from the Blast results, I
>>>>> would say:
>>>>>
>>>>> `Gi`
>>>>> `Accession`
>>>>> `Description`
>>>>> `E_value`
>>>>> `Score`
>>>>> `Length`
>>>>> `Frame_Query`
>>>>> `Frame_Hit`
>>>>> `Identical`
>>>>> `Hsp_Frac_Identical`
>>>>> `Conserved`
>>>>> `Hsp_Frac_Conserved`
>>>>> `Query_Start`
>>>>> `Query_End`
>>>>> `Hit_Start`
>>>>> `Hit_End`
>>>>> `Hsp_Align`
>>>>> `database_letters`
>>>>> `database_entries`
>>>>>
>>>>> We already have a Bioperl parser for that (specific for another system:
>>>>> GARSA) that could be adapted to GUS, problem being we are not sure what
>>>>> tables should be used to store those data in GUS.
>>>>>
>>>>> Cheers, Alberto
>>>>>
>>>>> On Fri, 2005-02-11 at 10:06 -0500, Steve Fischer wrote:
>>>>>
>>>>>> what are you planning on blasting?
>>>>>>
>>>>>> steve
>>>>>>
>>>>>> Alberto Davila wrote:
>>>>>>
>>>>>>> Hi Steve,
>>>>>>>
>>>>>>> On Fri, 2005-02-11 at 08:56 -0500, Steve Fischer wrote:
>>>>>>>
>>>>>>>> poliana-
>>>>>>>>
>>>>>>>> oops, the usage statement for LoadBlastSimFast is out of date. it
>>>>>>>> should instruct you to use the blastSimilarity command.
>>>>>>>>
>>>>>>>> LoadBlastSimFast makes a big assumption, that the subject and query
>>>>>>>> sequences are in GUS, and their def. lines have GUS primary keys.
>>>>>>>>
>>>>>>>> Are your sequences already loaded into GUS?
>>>>>>>>
>>>>>>> They are not, there would be any howto/tips for that plugin ? We will
>>>>>>> certainly need a plugin to load "Interpro" and "ORF finding" results
>>>>>>> into GUS... If they are not available, then maybe we will have to write
>>>>>>> them ...
>>>>>>>
>>>>>>> Cheers, Alberto
>>>>>>>
>>>>>>>> steve
>>>>>>>>
>>>>>>>> Poliana Mateus wrote:
>>>>>>>>
>>>>>>>>> Hello all,
>>>>>>>>>
>>>>>>>>> Where can find the script parseBlastFilesForSimilarity.pl??
>>>>>>>>> I'm trying to run LoadBlastSimFast...
>>>>>>>>>
>>>>>>>>> Poliana
>
> _______________________________________________
> Gusdev-gusdev mailing list
> Gus...@li...
> https://lists.sourceforge.net/lists/listinfo/gusdev-gusdev
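
For illustration only, here is a minimal Python sketch of the def-line assumption discussed in this thread: the FASTA files that are blasted should carry GUS ids on their def lines so the load plugin can map hits back to rows in GUS. This is not GUS code. In practice the dump is done with dumpSequencesFromTable.pl. The function names `write_fasta` and `bad_deflines` are hypothetical, and the sketch assumes that a GUS primary key is a plain integer appearing as the first token of the def line.

```python
import io
from typing import Iterable, Iterator, TextIO, Tuple


def write_fasta(records: Iterable[Tuple[int, str]], out: TextIO, width: int = 60) -> None:
    """Write (gus_id, sequence) pairs as FASTA, using the id as the whole def line.

    Assumption: gus_id is the integer primary key assigned when the
    sequence was loaded into GUS.
    """
    for gus_id, seq in records:
        out.write(f">{gus_id}\n")
        # Wrap the sequence at a fixed width, as FASTA files usually are.
        for i in range(0, len(seq), width):
            out.write(seq[i:i + width] + "\n")


def bad_deflines(fasta: TextIO) -> Iterator[str]:
    """Yield def lines whose first token is not a plain integer id."""
    for line in fasta:
        if line.startswith(">"):
            tokens = line[1:].split()
            first = tokens[0] if tokens else ""
            if not (first.isascii() and first.isdigit()):
                yield line.rstrip("\n")
```

Running `bad_deflines` over the query and target files before submitting the BLAST job is a cheap way to catch sequences that came straight from an NCBI download (def lines starting with `gi|...`) rather than from a GUS dump.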