From: Sucheta T. <su...@vb...> - 2005-02-25 20:32:10
Hi Poliana,

The regex should capture the subject id as it appears in the HSP section of the BLAST output. Please check the script I sent you; there are some details about this in its comment section.

In your case, the regex will be:

    \S+\|(.*?)\|

A word of caution, though: there may still be some issues. Even though your BLAST results parse on our machine, they may not parse on yours. You may need to consider the following points:

1. Pass one output at a time. As I mentioned earlier, you may need to split the BLAST output into individual files and then feed them one by one to parseblastsimilarity.pl from a shell script or Perl script.
2. I have modified BlastAnal.pm again to fix some small parsing and whitespace issues, so you may need to check that.

Let me know whether it works after you change the regex. If it still fails, you will need to tweak BlastAnal.pm a little bit.

Even after the parse succeeds, you may not be able to upload this data into the dots.similarity and dots.similarityspan tables, because query_id and subject_id must be numbers. These numbers are referenced against the subject_table and query_table. So in this case you either change the ids to numbers or change the plugin.

HTH

Sucheta

Since I had this problem earlier and spent a lot of time solving it, I am cc'ing this to the group; it may help others.

At 04:41 PM 2/25/2005 -0300, you wrote:
>see...
>
>[bioinfo@kineto2 bioinfo]$ perl parseblastsimilarity.pl
>--regex='(\d+)\s+\d+\s+\d+' --inputFile='new.out'
>--outputFile='parse.out' --adjustMatch
>
>Inside analyzeblast
>DEBUGGING ON
>algorithm = BLASTX
>ParseType='1'
>algorithm = BLASTX
>ParseType='1'
>algorithm = BLASTX
>ParseType='1'
>Matching query: 390 416
>Can't call method "getID" on an undefined value at
>/opt/GUS/gus_home/lib/perl/CBIL/Bio/Blast/BlastAnal.pm line 136,
><INPUT> line 5761.
>
>----------
>
>[bioinfo@kineto2 bioinfo]$ more parse.out
>Cutoff parameters:
> P value: 1e-05
> Length: 10
> Percent Identity: 20
>
>Cutoff parameters:
> P value: 1e-05
> Length: 10
> Percent Identity: 20
>
>Cutoff parameters:
> P value: 1e-05
> Length: 10
> Percent Identity: 20
>
>Cutoff parameters:
> P value: 1e-05
> Length: 10
> Percent Identity: 20
>
>Cutoff parameters:
> P value: 1e-05
> Length: 10
> Percent Identity: 20
>
>Cutoff parameters:
> P value: 1e-05
> Length: 10
> Percent Identity: 20
>--More--(74%)
>
>-----
>
>Poliana
>
>On Fri, 25 Feb 2005 14:36:17 -0500, Sucheta Tripathy <su...@vb...>
>wrote:
> > Ok now if possible, send me the output, I will check and see if it is the
> > space problem.
> >
> > Sucheta
> >
> > At 04:22 PM 2/25/2005 -0300, you wrote:
> > >Hi Sucheta,
> > >
> > >I twirled other blast... now I have one output with "querie name". But
> > >I continue with same the problems...if you it will have another
> > >idea...
> > >
> > >Thanks
> > >
> > >
> > >On Fri, 25 Feb 2005 11:38:46 -0500, Sucheta Tripathy <su...@vb...>
> > >wrote:
> > > > I think its important that your blast output should have the query name. I
> > > > did not check your fasta file, but make sure that your blast output has
> > > > query name, else it will not be able to parse the output.
> > > >
> > > > Sucheta
> > > >
> > > > At 01:35 PM 2/25/2005 -0300, you wrote:
> > > > >ok...in my "RESULTADO_FASTA" does not have "query name"....
> > > > >But and my original file (result blast)
> > > > >"leimajor-nr-poliana-blastx-01.out" also it does not have?
> > > > >
> > > > >Poliana
> > > > >
> > > > >
> > > > >On Fri, 25 Feb 2005 11:26:17 -0500, Sucheta Tripathy <su...@vb...>
> > > > >wrote:
> > > > > > Hi Poliana,
> > > > > >
> > > > > > By having a quick look at your output, it look sto me that the problem lies
> > > > > > with your not having a "Query name". How do you want to store it if it does
> > > > > > not have a query name?
> > > > > >
> > > > > > Sucheta
> > > > > >
> > > > > > At 01:16 PM 2/25/2005 -0300, you wrote:
> > > > > > >sorry... but I only removed (script) the sequencias that interest me
> > > > > > >of the result of blast....and I want to insert in the GUS. The
> > > > > > >original archive is this in annex.
> > > > > > >
> > > > > > >Poliana
> > > > > > >
> > > > > > >
> > > > > > >On Fri, 25 Feb 2005 11:06:46 -0500, Sucheta Tripathy <su...@vb...>
> > > > > > >wrote:
> > > > > > > > Hi Poliana,
> > > > > > > >
> > > > > > > > I wanted the blast output, not the sequence itself :) Can you please snd a
> > > > > > > > sample of that.
> > > > > > > >
> > > > > > > > Sucheta
> > > > > > > >
> > > > > > > > At 12:48 PM 2/25/2005 -0300, you wrote:
> > > > > > > > >yes...of course!!!
> > > > > > > > >it's in annex...
> > > > > > > > >
> > > > > > > > >Poliana
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >On Fri, 25 Feb 2005 09:39:26 -0500 (EST), Sucheta Tripathy
> > > > > > > > ><su...@vb...> wrote:
> > > > > > > > > > Can you pl. send me a part of your blast input file?
> > > > > > > > > >
> > > > > > > > > > It will make easier for me to check if the problem is with BlastAnal.pm or
> > > > > > > > > > not.
> > > > > > > > > >
> > > > > > > > > > Sucheta
> > > > > > > > > >
> > > > > > > > > > > Ok...Sucheta!
> > > > > > > > > > >
> > > > > > > > > > > My "--INPUTFILE" (blast result) does not have the GUSIDs, because I
> > > > > > > > > > > twirled blast with the NR of the web.
> > > > > > > > > > > It has some problem?
> > > > > > > > > > >
> > > > > > > > > > > Other problem is:
> > > > > > > > > > >
> > > > > > > > > > > ------------------------------------------------------------
> > > > > > > > > > > [bioinfo@kineto2 bioinfo]$ perl parseblastsimilarity.pl
> > > > > > > > > > > --regex='(\d+)\s+\d+\s+\d+'
> > > > > > > > > > > --inputFile='leimajor-nr-poliana-blastx-01.out'
> > > > > > > > > > > --outputFile='parse.out' --adjustMatch
> > > > > > > > > > > Inside analyzeblast
> > > > > > > > > > > DEBUGGING ON
> > > > > > > > > > > algorithm = BLASTX
> > > > > > > > > > > ParseType='1'
> > > > > > > > > > > algorithm = BLASTX
> > > > > > > > > > > ParseType='1'
> > > > > > > > > > > algorithm = BLASTX
> > > > > > > > > > > ParseType='1'
> > > > > > > > > > > Matching query: 821 847
> > > > > > > > > > > Can't call method "getID" on an undefined value at
> > > > > > > > > > > /opt/GUS/gus_home/lib/perl/CBIL/Bio/Blast/BlastAnal.pm line 136,
> > > > > > > > > > > <INPUT> line 5599.
> > > > > > > > > > >
> > > > > > > > > > > ------------------------------------------------------------
> > > > > > > > > > > I read that the function "getID" catches the primary key in the
> > > > > > > > > > > Oracle, but I am using PostgreSQL.
> > > > > > > > > > > ??
> > > > > > > > > > >
> > > > > > > > > > > Poliana
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Fri, 25 Feb 2005 09:04:12 -0500 (EST), Sucheta Tripathy
> > > > > > > > > > > <su...@vb...> wrote:
> > > > > > > > > > >> Hi Poliana,
> > > > > > > > > > >>
> > > > > > > > > > >> This could be happening because of 2 reasons. One may be the regex that
> > > > > > > > > > >> you are providing is not right. The other could be in BlastAnal.pm. Just
> > > > > > > > > > >> go there and see(at some point I have changed my version, it is to do with
> > > > > > > > > > >> a space i.e; in original version it is \s, but I changed it to \s+. If you
> > > > > > > > > > >> did not find it, I will tell you after sometime, once I am back at
> > > > > > > > > > >> office.
> > > > > > > > > > >>
> > > > > > > > > > >> Ok
> > > > > > > > > > >>
> > > > > > > > > > >> Sucheta
> > > > > > > > > > >>
> > > > > > > > > > >> > Hi Sucheta,
> > > > > > > > > > >> >
> > > > > > > > > > >> > I am trying to twirl script "parseblastsimilarity.pl" again....but I
> > > > > > > > > > >> > am catching the following result:
> > > > > > > > > > >> >
> > > > > > > > > > >> > [bioinfo@kineto2 bioinfo]$ perl parseblastsimilarity.pl
> > > > > > > > > > >> > --regex='(\d+)\s+\d+\s+\d+' --inputFile='RESULTADO_FASTA'
> > > > > > > > > > >> > --outputFile='parse.out' --adjustMatch
> > > > > > > > > > >> >
> > > > > > > > > > >> > Inside analyzeblast
> > > > > > > > > > >> > DEBUGGING ON
> > > > > > > > > > >> >
> > > > > > > > > > >> > ------------------------------------------------------------
> > > > > > > > > > >> >
> > > > > > > > > > >> > [bioinfo@kineto2 bioinfo]$ more parse.out
> > > > > > > > > > >> > Cutoff parameters:
> > > > > > > > > > >> > P value: 1e-05
> > > > > > > > > > >> > Length: 10
> > > > > > > > > > >> > Percent Identity: 20
> > > > > > > > > > >> >
> > > > > > > > > > >> >> (0 subjects)
> > > > > > > > > > >> >
> > > > > > > > > > >> > ------------------------------------------------------------
> > > > > > > > > > >> >
> > > > > > > > > > >> > I believe that this is not the certain waited...
> > > > > > > > > > >> > ???
> > > > > > > > > > >> >
> > > > > > > > > > >> > []'s
> > > > > > > > > > >> > Poliana
> > > > > > > > > > >> >
> > > > > > > > > > >>
> > > > > > > > > > >> --
> > > > > > > > > > >> Sucheta Tripathy
> > > > > > > > > > >> Virginia Bioinformatics Institute Phase-I
> > > > > > > > > > >> Washington street.
> > > > > > > > > > >> Virginia Tech.
> > > > > > > > > > >> Blacksburg,VA 24061-0447
> > > > > > > > > > >> phone:(540)231-8138
> > > > > > > > > > >> Fax: (540) 231-2606
> > > > > > > > > > >>
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Sucheta Tripathy
> > > > > > > > > > Virginia Bioinformatics Institute Phase-I
> > > > > > > > > > Washington street.
> > > > > > > > > > Virginia Tech.
> > > > > > > > > > Blacksburg,VA 24061-0447
> > > > > > > > > > phone:(540)231-8138
> > > > > > > > > > Fax: (540) 231-2606
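
P.S. For anyone on the list who wants to sanity-check the subject-id regex and the "one output at a time" advice before touching BlastAnal.pm, here is a rough sketch in Python. It is not part of GUS or of parseblastsimilarity.pl. The function names (extract_subject_id, split_reports, build_command) and the sample definition lines are made up for illustration, and the splitter assumes that every report in a concatenated BLAST text file begins with a line starting with "BLAST", which you should verify against your own output. The command-line flags are the ones used in the thread above.

```python
import re

# Regex from this message: a whitespace-free prefix, a "|", then a lazily
# captured field up to the next "|".
SUBJECT_ID_RE = re.compile(r"\S+\|(.*?)\|")


def extract_subject_id(line):
    """Return the field captured by the regex, or None when nothing matches."""
    match = SUBJECT_ID_RE.search(line)
    return match.group(1) if match else None


def split_reports(text):
    """Split concatenated BLAST text output into one string per report.

    Assumption: each report starts with a header line beginning with "BLAST"
    (for example "BLASTX 2.2.10 ..."). Check this against your own files.
    """
    reports, current = [], []
    for line in text.splitlines(keepends=True):
        if line.startswith("BLAST") and current:
            reports.append("".join(current))
            current = []
        current.append(line)
    if current:
        reports.append("".join(current))
    return reports


def build_command(input_path, output_path, regex=SUBJECT_ID_RE.pattern):
    """Build the argument list for one run, using the flags shown in the thread."""
    return [
        "perl", "parseblastsimilarity.pl",
        "--regex=" + regex,
        "--inputFile=" + input_path,
        "--outputFile=" + output_path,
        "--adjustMatch",
    ]


if __name__ == "__main__":
    # Hypothetical definition lines, only to show what the regex captures.
    print(extract_subject_id(">gi|12345|ref|NP_000001.1| hypothetical protein"))
    print(extract_subject_id(">gi|12345| hypothetical protein"))
    print(build_command("report_001.out", "report_001.parsed"))
```

One thing the sketch makes visible: because \S+ is greedy, on a line with several pipe-delimited fields the capture is the last field of the first whitespace-free token, not the first one. If that is not the id you want to store, the regex needs adjusting. Each chunk returned by split_reports can be written to its own file and passed to the script with the argument list from build_command, for example through subprocess.run.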