From: kok <kok...@gm...> - 2013-09-30 03:26:49
Dear Lee Katz,

condition      evidence   CDS
w/o prodigal   1          6048
w prodigal     1          8263
w prodigal     2          6192
w prodigal     3          5683
w prodigal     4           875

The table above shows the statistics I got when testing several evidence
values. Evidence 2 seems to be the setting I would like to proceed with, as
it returns more CDS than the previous CGP run without Prodigal; by manual
check, some of them are supported only by Prodigal and BLAST evidence and
were missed when Prodigal was not included.

However, I found that the coordinates of the CDS supported by BLAST and
Prodigal are recorded with infinity, e.g. "complement(1617923..inf)". From
the run_prediction script I see that Prodigal is not used when reconciling
predictions. I am not sure whether that is the cause of the error, but
including Prodigal for start prediction during reconciliation would be good,
as I have seen cases where Prodigal predicts the start better than the other
predictors.

I hope this is easy to implement, and thanks for your time.

Regards,
Kok

On 12/8/2013 7:25 PM, Lee Katz wrote:
> I know! Prodigal is just so easy to use, and so it was really easy to
> make a wrapper around it.
>
> 2/4 might be ok too, but I do not have enough time to perform any
> rigorous tests to see which way is better. If you have time, please
> let me and the community know which gives you better results. I think
> it would be informative to know what 1/4, 2/4, 3/4, and 4/4 gives you
> for each genome. There is an interesting table that the Georgia Tech
> compgenomics class created this year, at
> http://compgenomics2013.biology.gatech.edu/index.php/Gene_Prediction_Group#Gene_Prediction_Pipeline.
>
> The way to change the minimum number of predictors is to alter the
> variable $$settings{min_predictors_to_call_orf} in run_prediction.
>
> Around line 161 in run_prediction, where it says something like
>     # Categorize and reconcile predictions
> Set it back to 2 so that you can have 2/4 predictors.
>
>     $$settings{min_predictors_to_call_orf}=2;
>
>
> On Sun, Aug 11, 2013 at 11:57 PM, kok wei <kok...@gm...> wrote:
>
>     Wow, that's very great and it's faster than planned. I will
>     certainly try out the pipeline on my genome and update you with
>     the results. I'm thinking of probably having 2/4 evidence will be
>     good enough as false positive is preferred over false negative for
>     the gene prediction, any opinion?
>
>     Thanks for your efforts and helps.
>
>
>     On Sun, Aug 11, 2013 at 7:35 PM, Lee Katz <ls...@gm...> wrote:
>
>         Hi Kok, I added a script run_prediction_prodigal.pl into
>         source control. It outputs a GFF file of CDS predictions. I
>         also made sure it outputs the training file to the temporary
>         directory because it seems like you are interested in the
>         training files.
>
>         I also modified run_prediction and the CGPipelineUtils module
>         so that it predicts alongside the other predictors. Lastly, I
>         added an option prediction_use_prodigal = 1 under the config
>         file so that you can enable it for run_prediction. With
>         Prodigal, each gene must have 2/3 or 3/4 majority to be called
>         (depending on whether you use genemark too).
>
>         I'm new to prodigal, so please let me know if it all looks
>         correct. The command seems simple enough but I don't know if
>         there are any idiosyncrasies to be aware of.
>
>
>         On Fri, Aug 2, 2013 at 11:32 PM, Gmail <kok...@gm...> wrote:
>
>             Thanks Jay and Lee. It will be great if the option is
>             added. I like prodigal for their better start prediction
>             (from what i get for my test genome) and less false
>             prediction for bacterial genome as claimed. Looking
>             forward to the update, thanks!
>
>             On 02/08/2013, at 23:51, Lee Katz <ls...@gm...> wrote:
>
>> I'm returning to the US and back to work on Aug 12. It
>> sounds like a worthy addition.
>>
>> I like prodigal but never bothered to put it in as an
>> option. I think it could be something optional like
>> genemark and would be preferred if not using genemark. In
>> this way, CGP would still be able to have a majority for
>> gene calling even if you don't have genemark.
>>
>> On Aug 2, 2013, at 15:43, Jay <jhu...@gm...> wrote:
>>
>>> As far as I know, there is no convenient way of doing
>>> this. The run_prediction script would have to be
>>> modified to support running it and parsing the results.
>>>
>>> On 02/08/2013 22:52, kok wrote:
>>>> Is it possible for cg pipeline to include the results
>>>> of other /ab-initio/ predictor (eg. prodigal)? Is there
>>>> any development for this function?
>>>> Or if I would like to use prodigal in place of genemark
>>>> (if only two predictors allowed), can I convert the
>>>> results of prodigal into genemark-like gm_out.lst file
>>>> for cg pipeline's run_predict as a simple modification?
>>>>
>>>> - kok -
>>>>
>>>>
>>>> ------------------------------------------------------------------------------
>>>> Get your SQL database under version control now!
>>>> Version control is standard for application code, but databases havent
>>>> caught up. So what steps can you take to put your SQL databases under
>>>> version control? Why should you start doing it? Read more to find out.
>>>> http://pubads.g.doubleclick.net/gampad/clk?id=49501711&iu=/4140/ostg.clktrk
>>>>
>>>>
>>>> _______________________________________________
>>>> Cg-pipeline-users mailing list
>>>> Cg-...@li...
>>>> https://lists.sourceforge.net/lists/listinfo/cg-pipeline-users
>
>
>     --
>     Lee Katz, Ph.D.
>
>
> --
> Lee Katz, Ph.D.
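
The voting rule discussed in this thread (a CDS is called only when at least
min_predictors_to_call_orf predictors support it) can be sketched roughly as
below. This is only an illustrative Python sketch, not the Perl code in
run_prediction: the function name call_orfs, the data layout, the label
"fourth_predictor", and all coordinates are hypothetical. Keying each CDS on
its stop coordinate and strand is likewise an assumption made here so that
predictors disagreeing only on the start still count as the same gene.

```python
from collections import defaultdict


def call_orfs(predictions, min_predictors=2):
    """Return the CDS keys supported by at least `min_predictors` predictors.

    `predictions` maps a predictor name to a list of (stop, strand) keys.
    The result maps each retained key to the sorted list of supporting
    predictor names.
    """
    support = defaultdict(set)
    for predictor, cds_keys in predictions.items():
        for key in cds_keys:
            support[key].add(predictor)
    return {
        key: sorted(names)
        for key, names in support.items()
        if len(names) >= min_predictors
    }


# Made-up example data; none of these values come from a real genome.
predictions = {
    "prodigal": [(1200, "+"), (5400, "-"), (7000, "+"), (9100, "+")],
    "genemark": [(1200, "+"), (9100, "+")],
    "blast": [(5400, "-"), (9100, "+")],
    "fourth_predictor": [(1200, "+"), (9100, "+")],  # hypothetical name
}

# Tabulate how many CDS survive at each threshold, similar in spirit to
# the evidence table at the top of the thread.
for threshold in range(1, 5):
    called = call_orfs(predictions, min_predictors=threshold)
    print(f"evidence {threshold}: {len(called)} CDS")
```

Raising the threshold can only shrink the set of called CDS, which is why a
value of 1 returns the most calls and a value of 4 the fewest; the choice
between them is the false-positive versus false-negative trade-off raised
earlier in the thread.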