From: Lee K. <ls...@gm...> - 2013-08-12 11:25:59
I know! Prodigal is just so easy to use, and so it was really easy to make a wrapper around it.

2/4 might be ok too, but I do not have enough time to perform any rigorous tests to see which way is better. If you have time, please let me and the community know which gives you better results. I think it would be informative to know what 1/4, 2/4, 3/4, and 4/4 give you for each genome. There is an interesting table that the Georgia Tech compgenomics class created this year, at http://compgenomics2013.biology.gatech.edu/index.php/Gene_Prediction_Group#Gene_Prediction_Pipeline.

The way to change the minimum number of predictors is to alter the variable $$settings{min_predictors_to_call_orf} in run_prediction. Around line 161 in run_prediction, where it says something like

    # Categorize and reconcile predictions

set it back to 2 so that you can have 2/4 predictors:

    $$settings{min_predictors_to_call_orf}=2;

On Sun, Aug 11, 2013 at 11:57 PM, kok wei <kok...@gm...> wrote:

> Wow, that's very great and it's faster than planned. I will certainly try
> out the pipeline on my genome and update you with the results. I'm thinking
> of probably having 2/4 evidence will be good enough as false positive is
> preferred over false negative for the gene prediction, any opinion?
>
> Thanks for your efforts and helps.
>
> On Sun, Aug 11, 2013 at 7:35 PM, Lee Katz <ls...@gm...> wrote:
>
>> Hi Kok, I added a script run_prediction_prodigal.pl into source control.
>> It outputs a GFF file of CDS predictions. I also made sure it outputs the
>> training file to the temporary directory because it seems like you are
>> interested in the training files.
>>
>> I also modified run_prediction and the CGPipelineUtils module so that it
>> predicts alongside the other predictors. Lastly, I added an option
>> prediction_use_prodigal = 1 under the config file so that you can enable it
>> for run_prediction.
>> With Prodigal, each gene must have 2/3 or 3/4 majority
>> to be called (depending on whether you use genemark too).
>>
>> I'm new to prodigal, so please let me know if it all looks correct. The
>> command seems simple enough but I don't know if there are
>> any idiosyncrasies to be aware of.
>>
>> On Fri, Aug 2, 2013 at 11:32 PM, Gmail <kok...@gm...> wrote:
>>
>>> Thanks Jay and Lee. It will be great if the option is added. I like
>>> prodigal for their better start prediction (from what i get for my test
>>> genome) and less false prediction for bacterial genome as claimed. Looking
>>> forward to the update, thanks!
>>>
>>> On 02/08/2013, at 23:51, Lee Katz <ls...@gm...> wrote:
>>>
>>> I'm returning to the US and back to work on Aug 12. It sounds like a
>>> worthy addition.
>>>
>>> I like prodigal but never bothered to put it in as an option. I think
>>> it could be something optional like genemark and would be preferred if not
>>> using genemark. In this way, CGP would still be able to have a majority for
>>> gene calling even if you don't have genemark.
>>>
>>> On Aug 2, 2013, at 15:43, Jay <jhu...@gm...> wrote:
>>>
>>> As far as I know, there is no convenient way of doing this. The
>>> run_prediction script would have to be modified to support running it and
>>> parsing the results.
>>>
>>> On 02/08/2013 22:52, kok wrote:
>>>
>>> Is it possible for cg pipeline to include the results of other *ab-initio*
>>> predictor (eg. prodigal)? Is there any development for this function?
>>> Or if I would like to use prodigal in place of genemark (if only two
>>> predictors allowed), can I convert the results of prodigal into
>>> genemark-like gm_out.lst file for cg pipeline's run_predict as a simple
>>> modification?
>>>
>>> - kok -
>>>
>>> _______________________________________________
>>> Cg-pipeline-users mailing list
>>> Cg-...@li...
>>> https://lists.sourceforge.net/lists/listinfo/cg-pipeline-users
>>
>> --
>> Lee Katz, Ph.D.

--
Lee Katz, Ph.D.
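
For anyone who wants to try the 1/4, 2/4, 3/4, and 4/4 comparison suggested at the top of this thread, here is a minimal sketch of the voting idea behind min_predictors_to_call_orf. It is written in Python for illustration and is not the Perl code in run_prediction. The function name call_orfs, the predictor names, and the ORF keys are all made up for the example. The real script probably reconciles predictions in a more involved way, for instance when two predictors disagree on a start site.

```python
from collections import Counter


def call_orfs(predictions, min_predictors):
    """Return the ORFs reported by at least min_predictors predictors.

    predictions maps a (hypothetical) predictor name to the set of ORF
    keys it reported. This is only an illustration of a vote threshold,
    not the reconciliation logic used by run_prediction.
    """
    votes = Counter()
    for orfs in predictions.values():
        votes.update(set(orfs))  # at most one vote per predictor per ORF
    return {orf for orf, count in votes.items() if count >= min_predictors}


# Made-up ORF keys of the form (contig, strand, stop coordinate).
predictions = {
    "predictor_a": {("c1", "+", 900), ("c1", "-", 2400), ("c1", "+", 5100)},
    "predictor_b": {("c1", "+", 900), ("c1", "-", 2400)},
    "predictor_c": {("c1", "+", 900), ("c1", "+", 5100), ("c1", "+", 7300)},
    "predictor_d": {("c1", "+", 900), ("c1", "-", 2400)},
}

# Compare how many ORFs survive at each threshold.
for k in range(1, 5):
    print(f"{k}/4: {len(call_orfs(predictions, k))} ORFs called")
```

Lowering the threshold can only add calls, never remove them, so the 2/4 set always contains the 3/4 set. That makes it easy to look at just the extra ORFs gained at 2/4 and judge whether they look like real genes or false positives.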