From: Lee K. <ls...@gm...> - 2013-10-01 14:12:03
This looks like something that I need to understand further. Unfortunately I
am on furlough and so I am wrapping up everything right now. Please keep this
project folder on-hand though because this is a very relevant analysis that
will help CG-Pipeline in the future.

On Sun, Sep 29, 2013 at 11:26 PM, kok <kok...@gm...> wrote:

> Dear Lee Katz,
>
> condition      evidence   CDS
> w/o prodigal   1          6048
> w prodigal     1          8263
> w prodigal     2          6192
> w prodigal     3          5683
> w prodigal     4          875
>
> Table above shows the statistics of the results that I have got for the
> test on several evidence value. It seems that evidence 2 is the one that I
> would like to proceed for as it returns with the most CDS which by manual
> check I can see some are collected from just prodigal and BLAST evidence
> that missed from previous CGP without prodigal included.
>
> However, I find out that the coordinates of the CDS from BLAST and
> prodigal recorded with infinity "complement(1617923..inf)". From the
> run_prediction scripts I see that the prodigal is not used for reconcile
> prediction which I am not sure whether it's the cause of the error, but
> including prodigal for the start prediction during prediction
> reconciliation will be good as I have seen cases that prodigal are doing
> better with the start prediction compared to the other predictors. Hope
> this is easy to be implemented and thanks for your time.
>
> Regards,
> Kok
>
>
> On 12/8/2013 7:25 PM, Lee Katz wrote:
>
> I know! Prodigal is just so easy to use, and so it was really easy to
> make a wrapper around it.
>
> 2/4 might be ok too, but I do not have enough time to perform any
> rigorous tests to see which way is better. If you have time, please let me
> and the community know which gives you better results. I think it would be
> informative to know what 1/4, 2/4, 3/4, and 4/4 gives you for each genome.
> There is an interesting table that the Georgia Tech compgenomics class
> created this year, at
> http://compgenomics2013.biology.gatech.edu/index.php/Gene_Prediction_Group#Gene_Prediction_Pipeline.
>
> The way to change the minimum number of predictors is to alter the
> variable $$settings{min_predictors_to_call_orf} in run_prediction.
>
> Around line 161 in run_prediction, where it says something like
> # Categorize and reconcile predictions
> Set it back to 2 so that you can have 2/4 predictors.
>
> $$settings{min_predictors_to_call_orf}=2;
>
>
> On Sun, Aug 11, 2013 at 11:57 PM, kok wei <kok...@gm...> wrote:
>
>> Wow, that's very great and it's faster than planned. I will certainly try
>> out the pipeline on my genome and update you with the results. I'm thinking
>> of probably having 2/4 evidence will be good enough as false positive is
>> preferred over false negative for the gene prediction, any opinion?
>>
>> Thanks for your efforts and helps.
>>
>>
>> On Sun, Aug 11, 2013 at 7:35 PM, Lee Katz <ls...@gm...> wrote:
>>
>>> Hi Kok, I added a script run_prediction_prodigal.pl into source
>>> control. It outputs a GFF file of CDS predictions. I also made sure it
>>> outputs the training file to the temporary directory because it seems like
>>> you are interested in the training files.
>>>
>>> I also modified run_prediction and the CGPipelineUtils module so that
>>> it predicts alongside the other predictors. Lastly, I added an option
>>> prediction_use_prodigal = 1 under the config file so that you can enable it
>>> for run_prediction. With Prodigal, each gene must have 2/3 or 3/4 majority
>>> to be called (depending on whether you use genemark too).
>>>
>>> I'm new to prodigal, so please let me know if it all looks correct.
>>> The command seems simple enough but I don't know if there are
>>> any idiosyncrasies to be aware of.
>>>
>>>
>>> On Fri, Aug 2, 2013 at 11:32 PM, Gmail <kok...@gm...> wrote:
>>>
>>>> Thanks Jay and Lee.
>>>> It will be great if the option is added. I like
>>>> prodigal for their better start prediction (from what i get for my test
>>>> genome) and less false prediction for bacterial genome as claimed. Looking
>>>> forward to the update, thanks!
>>>>
>>>> On 02/08/2013, at 23:51, Lee Katz <ls...@gm...> wrote:
>>>>
>>>> I'm returning to the US and back to work on Aug 12. It sounds like a
>>>> worthy addition.
>>>>
>>>> I like prodigal but never bothered to put it in as an option. I
>>>> think it could be something optional like genemark and would be preferred
>>>> if not using genemark. In this way, CGP would still be able to have a
>>>> majority for gene calling even if you don't have genemark.
>>>>
>>>> On Aug 2, 2013, at 15:43, Jay <jhu...@gm...> wrote:
>>>>
>>>> As far as I know, there is no convenient way of doing this. The
>>>> run_prediction script would have to be modified to support running it and
>>>> parsing the results.
>>>>
>>>> On 02/08/2013 22:52, kok wrote:
>>>>
>>>> Is it possible for cg pipeline to include the results of other *ab-initio*
>>>> predictor (eg. prodigal)? Is there any development for this function?
>>>> Or if I would like to use prodigal in place of genemark (if only two
>>>> predictors allowed), can I convert the results of prodigal into
>>>> genemark-like gm_out.lst file for cg pipeline's run_predict as a simple
>>>> modification?
>>>>
>>>> - kok -
>>>>
>>>> _______________________________________________
>>>> Cg-pipeline-users mailing list
>>>> Cg-...@li...
>>>> https://lists.sourceforge.net/lists/listinfo/cg-pipeline-users
>>>>
>>>
>>> --
>>> Lee Katz, Ph.D.
>>>
>>
>
> --
> Lee Katz, Ph.D.
>

--
Lee Katz, Ph.D.
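
For readers following the thread, the effect of the min_predictors_to_call_orf threshold discussed above (1/4, 2/4, 3/4, 4/4) can be illustrated with a small sketch. This is not CG-Pipeline's code: run_prediction is a Perl script and its reconciliation logic is not shown in this thread. The function name call_orfs, the input layout, the example coordinates, and the choice to treat predictions that share contig, strand, and stop coordinate as the same ORF are all assumptions made for illustration.

```python
from collections import defaultdict


def call_orfs(predictions, min_predictors=2):
    """Keep ORFs supported by at least `min_predictors` distinct predictors.

    `predictions` maps a predictor name to an iterable of
    (contig, strand, stop) tuples. Matching on the stop coordinate is an
    assumption of this sketch, not a statement about CG-Pipeline.
    """
    support = defaultdict(set)
    for predictor, orfs in predictions.items():
        for orf in orfs:
            # A set counts each predictor once per ORF, even if it
            # reports the same ORF twice.
            support[orf].add(predictor)
    return sorted(orf for orf, who in support.items()
                  if len(who) >= min_predictors)


# Hypothetical predictions; the coordinates are made up.
example = {
    "genemark": [("contig1", "+", 900), ("contig1", "-", 2100)],
    "prodigal": [("contig1", "+", 900), ("contig1", "+", 3300)],
    "blast": [("contig1", "+", 900), ("contig1", "-", 2100)],
}

for threshold in (1, 2, 3):
    called = call_orfs(example, min_predictors=threshold)
    print(threshold, len(called))
```

Raising the threshold can only keep the number of called ORFs the same or lower it, which matches the direction of the "w prodigal" rows in Kok's table (8263, 6192, 5683, 875 for evidence 1 to 4). The sketch says nothing about how start coordinates are reconciled or about the "inf" coordinate Kok reports.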