From: kok <kok...@gm...> - 2013-10-02 03:25:56
Sure, I hope the government issue is over soon and everything turns out fine. Thanks.

On 1/10/2013 10:11 PM, Lee Katz wrote:
> This looks like something that I need to understand further.
> Unfortunately I am on furlough and so I am wrapping up everything
> right now. Please keep this project folder on-hand though because
> this is a very relevant analysis that will help CG-Pipeline in the future.
>
>
> On Sun, Sep 29, 2013 at 11:26 PM, kok <kok...@gm...> wrote:
>
> Dear Lee Katz,
>
> condition      evidence   CDS
> w/o prodigal   1          6048
> w prodigal     1          8263
> w prodigal     2          6192
> w prodigal     3          5683
> w prodigal     4           875
>
>
> The table above shows the results of my tests with several
> evidence values. It seems that evidence 2 is the one I would like
> to proceed with, as it returns the most CDS, some of which, by
> manual check, I can see are collected from just prodigal and BLAST
> evidence and were missed by the previous CGP run without prodigal.
>
> However, I found that the coordinates of the CDS from BLAST and
> prodigal are recorded with infinity, "complement(1617923..inf)".
> From the run_prediction script I see that prodigal is not used
> when reconciling predictions. I am not sure whether that is the
> cause of the error, but including prodigal for start prediction
> during reconciliation would be good, as I have seen cases where
> prodigal does better at start prediction than the other
> predictors. I hope this is easy to implement, and thanks for your
> time.
>
> Regards,
> Kok
>
>
>
> On 12/8/2013 7:25 PM, Lee Katz wrote:
>> I know! Prodigal is just so easy to use, and so it was really
>> easy to make a wrapper around it.
>>
>> 2/4 might be ok too, but I do not have enough time to perform any
>> rigorous tests to see which way is better. If you have time,
>> please let me and the community know which gives you better
>> results.
>> I think it would be informative to know what 1/4, 2/4,
>> 3/4, and 4/4 give you for each genome. There is an interesting
>> table that the Georgia Tech compgenomics class created this year,
>> at
>> http://compgenomics2013.biology.gatech.edu/index.php/Gene_Prediction_Group#Gene_Prediction_Pipeline.
>>
>>
>> The way to change the minimum number of predictors is to alter
>> the variable $$settings{min_predictors_to_call_orf} in
>> run_prediction.
>>
>> Around line 161 in run_prediction, where it says something like
>>     # Categorize and reconcile predictions
>> set it back to 2 so that you can have 2/4 predictors.
>>
>>     $$settings{min_predictors_to_call_orf}=2;
>>
>>
>> On Sun, Aug 11, 2013 at 11:57 PM, kok wei <kok...@gm...> wrote:
>>
>> Wow, that's great, and it's faster than planned. I will
>> certainly try out the pipeline on my genome and update you
>> with the results. I'm thinking that 2/4 evidence will probably
>> be good enough, as a false positive is preferred over a false
>> negative for gene prediction. Any opinion?
>>
>> Thanks for your efforts and help.
>>
>>
>>
>> On Sun, Aug 11, 2013 at 7:35 PM, Lee Katz <ls...@gm...> wrote:
>>
>> Hi Kok, I added a script run_prediction_prodigal.pl
>> into source control.
>> It outputs a GFF file of CDS predictions. I also made
>> sure it outputs the training file to the temporary
>> directory because it seems like you are interested in the
>> training files.
>>
>> I also modified run_prediction and the CGPipelineUtils
>> module so that it predicts alongside the other
>> predictors. Lastly, I added an option
>> prediction_use_prodigal = 1 under the config file so that
>> you can enable it for run_prediction. With Prodigal,
>> each gene must have 2/3 or 3/4 majority to be called
>> (depending on whether you use genemark too).
>>
>> I'm new to prodigal, so please let me know if it all
>> looks correct.
>> The command seems simple enough but I
>> don't know if there are any idiosyncrasies to be aware of.
>>
>>
>> On Fri, Aug 2, 2013 at 11:32 PM, Gmail <kok...@gm...> wrote:
>>
>> Thanks Jay and Lee. It will be great if the option is
>> added. I like prodigal for its better start
>> prediction (from what I get for my test genome) and
>> fewer false predictions for bacterial genomes, as
>> claimed. Looking forward to the update, thanks!
>>
>> On 02/08/2013, at 23:51, Lee Katz <ls...@gm...> wrote:
>>
>>> I'm returning to the US and back to work on Aug 12.
>>> It sounds like a worthy addition.
>>>
>>> I like prodigal but never bothered to put it in as
>>> an option. I think it could be something optional
>>> like genemark and would be preferred if not using
>>> genemark. In this way, CGP would still be able to
>>> have a majority for gene calling even if you don't
>>> have genemark.
>>>
>>> On Aug 2, 2013, at 15:43, Jay <jhu...@gm...> wrote:
>>>
>>>> As far as I know, there is no convenient way of
>>>> doing this. The run_prediction script would have to
>>>> be modified to support running it and parsing the
>>>> results.
>>>>
>>>> On 02/08/2013 22:52, kok wrote:
>>>>> Is it possible for cg pipeline to include the
>>>>> results of other /ab-initio/ predictors (e.g.
>>>>> prodigal)? Is there any development for this
>>>>> function?
>>>>> Or if I would like to use prodigal in place of
>>>>> genemark (if only two predictors are allowed), can I
>>>>> convert the results of prodigal into a genemark-like
>>>>> gm_out.lst file for cg pipeline's run_predict as a
>>>>> simple modification?
>>>>>
>>>>> - kok -
>>>>>
>>>>>
>>>>> ------------------------------------------------------------------------------
>>>>> Get your SQL database under version control now!
>>>>> Version control is standard for application code, but databases havent
>>>>> caught up.
>>>>> So what steps can you take to put your SQL databases under
>>>>> version control? Why should you start doing it? Read more to find out.
>>>>> http://pubads.g.doubleclick.net/gampad/clk?id=49501711&iu=/4140/ostg.clktrk
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Cg-pipeline-users mailing list
>>>>> Cg-...@li...
>>>>> https://lists.sourceforge.net/lists/listinfo/cg-pipeline-users
>>
>>
>> --
>> Lee Katz, Ph.D.
>>
>>
>> --
>> Lee Katz, Ph.D.
>
>
> --
> Lee Katz, Ph.D.
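
The thread discusses how the min_predictors_to_call_orf setting changes the number of CDS that get called. The following is a minimal sketch of that kind of threshold vote, written in Python rather than the pipeline's Perl, and it is not taken from run_prediction. The function name call_orfs, the choice of keying each ORF by (strand, stop coordinate), the predictor label "other", and all of the toy data are assumptions made for illustration only.

```python
from collections import defaultdict


def call_orfs(predictions, min_predictors_to_call_orf=2):
    """Return the ORFs supported by at least the given number of predictors.

    predictions maps a predictor name to a set of ORF keys. Here an ORF key
    is a hypothetical (strand, stop_coordinate) tuple; the real pipeline may
    group predictions differently.
    """
    support = defaultdict(set)
    for predictor, orfs in predictions.items():
        for orf in orfs:
            # Record which predictors agree on this ORF.
            support[orf].add(predictor)
    return sorted(
        orf
        for orf, voters in support.items()
        if len(voters) >= min_predictors_to_call_orf
    )


# Invented toy data, not results from any real genome.
predictions = {
    "genemark": {("+", 900), ("-", 2400), ("+", 5100)},
    "other": {("+", 900), ("-", 2400)},
    "blast": {("+", 900), ("+", 5100), ("+", 7300)},
    "prodigal": {("+", 900), ("-", 2400), ("+", 7300), ("-", 8800)},
}

for threshold in range(1, 5):
    called = call_orfs(predictions, threshold)
    print(f"{threshold}/4 predictors: {len(called)} ORFs called")
```

In this toy case the number of called ORFs can only stay the same or shrink as the threshold rises, which is the same direction as the trend in the table Kok reported for evidence values 1 through 4 with prodigal enabled.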