precision while computing likelihoods

2012-06-01
2012-09-22
  • Pranav Jawale

    Pranav Jawale - 2012-06-01

    Hello,

    I'm using the s3 n-best list to compute the LVCSR log-likelihood ratio as
    in this paper - LVCSR log-likelihood ratio scoring for keyword spotting,
    http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.156.669
    (please see eqn. 5).

    Since the computation involves sums and ratios of probabilities, I thought
    it might be better to use a high-precision computation library
    (http://gmplib.org/) to compute 1.0003^logLikelihood.

    Anyway, the kind of probabilities I'm seeing are in the range of 1e-14 per
    word! Is this alright? Is the GMM spread over the feature space so much
    that it gives rise to such low probabilities?

     
  • Pranav Jawale

    Pranav Jawale - 2012-06-01

    Please advise me as to how much precision (how many bytes) I should keep
    while doing probability summing/multiplying operations. If I want to
    compute P(W)*P(O|W), I first do logP(W) + logP(O|W); then I do
    1.0003^(logP(W) + logP(O|W)).

    Thanks.

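    A minimal sketch of the computation described above, assuming log scores
    in base 1.0003 (the names and values are illustrative, not from any
    Sphinx API):

        #include <math.h>
        #include <stdio.h>

        #define LOGBASE 1.0003

        int main(void)
        {
            /* Log-domain scores, e.g. taken from an s3 n-best list. */
            double log_p_w   = -2000.0;   /* language model score */
            double log_p_o_w = -90000.0;  /* acoustic score       */

            /* A product of probabilities is a sum of log scores. */
            double log_product = log_p_w + log_p_o_w;

            /* Convert to linear only at the very end; the result can be
             * vanishingly small for realistic scores, which is why
             * everything should stay in the log domain. */
            printf("log = %f, linear = %g\n",
                   log_product, pow(LOGBASE, log_product));
            return 0;
        }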
     
  • Nickolay V. Shmyrev

    I thought it might be better to use high precision

    This is a bad idea. The logbase is there exactly to save precision.

    Anyway, the kind of probabilities I'm seeing are in the range of 1e-14 per
    word! Is this alright? Is the GMM spread over the feature space so much
    that it gives rise to such low probabilities?

    Yes, the dimension is very high (39). That's why a logbase is used for the
    calculations.

    The probability sum can be calculated without converting to the linear
    scale, i.e. in the log domain. The logmath_add function exists exactly for
    that reason.

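    A minimal sketch of a log-domain sum done by hand (plain C, not the
    sphinxbase implementation), assuming scores in log base 1.0003; it
    mirrors what logmath_add computes:

        #include <math.h>
        #include <stdio.h>

        #define LOGBASE 1.0003

        /* log_b(b^x + b^y) with a single, well-conditioned exponentiation:
         * for x >= y it equals x + log_b(1 + b^(y - x)). */
        static double log_add(double x, double y)
        {
            if (y > x) { double t = x; x = y; y = t; }  /* ensure x >= y */
            return x + log(1.0 + pow(LOGBASE, y - x)) / log(LOGBASE);
        }

        int main(void)
        {
            /* Example used later in this thread: add 1000 and 2000. */
            printf("%f\n", log_add(1000.0, 2000.0));    /* about 3848.19 */
            return 0;
        }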
     
  • Pranav Jawale

    Pranav Jawale - 2012-06-01

    Hello,

    I had thought about that, but I can't directly use the sphinxbase function
    (logmath_add), as it requires an lmath object which I don't know how to
    create in my external C code. Please let me know if an easy solution
    exists.

    logmath.c
    http://cmusphinx.svn.sourceforge.net/viewvc/cmusphinx/trunk/sphinxbase/src/libsphinxbase/util/logmath.c?revision=11275&view=markup

    As I gather, logmath_add takes help of a prespecified log table, and if
    the value (x - y) is not present in the table, then it falls back to
    logmath_add_exact, which uses the pow function from the math library. If
    using the pow function is "exact" (or is it a misnomer here), why is it
    not a good idea to use it by default?

     
  • Pranav Jawale

    Pranav Jawale - 2012-06-01

    Figured out how to get the logmath object:
    logmath_t *lmath = logmath_init(1.0001, 0, 0);

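    For reference, a minimal sketch of how these calls could fit together,
    assuming the sphinxbase header sphinxbase/logmath.h and a logbase of
    1.0003 (check the exact signatures against your sphinxbase version):

        #include <stdio.h>
        #include <sphinxbase/logmath.h>

        int main(void)
        {
            /* Base 1.0003, no shift, no precomputed add table
             * (with no table, logmath_add falls back to logmath_add_exact). */
            logmath_t *lmath = logmath_init(1.0003, 0, 0);

            /* Convert linear probabilities to integer log scores ... */
            int a = logmath_log(lmath, 1e-14);
            int b = logmath_log(lmath, 3e-14);

            /* ... sum them in the log domain ... */
            int sum = logmath_add(lmath, a, b);

            /* ... and go back to linear only for inspection. */
            printf("sum = %d (linear %g)\n", sum, logmath_exp(lmath, sum));

            logmath_free(lmath);
            return 0;
        }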
     
  • Nickolay V. Shmyrev

    and if the value (x - y) is not present in the table, then it falls back to
    logmath_add_exact

    Not like that: it falls back if there is no table; it doesn't fall back if
    there is no value.

    why is it not a good idea to use it by default?

    Because it loses precision by doing the exponentiation two times. The
    right way is to use a formula involving x - y, which is what is used
    during the construction of the table.

    or is it a misnomer here

    The name is not good

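    In other words (a reconstruction of the formula referred to above, writing
    b for the logbase, e.g. b = 1.0003): for x >= y,

        log_b(b^x + b^y) = x + log_b(1 + b^(y - x))

    so only one exponentiation of the non-positive difference d = y - x is
    needed, and the table can simply precompute log_b(1 + b^d) indexed by d.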
     
  • Pranav Jawale

    Pranav Jawale - 2012-06-05

    Thanks for the clarification. I went through logmath_init() to see how the
    log table is created. We save precision by doing only one exponentiation
    (instead of two).

    Another thing is, I think the log table seems to be constrained to give
    only INTEGER answers as per its logic, which may introduce some small
    errors. (Actually the errors may be because of the 1/x function, I'm not
    sure.)

    For example, when I add 1000 and 2000 in the log domain, I get 3848,

    i.e.

    log_1.0003(1.0003^1000 + 1.0003^2000) = 3848 = log_1.0003(1.0003^3848),
    or 1.0003^1000 + 1.0003^2000 = 1.0003^3848.

    This result is off by 1.8E-4 according to my calculator. If one needs more
    accuracy, then perhaps using more precision while creating the log table
    might help.

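    A small check of the arithmetic above with plain double math (not
    sphinxbase); the 1.8E-4 figure is consistent with the difference between
    the two linear-domain values:

        #include <math.h>
        #include <stdio.h>

        int main(void)
        {
            const double b = 1.0003;

            double exact_linear = pow(b, 1000) + pow(b, 2000);  /* ~3.171753 */
            double exact_log    = log(exact_linear) / log(b);   /* ~3848.19  */
            double table_linear = pow(b, 3848);                 /* integer answer */

            printf("exact log sum = %.4f\n", exact_log);
            printf("difference    = %.2e\n",
                   exact_linear - table_linear);                /* ~1.8e-4 */
            return 0;
        }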
     
  • Nickolay V. Shmyrev

    In sphinx4, for example, the log table uses floats.

     
