Theory behind unigram and language weight

  • Pranav Jawale - 2011-06-18

    Hi,

    I was debugging sphinx3_decode and I found that a function

    static void
    lm_uw(lm_t * lm, float64 uw)
    

    is used to change the unigram probabilities of the words. For example,
    if there are 5 equiprobable words (Pr(each word) = 0.2), lm_uw is
    called on them; uw (the unigram weight) is set to 0.7 by default.

    The comment in this function says "/* Interpolate unigram probs with
    uniform PDF, with weight uw */".

    What the function does is: if the unigram probability (as defined in
    the LM) is 0.2, it modifies it to

    0.2*uw + (1-uw)/(#words - 1)
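
    For concreteness, here is a minimal sketch of that step in plain C, in
    linear probability space (interp_unigram is a made-up name, and the
    real lm_uw works on the fixed-point log values stored in the LM, so
    this is only an illustration):

    #include <stdio.h>

    /* Hypothetical sketch: interpolate an LM unigram probability with a
       uniform PDF, giving weight uw to the LM estimate. */
    static double
    interp_unigram(double p, double uw, int n_words)
    {
        /* uniform mass spread over (#words - 1), matching the formula above */
        double uniform = 1.0 / (double) (n_words - 1);
        return uw * p + (1.0 - uw) * uniform;
    }

    int
    main(void)
    {
        /* 5 equiprobable words, default uw = 0.7: 0.7*0.2 + 0.3/4 = 0.215 */
        printf("%f\n", interp_unigram(0.2, 0.7, 5));
        return 0;
    }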

    Why is this interpolation done? Could someone point me to a reference?

    Later on, in lm_set_param(lm, lw, wip), this updated unigram
    probability is modified in the following way:

    log(unigram_prob)*language_weight + log(word_insertion_penalty)

            lm->ug[i].prob.l =
                (int32) ((lm->ug[i].prob.l - lm->wip) * f) + iwip;
    

    lm->wip is zero, so it can be neglected; iwip corresponds to the -wip
    parameter given to the decoder.
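
    To make that concrete, here is a minimal floating-point sketch of the
    same update (apply_lm_params is a made-up name; the actual code works
    on int32 log values, as in the snippet above, and the lw/wip values
    below are only illustrative):

    #include <math.h>
    #include <stdio.h>

    /* Hypothetical sketch: scale the log LM probability by the language
       weight and add the log word insertion penalty. */
    static double
    apply_lm_params(double unigram_prob, double lw, double wip)
    {
        return log(unigram_prob) * lw + log(wip);
    }

    int
    main(void)
    {
        /* interpolated unigram prob from above, with illustrative lw/wip */
        printf("%f\n", apply_lm_params(0.215, 9.5, 0.7));
        return 0;
    }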

    Here, what is the 'theory' behind the language weight?

    Thanks.

  • Nickolay V. Shmyrev

    Hello,

    Smoothing with uniform probability is a common thing in language model
    estimation; see, for example, the description of smoothing here:

    An Empirical Study of Smoothing Techniques for Language Modeling
    Stanley F. Chen and Joshua Goodman
    http://research.microsoft.com/en-us/um/people/joshuago/tr-10-98.pdf

    The unigram weight is just a way to adjust this smoothing at runtime.

  • Vassil Panayotov

    The language model weight is usually used to balance/tune the relative
    influence of the acoustic and language models on the search. My
    understanding is that if you don't use lw, the LM will have a
    negligible impact and the outcome of the search will be decided by the
    acoustic model alone.
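
    A small numeric illustration of that imbalance (all numbers below are
    made up): acoustic log-likelihoods accumulate over hundreds of frames,
    while the LM contributes one log probability per word, so without lw
    the LM term is comparatively negligible:

    #include <stdio.h>

    int
    main(void)
    {
        double acoustic_logprob = -4500.0; /* illustrative: summed over many frames */
        double lm_logprob = -25.0;         /* illustrative: summed over ~10 words */
        double lw = 9.5;                   /* illustrative language weight */

        printf("total without lw: %f\n", acoustic_logprob + lm_logprob);
        printf("total with lw:    %f\n", acoustic_logprob + lw * lm_logprob);
        return 0;
    }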

  • Pranav Jawale - 2011-06-19

    Thanks.

    So "uw" is used to make all the unigram probabilities more uniform,
    since the text from which the LM is created might be biased towards
    some words.

    "lw" seems more like an empirically determined parameter. I also found
    a paper where they try to vary lw dynamically (no better results,
    though):

    Towards a Dynamic Adjustment of the Language Weight (2001)
    by Georg Stemmer, Viktor Zeissler, Elmar Nöth, Heinrich Niemann
    http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.16.856
