Hi,
I was debugging sphinx3_decode and found that the function

static void
lm_uw(lm_t * lm, float64 uw)

is used to change the unigram probabilities of the words. Say there are 5 equiprobable words (Pr(each word) = 0.2). Then lm_uw is called; uw (the unigram weight) is set to 0.7 by default.
The comment in this function says "/* Interpolate unigram probs with uniform PDF, with weight uw */". What the function does is: if the unigram probability (as defined in the LM) is 0.2, it modifies it to

0.2*uw + (1-uw)/(#words - 1)
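To make that concrete, here is a minimal standalone sketch of the update with plain probabilities (this only shows the arithmetic; the actual sphinx3 code works on the lm_t structure's internal representation, and the names here are mine):

#include <stdio.h>

/* Mix a unigram probability with a uniform PDF over the remaining
 * words, weighted by uw. Plain-probability sketch, not log space. */
static double interp_unigram(double p, double uw, int n_words)
{
    return p * uw + (1.0 - uw) / (n_words - 1);
}

int main(void)
{
    /* 5 equiprobable words, default uw = 0.7: 0.2 becomes 0.215 */
    printf("%f\n", interp_unigram(0.2, 0.7, 5));
    return 0;
}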
Why is this interpolation done? Could someone point me to a reference?
Later on, in lm_set_param(lm, lw, wip), this updated unigram probability is modified in the following way:

log(unigram_prob)*language_weight + log(word_insertion_penalty)

lm->wip is zero, so it can be neglected; iwip corresponds to the -wip param given to the decoder.
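In code form, the scaling amounts to something like the following sketch (the function and variable names are mine, and sphinx3 itself does this arithmetic in its own integer log domain rather than with natural logs):

#include <math.h>
#include <stdio.h>

/* Apply language weight and word insertion penalty to a unigram
 * probability, following the formula above (natural log here). */
static double scaled_lm_score(double unigram_prob, double lw, double wip)
{
    return log(unigram_prob) * lw + log(wip);
}

int main(void)
{
    /* illustrative values only, not sphinx3 defaults */
    printf("%f\n", scaled_lm_score(0.215, 9.5, 0.7));
    return 0;
}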
What is the 'theory' behind the language weight here?
Thanks.
Hello
Smoothing with uniform probability is a common technique in language model estimation; see, for example, the description of smoothing in:

An Empirical Study of Smoothing Techniques for Language Modeling
Stanley F. Chen and Joshua Goodman
http://research.microsoft.com/en-us/um/people/joshuago/tr-10-98.pdf

The unigram weight is just a way to adjust this smoothing at runtime.
Language model weight is usually used to balance/tune the relative influence
of the acoustic and language models on the search. My understanding is that if
you don't use lw, the LM will have a negligible impact and the outcome of the
search will be decided by the acoustic model alone.
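One common explanation: acoustic log-likelihoods are accumulated over every frame, so their magnitude is far larger than that of the handful of per-word LM log-probabilities, and lw rescales the LM term so it can compete. A rough illustration with made-up magnitudes:

#include <stdio.h>

int main(void)
{
    /* made-up numbers: frame-level acoustic scores add up to something
     * much larger in magnitude than a few word-level LM log-probs */
    double acoustic = -12000.0;
    double lm       = -40.0;
    double lw       = 9.5;   /* illustrative language weight */

    printf("unweighted total: %f\n", acoustic + lm);      /* LM is ~0.3% of the total */
    printf("weighted total:   %f\n", acoustic + lw * lm); /* LM is now ~3% of the total */
    return 0;
}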
Thanks.
So "uw" is used to make all the unigram probabilities more uniform as the text
from which LM is created might induce bias towards some words.
"lw" seems more like an empirically determined param. I also found a paper
where they try to vary lw (no better results though)
Towards a Dynamic Adjustment of the Language Weight (2001)
by Georg Stemmer , Viktor Zeissler , Elmar Nöth , Heinrich Niemann
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.16.856