When I create the .arpa file, I find three n-gram sections.
Each of these contains words and probabilities:
\1-grams:
p_1 wd_1 bo_wt_1
\2-grams:
p_2 wd_1 wd_2 bo_wt_2
\3-grams:
p_3 wd_1 wd_2 wd_3
What do these probabilities mean (i.e. p_1, p_2, p_3, bo_wt_1, bo_wt_2)?
And what is meant by a log back-off probability?
Any help is appreciated.
The description of these language models is on pages 191-234 of Dan Jurafsky's
book "Speech and Language Processing":
http://www.cs.colorado.edu/~martin/slp.html
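In short: each entry's first column is the base-10 log of the n-gram's conditional probability (p_1 = log10 P(wd_1), p_2 = log10 P(wd_2 | wd_1), p_3 = log10 P(wd_3 | wd_1, wd_2)), and bo_wt_1/bo_wt_2 are log10 back-off weights used when a longer n-gram is missing from the model. A minimal sketch of how a decoder would use these values (the toy model values and the `log_prob` helper are made up for illustration; real models apply Katz-style back-off as shown):

```python
# Sketch: evaluating a trigram ARPA model. Assumption: all values are
# base-10 logs, as in the ARPA format. The toy n-gram tables are invented.

# Each entry maps an n-gram tuple to (log10 prob, log10 back-off weight).
unigrams = {("the",): (-1.0, -0.3), ("cat",): (-2.0, -0.2), ("sat",): (-2.1, -0.25)}
bigrams  = {("the", "cat"): (-0.5, -0.4), ("cat", "sat"): (-0.6, -0.35)}
trigrams = {("the", "cat", "sat"): (-0.3, None)}  # highest order: no back-off weight

def log_prob(w1, w2, w3):
    """Return log10 P(w3 | w1, w2) with back-off."""
    if (w1, w2, w3) in trigrams:
        # Seen trigram: use p_3 directly.
        return trigrams[(w1, w2, w3)][0]
    # Back off: add (multiply in linear space) bo_wt_2 of the (w1, w2) context.
    backoff = bigrams.get((w1, w2), (None, 0.0))[1]
    if (w2, w3) in bigrams:
        return backoff + bigrams[(w2, w3)][0]
    # Back off again through bo_wt_1 of the (w2,) context, down to the unigram.
    backoff += unigrams.get((w2,), (None, 0.0))[1]
    return backoff + unigrams[(w3,)][0]

print(log_prob("the", "cat", "sat"))  # seen trigram: -0.3
print(log_prob("a", "cat", "sat"))    # unseen trigram: backs off to the bigram
```

The back-off weights exist so the distribution still sums to one after smoothing: probability mass removed from seen n-grams is redistributed, via bo_wt, over n-grams estimated from shorter histories.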