
Negative LM-Score in lattice file - Is it possible?

Help
Micheal H
2015-05-09
2015-05-11
  • Micheal H

    Micheal H - 2015-05-09

    Hi,
    I'm using the LibriSpeech acoustic models with my own bigram language model.
    After looking at the lattice file, I noticed some negative numbers in the LM score, e.g.:

    82 88 LESS -3.68641,1481.89,17_1568_1567_1567_1567_1570_1572_1571_1502_1501_1501_1501_1501_1501_1501_1501_1501_1504_1503_1503_1503_1506_1505_1505_1505_2162_2164_2166_2165_2165_188_187_187_187_187_187_187_187_190_189

    Is that possible? Isn't the LM score a negative log-probability?
    If not, what could be the problem?

    Thanks !

     
  • Guoguo Chen

    Guoguo Chen - 2015-05-09

    It only makes sense if you look at the weight of the whole path. A WFST can distribute weights along a path arbitrarily, so the weight of an individual arc may not be meaningful on its own.

    Guoguo

     
  • Micheal H

    Micheal H - 2015-05-11

    Thanks for replying, but can you be more specific:
    in which cases are arc weights set to be negative?

    It seems very strange to me, because a negative weight means a probability larger than 1.

     
  • Guoguo Chen

    Guoguo Chen - 2015-05-11

    I guess what I was trying to say was: don't interpret individual arc weights as negated log probabilities. If you need the posterior of a particular arc, you have to run forward-backward on the lattice.

    Consider the following two WFST examples in log semiring:
    WFST1:
    0 1 1 1 -1
    1 2 2 2 1
    2

    WFST2:
    0 1 1 1 0
    1 2 2 2 0
    2

    They are equivalent in the sense that the total weights of the paths (in this case a single path) are the same. So you can actually create equivalent WFSTs with different weights on each arc, and that's why I said a WFST can distribute weights along the path arbitrarily.
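
As a rough illustration of the two examples above, the sketch below sums the arc weights along the single path of each WFST. It assumes the arc lines are in the usual text format (source state, destination state, input label, output label, weight) and that the final state carries no extra weight. The names `wfst1`, `wfst2`, and `path_weight` are made up for this sketch and are not part of any toolkit.

```python
# Arcs as (src, dst, ilabel, olabel, weight), copied from the two examples.
# Weights are costs (negated log-probabilities), so weights along a path add.
wfst1 = [(0, 1, 1, 1, -1.0), (1, 2, 2, 2, 1.0)]
wfst2 = [(0, 1, 1, 1, 0.0), (1, 2, 2, 2, 0.0)]


def path_weight(arcs):
    """Total cost of a path: the sum of its arc weights."""
    return sum(weight for *_, weight in arcs)


total1 = path_weight(wfst1)
total2 = path_weight(wfst2)
print(total1, total2)  # 0.0 0.0
```

The first arc of WFST1 has a negative cost, yet the only quantity with a probabilistic meaning, the total path cost, is identical in both machines.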

    Guoguo

     
  • Micheal H

    Micheal H - 2015-05-11

    Can you please specify where it is necessary to "break an arc into positive and negative weight" during the speech recognition process?

     
  • Guoguo Chen

    Guoguo Chen - 2015-05-11

    It is not necessary; it is more of a byproduct. The WFST algorithms have to manipulate arc weights, and they don't guarantee that the weight on a single arc is a negated log-probability.
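
To connect this with the forward-backward suggestion earlier in the thread, here is a minimal sketch on a hypothetical four-state toy lattice (not the poster's lattice, and not Kaldi code). It computes arc posteriors with a plain forward-backward pass over costs, then repeats the computation after moving 1.0 of cost from one arc to the preceding arc on the same path, which leaves a negative arc weight behind. The function names, the toy costs, and the requirement that arcs are listed in topological order are all assumptions of this sketch.

```python
import math
from collections import defaultdict


def log_add(a, b):
    """Return -log(exp(-a) + exp(-b)) for two costs a and b."""
    if a == math.inf:
        return b
    if b == math.inf:
        return a
    return min(a, b) - math.log1p(math.exp(-abs(a - b)))


def arc_posteriors(arcs, start, final):
    """Forward-backward over (src, dst, cost) arcs in topological order."""
    alpha = defaultdict(lambda: math.inf)  # cost of reaching each state
    beta = defaultdict(lambda: math.inf)   # cost of finishing from each state
    alpha[start] = 0.0
    for src, dst, cost in arcs:
        alpha[dst] = log_add(alpha[dst], alpha[src] + cost)
    beta[final] = 0.0
    for src, dst, cost in reversed(arcs):
        beta[src] = log_add(beta[src], cost + beta[dst])
    total = alpha[final]  # cost of all paths combined
    return [math.exp(-(alpha[s] + c + beta[d] - total)) for s, d, c in arcs]


# Toy lattice: two competing paths, 0 -> 1 -> 3 and 0 -> 2 -> 3.
original = [(0, 1, 1.0), (0, 2, 2.0), (1, 3, 0.5), (2, 3, 0.5)]
# Same path totals, but 1.0 of cost moved from arc 1->3 onto arc 0->1,
# which leaves a negative weight on arc 1->3.
shifted = [(0, 1, 2.0), (0, 2, 2.0), (1, 3, -0.5), (2, 3, 0.5)]

post_original = arc_posteriors(original, start=0, final=3)
post_shifted = arc_posteriors(shifted, start=0, final=3)
print(post_original)
print(post_shifted)
```

Both calls give the same posteriors, because every path total is unchanged. The negative cost on the arc from state 1 to state 3 in the second lattice does not correspond to a probability larger than 1; it is only one piece of a path weight.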

    Guoguo

     