
Problem with n-best list

2012-01-05
2012-09-22
  • Pranav Jawale

    Pranav Jawale - 2012-01-05

    Hello,

    We've observed two problems with the n-best list that is created in sphinx3.
    Please find this zip (http://home.iitb.ac.in/~pranavj/nbest_problems.zip), which
    contains a trimmed log file, 203silTIFR_decode.log.small.txt, and two n-best
    files, F13MH02A0118I411.nbest and F13MH04A0303I309.nbest.

    1. Problem with **fwdflat** mode: wrong number of frames
      The number of frames shown in the n-best file is not the same as the number
      of frames in decode_log. The decode_log value is correct.
      For example, for file F13MH02A0118I411 the n-best list shows 755 frames,
      while decode_log shows 188 frames:

      FWDXCT: F13MH02A0118I411 S -1953087 T -4148068 A -4119218 L -28850 0 221969 808 <s> 10 -1480076 -20054 saraasarii(2) 76 -1782491 -476 baqgaalii 148 -405385 -7676 +horn+ 155 -405977 -1224 canxe 178 -267258 -228 </s> 188

    When fwdtree mode is used this problem is not observed.
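
For anyone who wants to check the frame count mechanically, here is a minimal Python sketch that splits the FWDXCT line for F13MH02A0118I411 (as logged, including its `<s>` and `</s>` markers) into fields. The field layout is an assumption inferred from the sample lines in this post, not taken from the sphinx3 sources: after `T`, `A` and `L`, each word is read as (start frame, acoustic score, LM score, word), and the last number is read as the final frame. The names `Hypothesis` and `parse_hypothesis` are made up for this illustration.

```python
from typing import List, NamedTuple, Tuple


class Hypothesis(NamedTuple):
    total: int      # value after "T"
    acoustic: int   # value after "A"
    lm: int         # value after "L"
    segments: List[Tuple[int, int, int, str]]  # (start_frame, ascr, lscr, word)
    end_frame: int  # last field on the line


def parse_hypothesis(line: str) -> Hypothesis:
    """Parse a decode_log or n-best hypothesis line (assumed layout, see lead-in)."""
    tokens = line.split()
    t = tokens.index("T")  # skips "FWDXCT: <uttid> S <score>" when present
    if tokens[t + 2] != "A" or tokens[t + 4] != "L":
        raise ValueError("unexpected score layout")
    total, acoustic, lm = (int(tokens[t + i]) for i in (1, 3, 5))
    *seg_tokens, end_frame = tokens[t + 6:]
    if len(seg_tokens) % 4 != 0:
        raise ValueError("segment fields are not a multiple of four")
    segments = [
        (int(seg_tokens[i]), int(seg_tokens[i + 1]), int(seg_tokens[i + 2]), seg_tokens[i + 3])
        for i in range(0, len(seg_tokens), 4)
    ]
    return Hypothesis(total, acoustic, lm, segments, int(end_frame))


LOG_LINE = (
    "FWDXCT: F13MH02A0118I411 S -1953087 T -4148068 A -4119218 L -28850 "
    "0 221969 808 <s> 10 -1480076 -20054 saraasarii(2) 76 -1782491 -476 baqgaalii "
    "148 -405385 -7676 +horn+ 155 -405977 -1224 canxe 178 -267258 -228 </s> 188"
)

hyp = parse_hypothesis(LOG_LINE)
print(hyp.end_frame)                                     # 188
print(sum(s[1] for s in hyp.segments) == hyp.acoustic)   # True
print(sum(s[2] for s in hyp.segments) == hyp.lm)         # True
```

On this line the per-word scores add up to the A and L values, which suggests the assumed layout is at least consistent with the log. Running the same function over each line of the .nbest file would show which lines end in 755 instead of 188.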

    2. The hypothesis with the "best" score in the n-best list is not the one that is shown in decode_log.

    For the file F13MH04A0303I309

    decode_log shows

    FWDXCT: F13MH04A0303I309 S -1897607 T -5219618 A -5087530 L -132088 0 355327 808 <s> 12 -570300 -17528 eka 29 -654016 -13411 tiina 50 -1219260 -7676 +babble+ 90 210517 -7676 +pau+ 115 -190847 -7676 <sil> 139 -161975 -7676 +bn+ 171 -438656 -17716 te 188 487541 -7676 +pau+ 216 -1572783 -22945 tiinashe 254 -1207754 -17016 tiinashe 297 -125324 -5900 </s> 301
    

    The top line in the n-best list matches decode_log, but the line with the
    best T score is the third one in the n-best list:

    T -4432187 A -3423487 L -98098 0 -58323 0 <s> 12 -558565 -34342 eighty 50 -674715 -7676 +babble+ 90 -143318 -7676 +pau+ 115 -274708 -7676 <sil> 139 -432205 -7676 +bn+ 171 -338568 -7676 +babble+ 188 -146472 -7676 +pau+ 216 -352156 -5900 tiinashe 254 -444457 -5900 tiinashe 297 0 -5900 </s> 5246
    

    For file F13MH02A0118I411

    decode_log shows

    FWDXCT: F13MH02A0118I411 S -1953087 T -4148068 A -4119218 L -28850 0 221969 808 <s> 10 -1480076 -20054 saraasarii(2) 76 -1782491 -476 baqgaalii 148 -405385 -7676 +horn+ 155 -405977 -1224 canxe 178 -267258 -228 </s> 188
    

    The above matches the second line in the n-best list (which is also the line
    with the best T score).
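
The comparison for F13MH04A0303I309 can be reproduced with a small Python sketch that reads only the `T`, `A` and `L` fields from the two lines quoted above and ranks them by T. It also prints A + L next to T, as a sanity check on whether the two T values are on the same footing. The helper `read_scores` and the dictionary labels are hypothetical names used only here.

```python
import re

# Matches the "T <total> A <acoustic> L <lm>" score fields of a hypothesis line.
SCORE_FIELDS = re.compile(r"\bT (-?\d+) A (-?\d+) L (-?\d+)")


def read_scores(line):
    """Return (T, A, L) as integers from a decode_log or n-best line."""
    match = SCORE_FIELDS.search(line)
    if match is None:
        raise ValueError("no T/A/L score fields found")
    return tuple(int(group) for group in match.groups())


# Score fields copied from the two F13MH04A0303I309 lines quoted above
# (word segments omitted, since only the scores are compared here).
hypotheses = {
    "decode_log": "FWDXCT: F13MH04A0303I309 S -1897607 T -5219618 A -5087530 L -132088",
    "n-best line 3": "T -4432187 A -3423487 L -98098",
}

for name, line in hypotheses.items():
    total, acoustic, lm = read_scores(line)
    print(f"{name}: T={total} A+L={acoustic + lm} equal={total == acoustic + lm}")

best = max(hypotheses, key=lambda name: read_scores(hypotheses[name])[0])
print("highest T:", best)
```

With the posted numbers, T equals A + L on the decode_log line (-5219618) but not on the n-best line (A + L is -3521585, while T is -4432187). So the two T values may not be directly comparable; the reason for that difference is not established by these files alone.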

    Could you please explain this inconsistency in the output of the n-best list?

    Thanks.

     
  • Pranav Jawale

    Pranav Jawale - 2012-01-09

    Could you please confirm whether the 2nd problem is actually a problem
    (best T score hypothesis in nbest list != decoder hyp) or not? That is, is it
    theoretically possible for them to differ (with the same parameters)?

     
  • Nickolay V. Shmyrev

    I don't think it's a problem. Insertion penalties and fillers are handled
    differently in the decoder and in the DAG search that generates the n-best
    list afterwards. The details depend on the type of search you are using,
    though: is it TST search or something else?

    It could be debugged.

     
