How can I get 10-best WER result? (from local/score.sh)

Ken Kim
2015-01-14
2015-01-15
  • Ken Kim

    Ken Kim - 2015-01-14

    Hello all Kaldi users!

    First, thanks for providing an awesome speech recognition toolkit and this forum.

    I am currently running my own 'isolated word dataset' recognition system in Kaldi.

    I have successfully trained this system with a DNN and get a 1-best decoding result.

    Here is my script for scoring from local/score.sh:

    $cmd LMWT=$min_lmwt:$max_lmwt $dir/scoring/log/best_path.LMWT.log \
      lattice-scale --inv-acoustic-scale=LMWT "ark:gunzip -c $dir/lat.*.gz|" ark:- \| \
      lattice-add-penalty --word-ins-penalty=$word_ins_penalty ark:- ark:- \| \
      lattice-best-path --word-symbol-table=$symtab \
        ark:- ark,t:$dir/scoring/LMWT.tra || exit 1;

    I modified this script to get a 10-best result by adding a 'lattice-to-nbest' line, as below:

    $cmd LMWT=$min_lmwt:$max_lmwt $dir/scoring/log/best_path.LMWT.log \
      lattice-scale --inv-acoustic-scale=LMWT "ark:gunzip -c $dir/lat.*.gz|" ark:- \| \
      lattice-add-penalty --word-ins-penalty=$word_ins_penalty ark:- ark:- \| \
      lattice-to-nbest --acoustic-scale=0.1 --n=10 ark:- ark:- \| \
      lattice-best-path --word-symbol-table=$symtab \
        ark:- ark,t:$dir/scoring/LMWT.tra || exit 1;

    But I got a NaN WER from this script.

    Can anyone give me a clue on how to get a 10-best result?

    Any comments will be appreciated.

    Thank you :)

    Best regards,

    Ken Kim

     
    • Jan "yenda" Trmal

      If you inspect the $dir/scoring/*.tra files and compare them with the previous
      files, you will notice that the utterance IDs are different for
      "lattice-to-nbest | lattice-best-path", for example UTT-A-1 instead
      of UTT-A.
      The program compute-wer then cannot find the reference (as it is looking
      for UTT-A-1 instead of UTT-A).
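
One way to work around the ID mismatch described above is to pick a single rank out of the n-best output and strip the rank suffix, so the keys match the reference again. The following is a minimal sketch, assuming the keys end in "-<rank>" as in the UTT-A-1 example; the function name select_rank is made up for illustration and is not part of Kaldi.

```shell
# select_rank RANK
# Reads .tra-style lines ("UTT-A-1 w1 w2 ...") on stdin, keeps only the lines
# whose key ends in "-RANK", and prints them with the suffix removed.
# Assumption: the rank suffix has the form "-<digits>" at the end of the key.
select_rank() {
  awk -v rank="$1" '{
    id = $1
    # RSTART points at the last "-<digits>" group because of the "$" anchor.
    if (match(id, /-[0-9]+$/)) {
      n = substr(id, RSTART + 1)
      if (n + 0 == rank + 0) {
        $1 = substr(id, 1, RSTART - 1)   # restore the original utterance ID
        print
      }
    }
  }'
}
```

For example, `select_rank 1 < nbest.tra` would give a file keyed by the original utterance IDs that contains only the top-ranked hypotheses (nbest.tra is a placeholder name).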

      BTW: I find the concept of n-best WER strange; I haven't heard of it before.
      Even if something like that existed, "lattice-to-nbest |
      lattice-best-path" is almost certainly not doing anything better than just
      lattice-best-path: lattice-to-nbest selects the n best paths through the
      lattice and lattice-best-path chooses the best one from these n paths, i.e.
      you will end up with the 1-best path again (and I think it will be exactly the
      same as the path obtained from lattice-best-path).

      y.

       
      • Daniel Povey

        Daniel Povey - 2015-01-14

        I think there might be a way to compute the n-best oracle WER using the
        existing command-line tools.
        The program lattice-oracle computes the lattice oracle WER, and if the
        input lattices consist of the n-best alternatives it will give you what you
        want. You can achieve this by piping your lattices through
        lattice-to-nbest | nbest-to-lattice.

        Dan
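
Run directly from a shell (rather than through $cmd, so the pipes are not escaped), the suggestion above might look roughly like the sketch below. The function name nbest_oracle, its argument order, the reference text path, and the output file name are assumptions for illustration; it also assumes the reference contains no out-of-vocabulary words and that $dir/scoring already exists.

```shell
# nbest_oracle DIR SYMTAB REF_TEXT [N]
# Sketch only: restrict each lattice to its N best paths, merge them back into
# one lattice per utterance, and let lattice-oracle find the closest path to
# the reference. All argument names are hypothetical.
nbest_oracle() {
  local dir="$1"        # decode directory containing lat.*.gz
  local symtab="$2"     # word symbol table, e.g. the $symtab used in score.sh
  local ref_text="$3"   # reference transcripts: "UTT-A word1 word2 ..."
  local n="${4:-10}"    # size of the n-best list

  lattice-to-nbest --acoustic-scale=0.1 --n="$n" \
      "ark:gunzip -c $dir/lat.*.gz|" ark:- |
    nbest-to-lattice ark:- ark:- |
    lattice-oracle --word-symbol-table="$symtab" ark:- \
      "ark:utils/sym2int.pl -f 2- $symtab < $ref_text |" \
      "ark,t:$dir/scoring/oracle_${n}best.tra"
}
```

Because nbest-to-lattice merges the n paths back under the original utterance ID, the keys should line up with the reference again; the oracle error summary is expected in the log output of lattice-oracle.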

         
        • Ken Kim

          Ken Kim - 2015-01-15

          Thank you!

          I will try piping the lattices through lattice-to-nbest | nbest-to-lattice too.

           
      • Ken Kim

        Ken Kim - 2015-01-15

        Thank you! I found the n-best decoding results in the *.tra files.

        I can manually analyze the results from these files.
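
For analyzing the .tra files without Kaldi tools, an n-best oracle error rate can be approximated with plain awk: for each utterance, take the hypothesis with the fewest word errors among its n entries. This is a rough sketch, not a replacement for lattice-oracle or compute-wer; the function name nbest_oracle_wer is hypothetical, and it assumes the reference file uses the same symbols (words or integer IDs) as the hypotheses and that hypothesis keys end in "-<rank>".

```shell
# nbest_oracle_wer REF NBEST
# REF:   lines "UTT-A w1 w2 ..."   (one reference per utterance)
# NBEST: lines "UTT-A-1 w1 w2 ..." (n hypotheses per utterance, rank suffix)
# Prints "<errors> <reference_words> <wer_percent>".
# Utterances with no hypothesis at all are skipped, which differs from a full scorer.
nbest_oracle_wer() {
  awk '
    # Word-level Levenshtein distance between r[1..nr] and h[1..nh].
    function edit_distance(r, nr, h, nh,    i, j, d, best) {
      for (i = 0; i <= nr; i++) d[i, 0] = i
      for (j = 0; j <= nh; j++) d[0, j] = j
      for (i = 1; i <= nr; i++) {
        for (j = 1; j <= nh; j++) {
          # Compare as strings so that tokens like "07" and "7" stay distinct.
          best = d[i-1, j-1] + (((r[i] "") == (h[j] "")) ? 0 : 1)
          if (d[i-1, j] + 1 < best) best = d[i-1, j] + 1   # deletion
          if (d[i, j-1] + 1 < best) best = d[i, j-1] + 1   # insertion
          d[i, j] = best
        }
      }
      return d[nr, nh]
    }
    # First file: store each reference transcript, keyed by utterance ID.
    FNR == NR {
      id = $1; sub(/^[^ \t]+[ \t]*/, ""); ref[id] = $0; next
    }
    # Second file: strip the rank suffix and keep the lowest error count.
    match($1, /-[0-9]+$/) {
      id = substr($1, 1, RSTART - 1)
      if (!(id in ref)) next
      sub(/^[^ \t]+[ \t]*/, "")
      nh = split($0, h, " ")
      nr = split(ref[id], r, " ")
      e = edit_distance(r, nr, h, nh)
      if (!(id in best_err) || e < best_err[id]) best_err[id] = e
    }
    END {
      for (id in best_err) {
        errs += best_err[id]
        words += split(ref[id], r, " ")
      }
      if (words > 0) printf "%d %d %.2f\n", errs, words, 100 * errs / words
    }
  ' "$1" "$2"
}
```

A call such as `nbest_oracle_wer ref.tra nbest.tra` (both file names are placeholders) prints the total oracle errors, the number of reference words scored, and the resulting percentage.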