List of words that were misrecognized.

y91
2014-10-29
2014-10-30
  • y91

    y91 - 2014-10-29

    Hi,

    I am working with the RM dataset and online2 decoder.

    I want the list of words that were misrecognized. The current script uses compute-wer to report the number of INS, DEL, and SUB errors, but it does not list which words were recognized incorrectly. Does Kaldi provide any utilities for such a task?

    I know that there are tools like SCTK which give a detailed analysis of the errors. I have seen that the scripts for other datasets use SCTK, but the RM scripts do not. Is that because SCTK requires references in the STM format?

    What are my options here?

    Thanks,
    Yash

     
    • Daniel Povey

      Daniel Povey - 2014-10-29

      Yes, SCTK requires STM-format references and it gets complex quickly.
      But look at the usage message of compute-wer; you can run it in a mode
      where it will output more detailed information, and if you pipe that into
      sort | uniq -c, I think you get something quite usable.
      Dan
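
The tallying step suggested above (`sort | uniq -c`) can also be sketched in Python. This is only an illustration: the function name `tally_errors` is hypothetical, and the sample lines are made up. They do not reproduce the actual detailed output of compute-wer, whose format should be checked against its usage message.

```python
from collections import Counter


def tally_errors(lines):
    """Count identical non-empty lines, most frequent first.

    Roughly what `sort | uniq -c | sort -rn` does in the shell.
    """
    counts = Counter(line.strip() for line in lines if line.strip())
    return counts.most_common()


# Made-up error lines in an assumed "<type> <words...>" layout;
# NOT the real output format of compute-wer.
sample = ["sub SHIPS SHIP", "del THE", "sub SHIPS SHIP", "ins A"]

for entry, count in tally_errors(sample):
    print(count, entry)
```

In practice the lines would be read from the saved detailed output (for example via `open(path)`), one error description per line.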


       
  • y91

    y91 - 2014-10-30

    Hi Dan,

    Thanks for your prompt response yesterday. I now have the list of words that were recognized incorrectly. Is there a way to get more statistics about the errors out of compute-wer, such as the files in which these errors occur?

    For my purpose, I want to associate each of these errors with a test file.

    Thanks,
    Yash

     
    • Daniel Povey

      Daniel Povey - 2014-10-30

      That isn't possible, I'm afraid. We should probably implement a more
      verbose mode at some point.
      Dan
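
Since compute-wer does not report errors per utterance here, one workaround is to align the reference and hypothesis transcripts outside Kaldi. The following is a minimal sketch, not a Kaldi utility: it assumes both transcripts use the usual `<utt-id> <word> <word> ...` text layout, all function names are hypothetical, the sample sentences are made up, and where several alignments have the same cost its choice may differ from the one compute-wer makes.

```python
def align(ref, hyp):
    """Levenshtein-align two word lists; return one (op, ref_word, hyp_word) per error."""
    n, m = len(ref), len(hyp)
    # cost[i][j] = edit distance between ref[:i] and hyp[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        cost[i][0] = i
    for j in range(m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Trace back from the bottom-right corner, recording only the errors.
    errors = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and cost[i][j] == cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])):
            if ref[i - 1] != hyp[j - 1]:
                errors.append(("SUB", ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            errors.append(("DEL", ref[i - 1], None))
            i -= 1
        else:
            errors.append(("INS", None, hyp[j - 1]))
            j -= 1
    errors.reverse()
    return errors


def read_text(lines):
    """Parse transcript lines of the form '<utt-id> <word1> <word2> ...'."""
    table = {}
    for line in lines:
        parts = line.split()
        if parts:
            table[parts[0]] = parts[1:]
    return table


def errors_by_utterance(ref_lines, hyp_lines):
    """Map each utterance id to its list of errors; error-free utterances are omitted."""
    ref, hyp = read_text(ref_lines), read_text(hyp_lines)
    result = {}
    for utt, ref_words in ref.items():
        if utt not in hyp:
            continue  # utterances without a hypothesis are skipped in this sketch
        errs = align(ref_words, hyp[utt])
        if errs:
            result[utt] = errs
    return result


# Made-up transcripts for illustration only.
ref_lines = ["utt1 show the ships", "utt2 list alerts", "utt3 clear display"]
hyp_lines = ["utt1 show ships", "utt2 list alert", "utt3 clear display"]

for utt, errs in errors_by_utterance(ref_lines, hyp_lines).items():
    print(utt, errs)
```

Because each error is keyed by utterance id, it can be traced back to the corresponding test file through whatever id-to-file mapping the data directory provides.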


       