Kaldi / Discussion / Help: List of words that were misrecognized.

y91 - 2014-10-29

Hi,

I am working with the RM dataset and online2 decoder.

I want the list of words that were misrecognized. The current script uses compute-wer to give the number of INS, DEL, and SUB errors but does not list which words were recognized incorrectly. Does Kaldi provide any utilities for such a task?

I know that there are tools like SCTK which give a detailed analysis of the errors. I have seen that the scripts for some other datasets use SCTK, but the RM scripts do not. Is it because SCTK requires references in the STM format?

What are my options here?

Thanks,
Yash

- Daniel Povey - 2014-10-29
  
  Yes, SCTK requires STM-format references and it gets complex quickly.
  But look at the usage message of compute-wer; you can run it in a mode
  where it will output more detailed information, and if you pipe that into
  sort | uniq -c, I think you get something quite usable.
  Dan
  

y91 - 2014-10-30

Hi Dan,

Thanks for your prompt response yesterday. I now have the list of words that were recognized incorrectly. I was wondering if there is a way to output more stats for errors using compute-wer, like the files in which these errors occur?

For my purpose, I want to associate each of these errors with a test file.

Thanks,
Yash

- Daniel Povey - 2014-10-30
  
  That isn't possible, I'm afraid. We should probably implement a more
  verbose mode at some point.
  Dan
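
Since compute-wer does not report errors per file, one possible workaround is to align each reference and hypothesis pair outside Kaldi. The following is only a minimal sketch, assuming both transcripts are available as Kaldi-style text lines (utterance id first, then the words). The function names (read_transcripts, align, errors_by_utterance) and the toy utterances are hypothetical and do not come from Kaldi or the RM recipe, and a plain Levenshtein alignment may break ties differently from compute-wer.

```python
from collections import Counter


def read_transcripts(lines):
    """Parse Kaldi-style text lines: '<utt-id> word1 word2 ...'."""
    table = {}
    for line in lines:
        parts = line.split()
        if parts:
            table[parts[0]] = parts[1:]
    return table


def align(ref, hyp):
    """Levenshtein-align two word lists; return (op, ref_word, hyp_word) tuples.

    op is 'cor', 'sub', 'del' or 'ins'; a missing word is shown as '<eps>'.
    """
    rows, cols = len(ref) + 1, len(hyp) + 1
    cost = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        cost[i][0] = i
    for j in range(cols):
        cost[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            diag = cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)
    # Trace back from the bottom-right corner to recover the operations.
    ops, i, j = [], len(ref), len(hyp)
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and cost[i][j] == cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])):
            op = "cor" if ref[i - 1] == hyp[j - 1] else "sub"
            ops.append((op, ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            ops.append(("del", ref[i - 1], "<eps>"))
            i -= 1
        else:
            ops.append(("ins", "<eps>", hyp[j - 1]))
            j -= 1
    return ops[::-1]


def errors_by_utterance(ref_table, hyp_table):
    """Map each utterance id found in both tables to its non-correct operations."""
    result = {}
    for utt in sorted(ref_table):
        if utt in hyp_table:
            ops = align(ref_table[utt], hyp_table[utt])
            result[utt] = [op for op in ops if op[0] != "cor"]
    return result


# Toy data, made up for illustration (not real RM transcripts).
ref = read_transcripts(["utt1 show all alerts", "utt2 list the ships"])
hyp = read_transcripts(["utt1 show alerts", "utt2 list a ships now"])
per_utt = errors_by_utterance(ref, hyp)
for utt, errs in per_utt.items():
    for op, ref_word, hyp_word in errs:
        print(utt, op, ref_word, hyp_word)

# Corpus-wide tally, similar in spirit to piping error lines through sort | uniq -c.
totals = Counter(err for errs in per_utt.values() for err in errs)
```

Keying the result by utterance id is what ties each error back to a test file; the same tuples can still be counted across the whole test set, as the totals line does.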
  
