Hi,
I have a test database composed of files that are named XX_YY_ZZZZ.mfc, where
XX is the speaker id, YY is the recording id (phrases type 1, phrases type 2,
etc) and ZZZZ is the file id.
As explained in the tutorial I've created a list of all the test files
(files_test.fileids) and a list of all the transcriptions
(files_test.transcription) and I'm using these lists to do the decoding. The
result is a single sentence error rate and a single WER for the whole test set.
Is there any way of getting the WER per speaker or the WER per recording type
instead of the overall WER? At this point I'm creating separate lists of
fileids and transcriptions for every test that I need to do, but that's really
time-consuming when you have more than 10 speakers in the testing database...
I'm thinking of going into the internals of ./scripts_pl/decode/slave.pl and
modifying the part that computes the WER (the decoding can be done using the
full lists, but the word alignment and WER calculation have to be split into
several parts). Do I need to do that, or is there a simpler way of computing
the per-speaker WER?
Horia
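For what it's worth, the split described above (decode once with the full lists, then score per group) can also be done outside slave.pl. Below is a minimal Python sketch, not part of the Sphinx scripts. It assumes that both the reference transcription and the decoder's hypothesis file have one utterance per line ending in "(XX_YY_ZZZZ)", possibly with a score after the id in the hypothesis file. The function names (parse_line, edit_distance, wer_by_group) are made up for this example, and the WER here is a plain word-level Levenshtein distance divided by the number of reference words, which may differ slightly from what word_align.pl reports.

```python
import re
from collections import defaultdict

# Assumed line format: "some words (XX_YY_ZZZZ)" or "some words (XX_YY_ZZZZ -1234)".
LINE_RE = re.compile(r"^(.*)\(([^()\s]+)[^()]*\)\s*$")
IGNORED = {"<s>", "</s>", "<sil>"}  # markers that should not count as words


def parse_line(line):
    """Split one transcription/hypothesis line into (utterance id, word list)."""
    m = LINE_RE.match(line.strip())
    if m is None:
        raise ValueError("no trailing (id) in line: %r" % line)
    words = [w for w in m.group(1).split() if w not in IGNORED]
    utt_id = m.group(2).rsplit("/", 1)[-1]  # drop any directory prefix
    return utt_id, words


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two word lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (r != h)))   # match or substitution
        prev = cur
    return prev[-1]


def wer_by_group(ref_lines, hyp_lines, field=0):
    """Return {group: (errors, ref_words, wer)}.

    field=0 groups by XX (speaker), field=1 by YY (recording type).
    A missing hypothesis counts as all reference words deleted.
    """
    hyps = dict(parse_line(l) for l in hyp_lines if l.strip())
    errors = defaultdict(int)
    totals = defaultdict(int)
    for line in ref_lines:
        if not line.strip():
            continue
        utt_id, ref = parse_line(line)
        group = utt_id.split("_")[field]
        errors[group] += edit_distance(ref, hyps.get(utt_id, []))
        totals[group] += len(ref)
    return {g: (errors[g], totals[g],
                errors[g] / totals[g] if totals[g] else 0.0)
            for g in totals}
```

To use it, read files_test.transcription and the decoder's hypothesis file (its path depends on your setup) into lists of lines and call wer_by_group(ref_lines, hyp_lines, field=0) for per-speaker numbers or field=1 for per-recording-type numbers.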
You can install the NIST sclite tool and configure it in sphinx_decode.cfg
instead of the built-in word_align.pl. sclite gives much more comprehensive
statistics.
Thanks!
It was really simple to integrate sclite into the Sphinx system :). Now I only
have to learn how to use sclite to compute the WER on groups of files. I also
have to modify etc/decode/slave.pl to print out the various statistics.
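Until the grouping is done inside the scoring tool itself, one way to avoid building the per-speaker lists by hand is to split the full files automatically after decoding. The Python sketch below does that. It assumes every line ends in "(XX_YY_ZZZZ)" as in files_test.transcription; a bare fileids list without parentheses would need a different pattern. The names group_lines and write_groups and the output naming scheme are invented for this example.

```python
import os
import re
from collections import defaultdict

# Assumed: each line ends with "(XX_YY_ZZZZ)", optionally with a score after the id.
ID_RE = re.compile(r"\(([^()\s]+)[^()]*\)\s*$")


def group_lines(lines, field=0):
    """Group lines by one field of the utterance id.

    field=0 groups by XX (speaker), field=1 by YY (recording type).
    Lines without a trailing (id) are skipped.
    """
    groups = defaultdict(list)
    for line in lines:
        m = ID_RE.search(line)
        if m is None:
            continue
        name = m.group(1).rsplit("/", 1)[-1]  # drop any directory prefix
        groups[name.split("_")[field]].append(line.rstrip("\n"))
    return dict(groups)


def write_groups(in_path, out_dir, field=0):
    """Write one file per group into out_dir and return {group: path}."""
    with open(in_path, encoding="utf-8") as f:
        groups = group_lines(f, field)
    os.makedirs(out_dir, exist_ok=True)
    base = os.path.basename(in_path)
    paths = {}
    for key, lines in sorted(groups.items()):
        out_path = os.path.join(out_dir, "%s.%s" % (key, base))
        with open(out_path, "w", encoding="utf-8") as out:
            out.write("\n".join(lines) + "\n")
        paths[key] = out_path
    return paths
```

Running write_groups on both the reference transcription and the hypothesis file with the same field gives matching pairs of files, and each pair can then be scored separately with whichever aligner is configured.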