Pocketsphinx - TIDIGITS accuracy data

Forum: Help

Creator: creative64

Created: 2012-08-29

Updated: 2012-09-22

creative64 - 2012-08-29

Hi Nickolay,

Have any accuracy (WER) studies been done for pocketsphinx on TIDIGITS or any other similar
reference database? Are those numbers publicly available?

Where can we get the TIDIGITS database from ?

Thanks and regards,

Nickolay V. Shmyrev - 2012-08-29

Have any accuracy (WER) studies been done for pocketsphinx on TIDIGITS or any
other similar reference database? Are those numbers publicly available?

TOTAL Words: 28564 Correct: 28354 Errors: 259
TOTAL Percent correct = 99.26% Error = 0.91% Accuracy = 99.09%
TOTAL Insertions: 49 Deletions: 60 Substitutions: 150
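
For anyone checking how these totals relate to each other, here is a minimal sketch. It assumes the usual scoring definitions (errors = insertions + deletions + substitutions, correct = words - deletions - substitutions, with percentages taken over the reference word count). The function name `score_summary` is made up for illustration and is not part of Pocketsphinx or its scoring scripts.

```python
def score_summary(words, insertions, deletions, substitutions):
    """Derive the summary figures from raw alignment counts.

    Assumed definitions (not taken from any specific scoring tool):
      errors  = insertions + deletions + substitutions
      correct = words - deletions - substitutions
    All percentages are relative to the number of reference words.
    """
    errors = insertions + deletions + substitutions
    correct = words - deletions - substitutions
    return {
        "correct": correct,
        "errors": errors,
        "percent_correct": 100.0 * correct / words,
        "error": 100.0 * errors / words,          # word error rate (WER)
        "accuracy": 100.0 * (words - errors) / words,
    }


# Counts copied from the totals quoted in this post.
summary = score_summary(words=28564, insertions=49, deletions=60, substitutions=150)

print("Correct: %d Errors: %d" % (summary["correct"], summary["errors"]))
print("Percent correct = %.2f%% Error = %.2f%% Accuracy = %.2f%%"
      % (summary["percent_correct"], summary["error"], summary["accuracy"]))
```

Note that "Percent correct" ignores insertions while "Accuracy" penalises them, which is why the two figures differ.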

Where can we get the TIDIGITS database from ?

creative64 - 2012-08-29

Hi Nickolay,

1) Thanks for your quick response. For the information you supplied, could you
please provide some additional info:

a) Are the numbers based on the TIDIGITS database?
b) Are they on "clean" speech or on "clean and noisy" inputs?
c) Vocabulary size?
d) Grammar or LM?
e) Acoustic model?

2) The Springer Handbook of Speech Processing mentions studies conducted by
ETSI STQ on the Aurora-2 task. They report WER for an acoustic model trained on
clean data and for an acoustic model trained on noisy data, each tested on audio
with various noise ratios and with various settings such as with/without CMN.

Could it be said that Pocketsphinx data would be in the same range or better ?

Thanks and regards,


Nickolay V. Shmyrev - 2012-08-29

a) Are the numbers based on the TIDIGITS database?

Yes.

b) Are they on "clean" speech or on "clean and noisy" inputs?

The TIDIGITS database contains clean recordings.

c) Vocabulary size?

11 digits.

d) Grammar or LM?

LM.

e) Acoustic model?

TIDIGITS acoustic model.

2) The Springer Handbook of Speech Processing mentions studies conducted by
ETSI STQ on the Aurora-2 task. They report WER for an acoustic model trained on
clean data and for an acoustic model trained on noisy data, each tested on audio
with various noise ratios and with various settings such as with/without CMN.
Could it be said that Pocketsphinx data would be in the same range or better?

Aurora-2 is a different database, so the results will differ from the TIDIGITS
numbers above. On Aurora-2 itself, Pocketsphinx results should be in the same
range as the published ones.
