For the Sphinx 3 aligner output, I was wondering why the acoustic scores for some phones are sometimes positive numbers. If these are log-likelihood probabilities in log base 1.0001, shouldn't they all be negative? Is there scaling happening? If so, is there a way I can obtain the real scores?
SFrm EFrm  SegAScr Phone
   0    2   -54898 SIL
   3    5  -219021 SIL
   6   12  -307350 M SIL IY b
  13   32   131837 IY M SIL e
  33   44   345816 SIL
  45   68   176492 SIL
  69  117   126858 SIL
Total score: 199734
Acoustic scores are densities, not probabilities. They are not necessarily less
than 1.
Sphinx3 aligner output is unscaled.
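A minimal sketch of the conversion, assuming the -logbase of 1.0001 mentioned above: it turns a SegAScr value back into a natural-log density and shows that the corresponding density can legitimately be greater than 1.

    import math

    # The aligner's log base (1.0001 in this thread); sphinx3 stores scores
    # as integer logarithms in this base.
    LOGBASE = 1.0001

    def segascr_to_natural_log(seg_ascr):
        """Convert an integer SegAScr value to a natural-log density."""
        return seg_ascr * math.log(LOGBASE)

    # Example row from the output above: phone IY scored 131837.
    ln_density = segascr_to_natural_log(131837)
    print(ln_density)            # roughly 13.18 in natural log
    print(math.exp(ln_density))  # much greater than 1: a density, not a probability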
Thanks! Is there a way to obtain likelihood probabilities for the phones in a
word using the aligner?
I am trying to see if I can rate the phonetic breakup of the pronunciation of
a word using the aligner. For example, if the user says PEAK (P IY K), I
would want to determine the quality of the individual (context-dependent)
phones and then give feedback on pronunciation.
No, the aligner doesn't print that. The aligner is for alignment, not for phone
evaluation.
For pronunciation evaluation, please see the FAQ:
http://cmusphinx.sourceforge.net/wiki/faq#qhow_to_implement_pronunciation_evaluation
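Roughly, the usual goodness-of-pronunciation idea behind such evaluation is to compare the forced-alignment acoustic score of each phone against the best unconstrained (phone-loop) score over the same frames. The sketch below only illustrates that idea, not the exact recipe from the FAQ, and both score inputs are hypothetical numbers.

    import math

    def phone_gop(aligned_score, phoneloop_score, n_frames, logbase=1.0001):
        """Goodness-of-pronunciation style value for one phone: the per-frame
        gap between the forced-alignment score and the best unconstrained
        phone-loop score, both given as integer logs in `logbase`."""
        gap = (aligned_score - phoneloop_score) * math.log(logbase)
        return gap / max(n_frames, 1)

    # Hypothetical numbers: a phone whose forced-alignment score is close to
    # the free phone-loop score (value near 0) is probably well pronounced.
    print(phone_gop(aligned_score=131837, phoneloop_score=140000, n_frames=20))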
I see. Thanks again! Is there documentation on exactly what the aligner scores
represent then? I could find resources for the decoder, but it is not clear
what the aligner output means. Any pointers would be nice.
Any help on this would be appreciated. Is there a detailed description of the
acoustic scores for Sphinx3 somewhere?
I ended up finding answers to my own questions! :) In case anyone ever follows
this post, this is how they got resolved:
http://cmusphinx.sourceforge.net/wiki/sphinx4:outstandingissues#acoustic_scoring
Hello nshymrev,
I'm upscaling the sphinx3_align scores using the .bsenscr files produced by
sphinx3_decode (using the word segmentation info in .wdseg, I add the
corresponding scores from *.bsenscr).
Is this procedure correct?
If it is, why would the same word's score as given by sphinx3_decode be
different from the one obtained by the method above? Internally, sphinx3_decode
does this upscaling by itself and reports the score.
I'm getting different scores even when the word boundaries in sphinx3_decode
and sphinx3_align are exactly the same!
Could it be because the phone segmentation assumed by sphinx3_decode is
different from the one assumed by sphinx3_align?
Thanks.
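A minimal sketch of the procedure described above, assuming .wdseg rows look like "SFrm EFrm Ascr Word" and .bsenscr provides one scale value per frame; the exact file layouts depend on the sphinx3 build and options, so treat this as an illustration only (file names below are hypothetical).

    def load_frame_scales(bsenscr_path):
        """Hypothetical reader: one integer scale factor per frame, one per line."""
        with open(bsenscr_path) as f:
            return [int(line.split()[-1]) for line in f if line.strip()]

    def unscaled_word_scores(wdseg_path, frame_scales):
        """Add the per-frame scale factors back onto each word's scaled score."""
        words = []
        with open(wdseg_path) as f:
            for line in f:
                parts = line.split()
                if len(parts) < 4 or not parts[0].isdigit():
                    continue  # skip the header and the total-score line
                sfrm, efrm, ascr = int(parts[0]), int(parts[1]), int(parts[2])
                word = " ".join(parts[3:])
                ascr_unscaled = ascr + sum(frame_scales[sfrm:efrm + 1])
                words.append((word, sfrm, efrm, ascr_unscaled))
        return words

    # scales = load_frame_scales("utt0001.bsenscr")
    # for word, s, e, score in unscaled_word_scores("utt0001.wdseg", scales):
    #     print(word, s, e, score)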
Hi Pranav,
In your place I would disable scaling in s3 altogether in the sources and go
to sleep in a good mood.