Hi all,
I read that the acoustic score is a likelihood. Is that correct?
But I don't understand very well what it is.
Can I define the acoustic score as
P(O|X), where O is an observation and X is the model?
Thanks, and sorry if I'm completely wrong :)
Not really, the score might undergo some normalization during decoding, which destroys its probabilistic nature.
No, the score is usually scaled, so it's not P(O|X) but something like C * P(O|X) or close to it.
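To illustrate the point about scaling: in the log domain a constant factor C becomes an additive offset, so scaled scores still rank hypotheses the same way, but they can no longer be read as probabilities. A minimal Python sketch with made-up numbers; the constant C and the likelihood values are hypothetical and not taken from Sphinx:

```python
import math

def scaled_log_score(likelihood, scale):
    """Return log(scale * likelihood), computed as log(scale) + log(likelihood)."""
    return math.log(scale) + math.log(likelihood)

# Hypothetical likelihoods P(O|X) for two competing models.
p_a, p_b = 1e-5, 4e-5
C = 1e3  # hypothetical scale factor, assumed only for illustration

score_a = scaled_log_score(p_a, C)
score_b = scaled_log_score(p_b, C)

# The ranking and the score difference survive the scaling ...
ranking_preserved = (score_b > score_a) == (p_b > p_a)
difference = score_b - score_a  # equals log(p_b / p_a); C cancels out

# ... but exp(score) is C * P(O|X), which is not a probability.
```

Differences between scores of competing hypotheses stay meaningful under this kind of scaling; an absolute score taken on its own does not.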
Thank you very much
I'm sorry, I have another question :)
Is the acoustic score computed by the forward algorithm in Sphinx?
Thank you in advance :)
You need to be more specific about what you mean by "Sphinx". The acoustic score is computed in various places in the CMUSphinx toolkit: in the decoders, the aligner, and the trainer. The Viterbi (Forward) algorithm is used in different variations.
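The reply mentions Viterbi and forward together. In textbook terms the two recursions differ in one step: the forward algorithm sums over predecessor states, while Viterbi takes the maximum, so the Viterbi score is the score of the single best path. A minimal Python sketch on a toy two-state HMM; all probabilities are made up, the function names are hypothetical, and this is not CMUSphinx code:

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def hmm_log_scores(log_init, log_trans, log_emit):
    """Return (forward_score, viterbi_score) for one observation sequence.

    log_init[s]      log P(state s at t=0)
    log_trans[r][s]  log P(state s at t+1 | state r at t)
    log_emit[t][s]   log P(observation at frame t | state s)
    """
    n = len(log_init)
    fwd = [log_init[s] + log_emit[0][s] for s in range(n)]
    vit = list(fwd)
    for t in range(1, len(log_emit)):
        # Forward: sum over all predecessor states.
        fwd = [logsumexp([fwd[r] + log_trans[r][s] for r in range(n)])
               + log_emit[t][s] for s in range(n)]
        # Viterbi: keep only the best predecessor state.
        vit = [max(vit[r] + log_trans[r][s] for r in range(n))
               + log_emit[t][s] for s in range(n)]
    return logsumexp(fwd), max(vit)

# Toy model with made-up probabilities (for illustration only).
log_init = [math.log(0.6), math.log(0.4)]
log_trans = [[math.log(0.7), math.log(0.3)],
             [math.log(0.4), math.log(0.6)]]
log_emit = [[math.log(0.5), math.log(0.1)],
            [math.log(0.4), math.log(0.3)],
            [math.log(0.1), math.log(0.6)]]

forward_score, viterbi_score = hmm_log_scores(log_init, log_trans, log_emit)
```

The Viterbi score can never exceed the forward score, because the best path is only one of the paths the forward algorithm sums over.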
Hi! Thank you for your reply.
I'm talking about the acoustic score that I obtain from forced alignment. I give Sphinx the transcription and the corresponding audio as input, and it gives me an acoustic score per phoneme. Is that acoustic score computed by the forward algorithm?
Thank you
Last edit: Davide Mangiameli 2014-02-24
Yes
Thank you!
You said that decoding destroys the probabilistic nature of the acoustic score. Could it still be useful for comparison? For example, say I have the mean acoustic score from correct pronunciations of the phoneme AH, and an acoustic score from a good pronunciation of the same phoneme. If I compare them, they should be similar. Is that correct?
Thank You :)
No, it is not going to work this way.
You can learn more about pronunciation scoring methods and applications from the papers on the web.
Hi Nickolay.
I read a lot of papers on the web. Some of them use a log-posterior probability for pronunciation scoring. But I found a paper in which they seem to use the acoustic score: they save the mean and standard deviation of the acoustic scores per phoneme from correct pronunciations aligned with the text, and then calculate a z-score to evaluate new pronunciations. Is that possible? Or did I misunderstand it all?
ref http://aclweb.org/anthology/W/W12/W12-5808.pdf
Thank you.
Bye
Last edit: Davide Mangiameli 2014-02-27
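For concreteness, the z-score procedure as the post describes it (not necessarily what the linked paper does) could be sketched as follows. This is a minimal Python sketch: the function names and all score values are hypothetical, and whether raw acoustic scores are suitable input for it is exactly what is questioned in this thread.

```python
from statistics import mean, stdev

def fit_reference(scores_by_phoneme):
    """Map each phoneme to (mean, sample std) of its reference acoustic scores."""
    return {ph: (mean(s), stdev(s)) for ph, s in scores_by_phoneme.items()}

def z_score(reference, phoneme, score):
    """Distance of a new score from the reference mean, in standard deviations."""
    mu, sigma = reference[phoneme]
    return (score - mu) / sigma

# Made-up acoustic scores for correct pronunciations of AH (illustration only).
reference_scores = {"AH": [-1200.0, -1100.0, -1300.0, -1250.0, -1150.0]}
reference = fit_reference(reference_scores)

# A new pronunciation whose score lies far below the reference mean.
z = z_score(reference, "AH", -1600.0)
```

A z-score near zero means the new score is typical of the reference set; a large magnitude means it is an outlier.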
I doubt you describe the paper's method properly. Most likely the score is per state, not per phoneme.
If you want to discuss a specific paper, give a link to it.
This paper is not very professional and not worth attention; it contains a few conceptual mistakes.
Ok... :) My mistake.... Sorry
So if I understand correctly, the acoustic score is not useful for pronunciation scoring.
A confidence score seems to be better. But if I understand correctly, Sphinx does not return this score automatically; I need to build it. Is that correct?
Thank you for your patience
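One possible way to build a confidence-like score, in the spirit of the log-posterior probability mentioned earlier in the thread, is to normalize the target phoneme's likelihood by the total over competing phonemes for the same segment. A minimal Python sketch, assuming the per-segment log-likelihoods are already available; the function name, phoneme labels, and numbers are hypothetical, and this is not a CMUSphinx API:

```python
import math

def log_posterior(target, log_likelihoods, log_priors=None):
    """Return log P(target | O) from per-phoneme log-likelihoods log P(O | phoneme).

    If no priors are given, equal priors are assumed, which cancel out.
    """
    if log_priors is None:
        log_priors = {ph: 0.0 for ph in log_likelihoods}
    joint = {ph: ll + log_priors[ph] for ph, ll in log_likelihoods.items()}
    # Log of the evidence: a numerically stable log-sum-exp over all candidates.
    m = max(joint.values())
    log_evidence = m + math.log(sum(math.exp(v - m) for v in joint.values()))
    return joint[target] - log_evidence

# Made-up log-likelihoods of one segment under three candidate phonemes.
segment_scores = {"AH": -50.0, "AA": -52.0, "EH": -55.0}
confidence = math.exp(log_posterior("AH", segment_scores))
```

Because the numerator and the denominator are computed on the same segment, any constant scale factor in the likelihoods cancels, which is one reason posterior-style scores are easier to compare across segments than raw acoustic scores.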