I am trying to get Sphinx to recognize different accents without training the HMM files.
Here is what I did:
1. Took a large text file (which contained the phrase 'the effort right from the start are evident') and built a language model from it.
2. Uttered the sentence (the effort right from the start are evident) and used the phoneme recognizer to identify the phonemes in it. (They came out as "D EH D AY UH D AE T AH D AY D AH Z EH R AH N IH Z IH HH UH D".)
3. Made a .dict file containing only these 7 words, each with only the corresponding recognized phonemes (for example, "the" is D EH, "effort" is D AY UH D, and so on).
4. Tried to decode with pocketsphinx_continuous using the .dict file and LM file built above.
But I am still not getting the correct output ('the effort ...). Could anyone explain why the sentence is not being recognized, and how I can improve the accuracy?
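For reference, step 3 above can be sketched roughly as follows. This is only an illustration of the plain-text dictionary layout (one word per line, followed by its phones separated by spaces). It includes only the two mappings quoted in the post; the function name `format_dict` and the file name `accent.dict` are made up for the example.

```python
# Sketch of step 3: build the text of a pocketsphinx-style .dict file.
# Only the two word-to-phone mappings quoted in the post are listed;
# the remaining words would be added to the mapping in the same way.
from pathlib import Path


def format_dict(pronunciations):
    """Return dictionary text with one 'word PH1 PH2 ...' entry per line."""
    lines = []
    for word, phones in pronunciations.items():
        lines.append(word + " " + " ".join(phones))
    return "\n".join(lines) + "\n"


# Mappings taken from the post (phones as returned by the phoneme recognizer).
pronunciations = {
    "the": ["D", "EH"],
    "effort": ["D", "AY", "UH", "D"],
}

dict_text = format_dict(pronunciations)

if __name__ == "__main__":
    # 'accent.dict' is a hypothetical file name chosen for this example.
    Path("accent.dict").write_text(dict_text)
```

The resulting file would then be passed to pocketsphinx_continuous with its -dict option, alongside the language model given with -lm, as in step 4.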
I just wanted to verify that the text is constructed accurately from the phonemes, given the LM and the dictionary. Is there any other way to do it? And if it is inaccurate, how would I go about improving it?
This is a pretty senseless activity.
Write a proper dictionary and adapt the acoustic model.
Thanks man. I will try building a dictionary according to my pronunciation and then adapting the model to it.