Hi,
I need something like the hellodigit demo, but one that can recognize Italian numbers and a few other words.
I am not an expert in speech recognition, and I could not find any simple information on how to train Sphinx 4. Can someone give me a hint on how to teach Sphinx some words?
Thanks
You need to train an acoustic model with sphinxtrain. The documentation is here:
http://www.speech.cs.cmu.edu/sphinxman/scriptman1.html
You would probably want to submit your speech to the Italian database on http://voxforge.org; a model would then be trained for you, although that requires a very large amount of audio data.
I am currently experimenting with PocketSphinx in Italian with a 20-word vocabulary that includes the numbers 0-10 and a few simple commands. This definitely does NOT require much audio data. I am getting pretty good results training with 3 repetitions of the whole vocabulary (60 words, about 1 minute of speech). Of course, the model you get this way is strongly speaker-dependent.
I didn't know about voxforge.org; in any case, my speech data would contribute little or nothing to such a database.
Marco
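For anyone trying the same small-vocabulary setup, here is a rough sketch of how the utterance list and transcription lines for such a training set could be generated. It is not Marco's actual procedure. The nine command words, the utterance naming scheme (`rep1_zero` and so on), and the helper name `build_training_lists` are all made up for illustration; the `<s> WORD </s> (utterance_id)` line layout follows the usual sphinxtrain transcription convention, but check it against the documentation for your version.

```python
# Sketch only: builds the utterance IDs and transcription lines for a
# 20-word vocabulary recorded 3 times (60 isolated-word utterances).

# Italian numbers 0-10, as mentioned in the post above.
NUMBERS = ["ZERO", "UNO", "DUE", "TRE", "QUATTRO", "CINQUE",
           "SEI", "SETTE", "OTTO", "NOVE", "DIECI"]

# Hypothetical commands: the post does not say which ones were used.
COMMANDS = ["AVANTI", "INDIETRO", "DESTRA", "SINISTRA", "STOP",
            "APRI", "CHIUDI", "ACCENDI", "SPEGNI"]

REPETITIONS = 3


def build_training_lists(words, repetitions):
    """Return (fileids, transcripts) for isolated-word recordings.

    Each utterance ID is assumed to match the name of one audio file
    (without extension) containing a single spoken word.
    """
    fileids = []
    transcripts = []
    for rep in range(1, repetitions + 1):
        for word in words:
            utt_id = f"rep{rep}_{word.lower()}"  # assumed naming scheme
            fileids.append(utt_id)
            transcripts.append(f"<s> {word} </s> ({utt_id})")
    return fileids, transcripts


vocabulary = NUMBERS + COMMANDS
fileids, transcripts = build_training_lists(vocabulary, REPETITIONS)

print(len(vocabulary), "words,", len(fileids), "utterances")
print(transcripts[0])
```

The two lists can then be written out one entry per line as the file list and the transcription file that the training scripts expect. You would still need a pronunciation dictionary and the recordings themselves.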
> anyway my speech data would give little or no contribution to such a database.
Since there are currently no real recordings there, your voice is very significant :) I'd say every voice is significant.