I have tested the performance of the pocketsphinx engine and noticed that if I record and resend my audio one or two more times (especially when recognizing from the microphone), the recognition accuracy often increases. Could this be a valid technique for improving performance (that is, does the recognizer really 'tune' to the user's voice)?
I used a workaround.
I prepended a pre-recorded audio clip (2 sec) to the beginning of the audio file I intend to decode,
and it works perfectly for me, with no need to calibrate cmn or cmninit.
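For anyone who wants to try the same workaround, here is a minimal sketch of the prepending step using only Python's standard `wave` module. The function name `prepend_wav` and the file paths are made up for illustration, and it assumes both files are uncompressed WAV with the same channel count, sample width and sample rate.

```python
import wave


def prepend_wav(lead_in_path, target_path, out_path):
    """Write out_path as the lead-in audio followed by the target audio.

    Hypothetical helper for illustration. Returns the number of lead-in
    frames so the caller knows where the real utterance starts.
    """
    with wave.open(lead_in_path, "rb") as lead, wave.open(target_path, "rb") as target:
        lead_fmt = (lead.getnchannels(), lead.getsampwidth(), lead.getframerate())
        target_fmt = (target.getnchannels(), target.getsampwidth(), target.getframerate())
        # Concatenating raw frames only makes sense if the formats match.
        if lead_fmt != target_fmt:
            raise ValueError(f"format mismatch: {lead_fmt} vs {target_fmt}")
        n_lead = lead.getnframes()
        lead_frames = lead.readframes(n_lead)
        target_frames = target.readframes(target.getnframes())

    with wave.open(out_path, "wb") as out:
        out.setnchannels(lead_fmt[0])
        out.setsampwidth(lead_fmt[1])
        out.setframerate(lead_fmt[2])
        out.writeframes(lead_frames + target_frames)

    return n_lead
```

Keep in mind that the decoder will also transcribe the lead-in, so any words it contains would presumably have to be dropped from the result.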
I have tested the performance of the pocketsphinx engine and noticed that if I record and resend my audio one or two more times (especially when recognizing from the microphone), the recognition accuracy often increases. Could this be a valid technique for improving performance (that is, does the recognizer really 'tune' to the user's voice)?
It is valid but very slow. It is easier to set a proper cmninit value instead. You can search the forum; it was discussed many times.
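To illustrate why resending the audio and setting cmninit have a similar effect, here is a toy sketch of running cepstral mean normalization. It is a simplified exponential running mean, not pocketsphinx's actual code; `live_cmn`, `window` and the numbers are assumptions chosen for the example. The point is only that the first frames are normalized badly when the initial mean is far off, and that a second pass (or a good initial value) starts from a mean that already fits the speaker and channel.

```python
def live_cmn(frames, init_mean, window=100):
    """Subtract a running mean estimate from each frame value.

    frames: one cepstral coefficient per frame (plain floats, toy data).
    init_mean: starting estimate, playing the role of cmninit here.
    Returns (normalized values, final mean estimate).
    """
    mean = init_mean
    normalized = []
    for x in frames:
        normalized.append(x - mean)
        # Move the estimate a small step toward the observed value.
        mean += (x - mean) / window
    return normalized, mean


# Toy signal whose true mean is 40.0.
frames = [40.0] * 500

# First pass with a poor initial mean: early frames are far from zero.
cold, adapted_mean = live_cmn(frames, init_mean=0.0)

# Second pass over the same audio starts from the adapted mean.
warm, _ = live_cmn(frames, init_mean=adapted_mean)

# Same effect in one pass, if a suitable initial value is known up front.
preset, _ = live_cmn(frames, init_mean=40.0)

print(abs(cold[0]), abs(warm[0]), abs(preset[0]))
```

In this sketch the second pass and the preset pass both begin close to zero offset, which matches the observation that repeating the audio helps and the advice that a proper initial value gets there without the extra pass.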
Found the discussion here, thanks.