I am investigating the application of the Sphinx-III recognizer in a DSP/embedded system. The implementation must handle multiple audio channels. I am assuming I will be able to sequentially retrieve a frame of audio data for each channel and present that frame to the recognizer, with each channel being processed independently of the others.
I am interested in the amount of memory required to maintain the state for each channel. Can someone point me to the code locations that describe the information (structures, arrays, etc.) that would need to be maintained for each channel? Or are there any publications describing these requirements?
Thanks in advance!
Brad
> I am assuming I will be able to sequentially retrieve a frame of audio data for each channel and present that frame to the recognizer with each channel being processed independent of the others.
This assumption is wrong. You have to create a separate recognizer for every channel.
> Can someone point me to appropriate code locations describing any information (structures, arrays, etc) that would need to be maintained for each channel?
That would be the whole codebase.
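To make the "separate recognizer for every channel" layout concrete, here is a minimal C sketch. It does not use the Sphinx-III API: `channel_recognizer_t`, `recognizer_init`, `recognizer_process_frame`, `process_round`, and the constants are all hypothetical names made up for illustration, and the tiny struct only stands in for the much larger state a real decoder instance carries.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NUM_CHANNELS  4     /* assumption: channel count for this example */
#define FRAME_SAMPLES 160   /* assumption: samples per audio frame */

/* Hypothetical stand-in for one complete recognizer instance.
 * A real decoder instance would hold far more state than this. */
typedef struct {
    int      channel_id;
    uint32_t frames_seen;
    int64_t  energy_acc;    /* placeholder for per-channel decoding state */
} channel_recognizer_t;

/* Reset one instance and bind it to a channel. */
static void recognizer_init(channel_recognizer_t *r, int channel_id)
{
    memset(r, 0, sizeof *r);
    r->channel_id = channel_id;
}

/* Feed one frame to one channel's instance; no other instance is touched. */
static void recognizer_process_frame(channel_recognizer_t *r,
                                     const int16_t *frame, size_t n)
{
    for (size_t i = 0; i < n; i++)
        r->energy_acc += (int64_t)frame[i] * frame[i];
    r->frames_seen++;
}

/* One round-robin pass: exactly one frame per channel, in channel order. */
static void process_round(channel_recognizer_t recs[],
                          int16_t frames[][FRAME_SAMPLES],
                          size_t num_channels)
{
    for (size_t ch = 0; ch < num_channels; ch++)
        recognizer_process_frame(&recs[ch], frames[ch], FRAME_SAMPLES);
}
```

With this layout the per-channel memory is simply `NUM_CHANNELS * sizeof(channel_recognizer_t)`, which is why the size of a full recognizer instance is the number to measure.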
Thanks for the quick reply.