Hi all,
After several days of research, I think I'm starting to understand the acoustic model, language model, and dictionary.
From the Sphinx 4 API, it's easy to customize the implementations of the language model and the dictionary. In particular, I could store the language model and the dictionary on a central server and have the implementation fetch the data through remote calls.
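To make the idea concrete, here is a minimal, self-contained sketch of the remote-fetch part for the dictionary case. It deliberately does not implement the actual Sphinx 4 Dictionary interface; the class name, URL, and line format are assumptions for illustration only:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch: fetch a pronunciation dictionary from a central
 * server and keep it in memory. A real implementation would adapt this
 * to Sphinx 4's Dictionary interface.
 */
public class RemoteDictionarySource {

    private final Map<String, String[]> pronunciations = new HashMap<>();

    /** Downloads lines of the assumed form "WORD PH1 PH2 ..." and parses them. */
    public void load(URL dictionaryUrl) throws Exception {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(dictionaryUrl.openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] parts = line.trim().split("\\s+");
                if (parts.length < 2) {
                    continue; // skip blank or malformed lines
                }
                String word = parts[0];
                String[] phones = new String[parts.length - 1];
                System.arraycopy(parts, 1, phones, 0, phones.length);
                pronunciations.put(word, phones);
            }
        }
    }

    public String[] getPronunciation(String word) {
        return pronunciations.get(word);
    }
}
```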
But it seems that this is not so easy to do for the acoustic model.
From my understanding, the acoustic model is an HMM that has to be trained on data; it decides which HMM state is most probable for the input phoneme, and it needs a lot of training data to produce a good model. I also understand that I can buy commercial data from the LDC and train on it with the SphinxTrain tool to get the model files. It seems that Sphinx uses an Ant task to help developers generate the Model and ModelLoader class implementations. From the source code of the WSJ model, there doesn't seem to be a way to change how the model data is accessed, for example from the local file system, from files inside a jar, or from a remote service.
I understand I could modify the implementation manually so that it accesses the remote service and caches some data locally (see the sketch below), but I couldn't figure out whether Sphinx already has this kind of mechanism. If it does, I don't want to reinvent the wheel.
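If there is no built-in mechanism, my fallback idea is roughly the following: mirror the model files from the remote service into a local cache directory once, and then point the normal file-based loader at that directory. Again, this is only a sketch; the base URL, cache path, and file name are invented for illustration:

```java
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

/**
 * Hypothetical sketch: mirror acoustic model files from a central server
 * into a local cache directory, so a standard file-based loader can
 * read them from disk afterwards.
 */
public class RemoteModelCache {

    private final String baseUrl; // e.g. "http://models.example.com/wsj/" (made up)
    private final Path cacheDir;  // local cache directory

    public RemoteModelCache(String baseUrl, Path cacheDir) {
        this.baseUrl = baseUrl;
        this.cacheDir = cacheDir;
    }

    /** Downloads the file on first use; later calls reuse the cached copy. */
    public Path fetch(String fileName) throws Exception {
        Files.createDirectories(cacheDir);
        Path local = cacheDir.resolve(fileName);
        if (!Files.exists(local)) {
            try (InputStream in = new URL(baseUrl + fileName).openStream()) {
                Files.copy(in, local, StandardCopyOption.REPLACE_EXISTING);
            }
        }
        return local;
    }

    public static void main(String[] args) throws Exception {
        RemoteModelCache cache = new RemoteModelCache(
                "http://models.example.com/wsj/", Paths.get("model-cache"));
        // "means" is just an example of a typical model artifact name.
        Path means = cache.fetch("means");
        System.out.println("Cached model file at " + means);
    }
}
```

A real version would also need to handle versioning and expiry of the cached files, but that is the basic shape of what I have in mind.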
Can someone point me to the right way to do this? Thanks.
> But I couldn't figure out whether Sphinx already has this kind of mechanism.
No, there is no such mechanism. It's rather strange, but you can implement one yourself if needed.