I have been using the keyword spotting mode of pocketsphinx for a while. What is the best way to shrink the source code and the project as much as possible, so that pocketsphinx works as a lightweight keyword detector? Any ideas are appreciated.
Regards
You can:
1) Drop unused searches such as the ngram search and the fsg search, and drop the language model code from sphinxbase.
2) Compress the dictionary into a CART-WFST.
3) Quantize the model to 4 bits to make it smaller.
4) Retrain the model to reduce the number of senones.
That should give you a significant reduction in size. It is probably possible to fit everything in 2 MB.
Further steps depend on your requirements for vocabulary, size, and accuracy.
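To illustrate what step 3 means in general terms, here is a minimal Python sketch of linear 4-bit quantization: each float is mapped to one of 16 levels and two codes are packed into one byte. This is not the pocketsphinx model format or its tooling. The function names `quantize_4bit` and `dequantize_4bit` are made up for this example, and the uniform min/max scaling is an assumption chosen for simplicity.

```python
def quantize_4bit(values):
    """Linearly quantize a non-empty list of floats to 4-bit codes (0..15).

    Hypothetical helper for illustration only; not part of pocketsphinx.
    Returns the packed bytes plus the parameters needed to decode them.
    """
    lo, hi = min(values), max(values)
    # Step between adjacent levels; fall back to 1.0 if all values are equal.
    scale = (hi - lo) / 15 or 1.0
    codes = [min(15, round((v - lo) / scale)) for v in values]
    if len(codes) % 2:
        codes.append(0)  # pad so codes pair up evenly
    # Two 4-bit codes per byte: first code in the high nibble.
    packed = bytes((codes[i] << 4) | codes[i + 1]
                   for i in range(0, len(codes), 2))
    return packed, lo, scale, len(values)


def dequantize_4bit(packed, lo, scale, count):
    """Unpack the nibbles and map each code back to an approximate float."""
    codes = []
    for b in packed:
        codes.append(b >> 4)
        codes.append(b & 0x0F)
    return [lo + c * scale for c in codes[:count]]
```

Each value then takes half a byte instead of four bytes for a 32-bit float, at the cost of a rounding error of at most half a quantization step per value. Whether that error is acceptable for recognition accuracy has to be measured on your own data.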
Thanks for the advice.
May I ask what CART-WFST is? Can I simply keep only the words I am going to detect in the dictionary?
Which documentation and code should I refer to in order to quantize the acoustic model (AM)?
Is it the part that requires a new set of recordings?
Regards
Last edit: Ming Chen 2014-08-12
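On the dictionary question, a minimal Python sketch of trimming a pronunciation dictionary down to the words used in the keyphrases could look like this. It assumes the usual CMU dictionary text layout (a word followed by its phones, with alternate pronunciations marked as `word(2)`), and it assumes the keyword search only needs entries for the words in your keyphrases, which you should verify for your setup. The function name `filter_dictionary` is hypothetical.

```python
import re


def filter_dictionary(dict_lines, keyphrases):
    """Keep only dictionary entries whose word occurs in a keyphrase.

    Hypothetical helper for illustration; assumes CMU-style lines such as
    "hello HH AH L OW" and alternates such as "hello(2) HH EH L OW".
    """
    # Collect every individual word used in any keyphrase.
    wanted = {w.lower() for phrase in keyphrases for w in phrase.split()}
    kept = []
    for line in dict_lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        # Strip an alternate-pronunciation suffix like "(2)" before matching.
        base = re.sub(r"\(\d+\)$", "", parts[0]).lower()
        if base in wanted:
            kept.append(line.rstrip("\n"))
    return kept
```

For example, calling `filter_dictionary(open("cmudict.dict"), ["hello computer"])` would keep the entries for "hello" and "computer", including their alternates. The file name here is only a placeholder.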