Hi,
I have a question about the possibility of using multiple language models in
pocketsphinx.
The problem I'm facing is that I have many users, each with their own text
comments, and I want to adapt a background language model for each of them.
Since the background language model is very large (a 600 MB ARPA text file),
I cannot store a separate language model per user. So I was wondering if it's
possible to use multiple language models in pocketsphinx.
Best
I use multiple language models by running multiple decoders in a
multi-threaded architecture. I keep a cache of decoders and add and remove
language models on each of them. Each decoder is driven by a supervisor
thread that listens on TCP sockets for audio buffers; the buffers are then
assigned to decoders along with the appropriate language model. The
supervisor then signals the decoder thread to operate on the acoustic buffer
using the model assigned to it.
I should add that my system is a distributed system with many clients and
many servers running multi-threaded decoders and databases.
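The decoder-pool pattern described above can be sketched roughly as follows. This is only a minimal illustration of the idea, not real pocketsphinx code: the decode step is a placeholder comment, and all names (job_t, run_pool, and so on) are invented for the sketch. A supervisor enqueues (buffer, language-model id) jobs, and a fixed pool of decoder threads drains the queue:

```c
#include <pthread.h>

#define NDECODERS 2
#define NJOBS 8

typedef struct { int buffer_id; int lm_id; } job_t;

static job_t queue[NJOBS];
static int head, tail, done, decoded;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t nonempty = PTHREAD_COND_INITIALIZER;

/* One decoder thread: pull a job, "decode" it with the assigned LM. */
static void *decoder_thread(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        while (head == tail && !done)
            pthread_cond_wait(&nonempty, &lock);
        if (head == tail && done) {          /* queue drained, shut down */
            pthread_mutex_unlock(&lock);
            return NULL;
        }
        job_t job = queue[head++];
        pthread_mutex_unlock(&lock);

        /* Placeholder: here a real system would swap LM job.lm_id into
         * this thread's decoder and run it on audio buffer job.buffer_id. */
        (void)job;

        pthread_mutex_lock(&lock);
        decoded++;
        pthread_mutex_unlock(&lock);
    }
}

/* Supervisor: enqueue njobs jobs, wake the pool, wait for completion. */
int run_pool(int njobs)
{
    pthread_t threads[NDECODERS];
    if (njobs > NJOBS)
        njobs = NJOBS;
    head = tail = done = decoded = 0;
    for (int i = 0; i < NDECODERS; i++)
        pthread_create(&threads[i], NULL, decoder_thread, NULL);

    pthread_mutex_lock(&lock);
    for (int i = 0; i < njobs; i++)
        queue[tail++] = (job_t){ i, i % 2 };  /* alternate between two LMs */
    done = 1;
    pthread_cond_broadcast(&nonempty);
    pthread_mutex_unlock(&lock);

    for (int i = 0; i < NDECODERS; i++)
        pthread_join(threads[i], NULL);
    return decoded;
}
```

In a real deployment each decoder thread would own one pocketsphinx decoder instance, since a single decoder instance is not safe to share across threads.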
Thanks for your response, but I don't think that's the same thing as what I
want to do. Or maybe I didn't understand it fully.
I want to process one audio file with two language models combined, without
having to combine them together myself.
Yes, pocketsphinx supports that. See
API function in sphinxbase and
in pocketsphinx.
If I understand it right, it will use the new lm instead of the previous lm:

ngs = (ngram_search_t *)search;
/* Free any previous lmset if this is a new one. */
if (ngs->lmset != NULL && ngs->lmset != lmset)
    ngram_model_free(ngs->lmset);
ngs->lmset = lmset;

I want to update the weights (adapt the old lm with the new lm).
How can I do that?
Yes, it will use the new lmset, but the lmset itself can already contain
multiple models mixed with weights. Please check the model_set_init
function documentation again.
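A rough sketch of how the interpolation could look through the API. The function names here (ngram_model_read, ngram_model_set_init, ps_update_lmset) are taken from the sphinxbase/pocketsphinx headers of that era and should be double-checked against your version; the file names and the 0.7/0.3 weights are placeholders:

```c
#include <pocketsphinx.h>
#include <sphinxbase/ngram_model.h>

/* Build a weighted mix of a background LM and a per-user LM and hand it
 * to an existing decoder. NGRAM_AUTO lets sphinxbase detect ARPA vs DMP. */
ngram_model_t *set_interpolated_lm(ps_decoder_t *ps, cmd_ln_t *config,
                                   logmath_t *lmath)
{
    ngram_model_t *models[2];
    const char *names[2] = { "background", "user" };
    float32 weights[2] = { 0.7f, 0.3f };  /* interpolation weights, sum to 1 */
    ngram_model_t *lmset;

    models[0] = ngram_model_read(config, "background.lm", NGRAM_AUTO, lmath);
    models[1] = ngram_model_read(config, "user.lm", NGRAM_AUTO, lmath);

    /* Every n-gram query on the returned set mixes both models by weight. */
    lmset = ngram_model_set_init(config, models, (char **)names, weights, 2);

    ps_update_lmset(ps, lmset);  /* make the decoder use the mixed set */
    return lmset;
}
```

This way only the small per-user LM has to be stored per user; the large background LM is loaded once and mixed in at query time.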
Oh, I see. That's great. Thanks, Nikolay.
Hi Nikolay,
I have a problem now. I made an lmctl file, included two lm files in it, and
passed the lmctl file to pocketsphinx_batch with the -lmctl argument. But
apparently it only uses the first lm file in the lmctl file instead of both.
Is there any other argument I should pass to pocketsphinx_batch, something
like a weight that is 0 by default and that I have to change?
Thanks,
Amin
Hello
Lmctl is a different option. LM interpolation is only supported in the API,
not in the command-line tool. You need to modify the pocketsphinx_batch code
to support it.