Hi, I need to add new words to the graph without recompiling the whole CLG cascade.
I found a method to do it for a dynamic transducer (i.e. on-the-fly composition of CL with G),
but the dynamic transducer is slow. How can I add new words to a static graph without recompiling the whole cascade?
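For context, by the dynamic transducer I mean delayed (on-the-fly) composition of CL with G, roughly as in this minimal sketch. OpenFst is assumed here and the file names are placeholders:

```cpp
#include <fst/fstlib.h>

#include <iostream>
#include <memory>

int main() {
  // Load the statically compiled CL (context o lexicon) and G (grammar) FSTs.
  // File names are placeholders.
  std::unique_ptr<fst::StdFst> cl(fst::StdFst::Read("CL.fst"));
  std::unique_ptr<fst::StdFst> g(fst::StdFst::Read("G.fst"));
  if (!cl || !g) return 1;

  // Delayed (on-the-fly) composition: states of CL o G are expanded lazily as
  // the decoder visits them, so the full CLG is never built in memory.
  // Requires CL to be output-label sorted or G to be input-label sorted.
  fst::ComposeFst<fst::StdArc> clg(*cl, *g);

  std::cout << "CLG start state: " << clg.Start() << std::endl;
  return 0;
}
```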
Hello
I'm not sure what software you are referring to, but a dynamic transducer should be reasonably fast.
It is hard to discuss this without knowing the details. Are you working on large-vocabulary transcription or some other task? In general it is harder to insert words into an already compiled transducer precisely because it is already compiled. For that reason CMUSphinx does not use the WFST framework.
There are combined solutions: for example, you can construct a small graph from the dynamic vocabulary and run it in parallel with the statically compiled graph. Such solutions are specialized.
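To illustrate the small-graph part of that idea, here is a rough OpenFst sketch that builds a one-state unigram-style G from a list of new words. The helper name, the flat word cost, and the assumption that the word symbol table already contains the new words are all mine, not anything CMUSphinx provides:

```cpp
#include <fst/fstlib.h>

#include <string>
#include <vector>

// Build a tiny grammar FST over just the new words: a single state with one
// self-loop per word, each carrying a flat cost. Composing this with CL would
// give the small graph that runs alongside the static one.
fst::StdVectorFst BuildSmallG(const std::vector<std::string> &new_words,
                              const fst::SymbolTable &word_syms,
                              float word_cost) {
  fst::StdVectorFst g;
  const auto s = g.AddState();
  g.SetStart(s);
  g.SetFinal(s, fst::StdArc::Weight::One());
  for (const std::string &w : new_words) {
    const auto sym = word_syms.Find(w);
    if (sym < 0) continue;  // word is missing from the symbol table
    const int label = static_cast<int>(sym);
    g.AddArc(s, fst::StdArc(label, label, word_cost, s));
  }
  // Composition with CL expects G to be sorted on input labels.
  fst::ArcSort(&g, fst::ILabelCompare<fst::StdArc>());
  return g;
}
```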
A dynamic graph is slower than a static graph.
I have a large vocabulary (300,000 words), a big language model, and about 100-500 new words. Recompiling the whole cascade takes about 7 minutes and 6 GB of memory, and that is a problem for me.
Could we find a way to simplify my life?
Sure, use dynamic decoding and speed up the decoding by tuning beams, reducing the model size, and parallelizing the computation on a CPU or GPU.
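If the dynamic route stays too slow, one knob worth knowing about is the composition cache, which trades memory for fewer state re-expansions. This assumes the dynamic graph is built with OpenFst's delayed ComposeFst; the function name and cache size below are placeholders, and beam settings live in whatever decoder traverses the graph:

```cpp
#include <fst/fstlib.h>

#include <memory>

// Lazily composed CLG with a larger state cache: already-visited composition
// states are kept around instead of being expanded again on every pass.
std::unique_ptr<fst::StdFst> MakeDynamicCLG(const fst::StdFst &cl,
                                            const fst::StdFst &g) {
  fst::ComposeFstOptions<fst::StdArc> opts;
  opts.gc = true;                      // garbage-collect the cache when it exceeds the limit
  opts.gc_limit = 512 * 1024 * 1024;   // ~512 MB cache: more memory, fewer re-expansions
  return std::make_unique<fst::ComposeFst<fst::StdArc>>(cl, g, opts);
}
```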