I have been working with PocketSphinx (currently under Windoze), with a pleasing degree of success.
In particular, I have managed to generate effective FSGs on the fly to constrain the recogniser, and that seems to work well.
I am now tussling with OOV (out-of-vocabulary) words. It seems I can add such words to the standard dictionary (I'm using swb.dic at the moment), and it looks like I may be able to put a much larger .dic in there, which will get me some way to where I want to be.
However, there are a number of OOV words (names, etc.) that probably don't belong there, and could usefully live in a separate dictionary.
"cmdln_macro.h" suggests there are options to load OOV dictionaries via -oovdict and personal dictionaries via -perdict, but there is no indication of the file format required. I assumed it would be plain text, just like the standard dictionaries, but if I put a single simple pronunciation in there, it blows up with:
FATAL_ERROR: "....\Sphinx3\sphinxbase\src\libsphinxutil\ckd_alloc.c", line 104: calloc(-1074166288,4) failed from ...\SphinxPocket\pocketsphinx-0.3\src\libpocketsphinx\lm_3g.c(1542)
... which isn't too helpful, unless the file format is badly wrong, or something like that....
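To rule out a malformed file, a quick sanity check of the extra dictionary can help. The sketch below assumes the same plain-text layout as the standard dictionaries, one entry per line in the form `WORD PHONE PHONE ...`, and assumes lines starting with `;;;` are comments; whether -oovdict actually expects that layout is exactly what is unclear here. The function name `check_dictionary_lines` and the sample entry are hypothetical.

```python
def check_dictionary_lines(lines, known_phones=None):
    """Return (line_number, message) pairs for lines that do not look like
    'WORD PHONE PHONE ...' (assumed layout, same as the standard .dic files)."""
    problems = []
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        # Skip blank lines and (assumed) ';;;' comment lines.
        if not line or line.startswith(";;;"):
            continue
        fields = line.split()
        if len(fields) < 2:
            problems.append((number, "word has no phones"))
            continue
        # Optionally check phones against a phone set supplied by the caller.
        if known_phones is not None:
            unknown = [p for p in fields[1:] if p not in known_phones]
            if unknown:
                problems.append((number, "unknown phones: " + " ".join(unknown)))
    return problems


if __name__ == "__main__":
    # Hypothetical single-entry oov.dic contents.
    sample = ["ALAN AE L AH N"]
    print(check_dictionary_lines(sample, known_phones={"AE", "L", "AH", "N"}))
```

An empty result only says the file matches the assumed layout; it does not confirm that this is the format the option expects.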
Can you help (or suggest a more appropriate method)?
Thanks
Alan
A bit more information:
Yes, the one word I put into oov.dic works fine when it is in swb.dic, but moving it to oov.dic as the only word there causes the blow-up.
If I generate an output file, everything looks pretty much the same in each case, right up to the line " 100711 = LM.trigrams read". The next line should be "12841 = LM.prob2 entries read", but instead the output file shows the failure: "\ckd_alloc.c", line 104: calloc(-1074166288,4) failed from \Documents and Settings.... etc"
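One small observation on the number itself: the first argument to calloc is negative, where the log would otherwise report 12841 entries. The sketch below simply reinterprets that value as an unsigned 32-bit integer to show how large the request would be. It assumes the count is held in a 32-bit signed int, and it says nothing about what produced the value; the helper name `as_unsigned_32` is hypothetical.

```python
import struct


def as_unsigned_32(value):
    """Reinterpret a signed 32-bit integer as an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack("<i", value))[0]


if __name__ == "__main__":
    count = -1074166288          # first argument from the calloc failure message
    element_size = 4             # second argument from the same message
    unsigned = as_unsigned_32(count)
    print(unsigned, hex(unsigned))
    print("bytes requested if read as unsigned:", unsigned * element_size)
```

Either way the value is nowhere near a plausible entry count, which points at the count being wrong before calloc is called rather than at a genuine shortage of memory.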
Bemused...
Alan