Hi -
I have sphinx4 installed and have successfully run several of the demos. I also reconfigured test/performance/an4 to use the WSJ dictionary and that ran correctly although this was still using the simple wordlist grammar.
Now I'm trying to reconfigure the wavfile demo to also use the WSJ dictionary and acoustic model with a LexTreeLinguist. Everything now compiles and runs, but I always run out of Java heap space. The stack trace is shown below:
java -Xms1424m -Xmx1424m -jar bin/WavFile.jar
Loading Recognizer as defined in 'jar:file:/root/sphinx4/bin/WavFile.jar!/edu/cmu/sphinx/demo/wavfile/config.xml'...
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at edu.cmu.sphinx.linguist.lextree.LexTreeLinguist$LexTreeState.createUnitStateArc(LexTreeLinguist.java:695)
at edu.cmu.sphinx.linguist.lextree.LexTreeLinguist$LexTreeWordState.getSuccessors(LexTreeLinguist.java:1390)
at edu.cmu.sphinx.linguist.util.LinguistStats.run(LinguistStats.java:36)
at edu.cmu.sphinx.instrumentation.RecognizerMonitor.statusChanged(RecognizerMonitor.java:88)
at edu.cmu.sphinx.recognizer.Recognizer.setState(Recognizer.java:142)
at edu.cmu.sphinx.recognizer.Recognizer.allocate(Recognizer.java:158)
at edu.cmu.sphinx.demo.wavfile.WavFile.main(WavFile.java:46)
The more memory I allocate, the longer it takes to generate this error, but I'm wondering if it really is just a problem of needing more memory or if there is something wrong with my configuration. I'm very new to this and could quite easily have something silly in my config file (see below).
Any ideas? How much memory should I expect this to require? Thanks,
Mitch
<?xml version="1.0" encoding="UTF-8"?>
<!--
Sphinx-4 Configuration file
-->
<!-- ******** -->
<!-- tidigits configuration file -->
<!-- ******** -->
<config>
</config>
I am running from subversion but hadn't updated recently. I'm now at r9059. I removed linguistStats from the recognizerMonitor and recompiled everything and it now seems to work:
>:~/sphinx4# java -Xms1024m -Xmx1024m -jar bin/WavFile.jar
Loading Recognizer as defined in 'jar:file:/root/sphinx4/bin/WavFile.jar!/edu/cmu/sphinx/demo/wavfile/config.xml'...
Decoding jar:file:/root/sphinx4/bin/WavFile.jar!/edu/cmu/sphinx/demo/wavfile/12345.wav
This Time Audio: 2.88s Proc: 0.40s Speed: 0.14 X real time
Total Time Audio: 2.88s Proc: 0.40s Speed: 0.14 X real time
Mem Total: 1016.12 Mb Free: 903.10 Mb
Used: This: 113.03 Mb Avg: 113.03 Mb Max: 113.03 Mb
Result: on am
Or at least it didn't run out of heap space. Thanks! Was the problem that my code was somewhat out of date, or was it linguistStats that was causing the problem?
Mitch
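For anyone who wants to check heap figures like the "Mem Total / Free / Used" lines above from their own code, here is a minimal sketch using the standard java.lang.Runtime calls. The class name HeapReport and its helper methods are made up for illustration, and this is not necessarily how sphinx4 produces its own report; it only shows that "used" is roughly total minus free, which matches the numbers in the run above (1016.12 - 903.10 is about 113).

```java
import java.util.Locale;

/** Hypothetical helper: prints JVM heap figures in a "Total / Free / Used" style. */
final class HeapReport {
    private static final double MB = 1024.0 * 1024.0;

    /** Heap in use = heap currently reserved by the JVM minus the free part of it. */
    static long usedBytes(long totalBytes, long freeBytes) {
        return totalBytes - freeBytes;
    }

    /** Formats byte counts as megabytes with two decimals. */
    static String format(long totalBytes, long freeBytes) {
        return String.format(Locale.ROOT,
                "Mem Total: %.2f Mb Free: %.2f Mb Used: %.2f Mb",
                totalBytes / MB,
                freeBytes / MB,
                usedBytes(totalBytes, freeBytes) / MB);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() reflects the -Xmx ceiling; totalMemory() is what is reserved right now.
        System.out.println(String.format(Locale.ROOT,
                "Max heap: %.2f Mb", rt.maxMemory() / MB));
        System.out.println(format(rt.totalMemory(), rt.freeMemory()));
    }
}
```

Calling something like this right after the recognizer is allocated, and again after decoding, gives a rough idea of how much of the -Xmx budget the linguist itself needs.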
After some investigation, I think it's not a bug. The idea of the LinguistStats component is to traverse all search states of the Linguist and collect statistics. That is possible with a small JSGF grammar, but with LexTreeLinguist the search graph is huge and cannot be traversed without pruning. So applying this monitor to LexTreeLinguist is basically not practical.
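To illustrate why an exhaustive walk over the search graph blows up, here is a toy sketch. It is not sphinx4 code and does not use the LinguistStats or Linguist APIs; the class TraversalSketch, the method countStates, and the tree-shaped "search graph" are all assumptions made up for this example. The walk remembers every state it has seen, so its memory use grows with the number of reachable states, and the only way to keep it bounded is to stop early.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

/** Toy model of visiting every state of a search graph (hypothetical, not sphinx4). */
final class TraversalSketch {

    /**
     * Breadth-first walk over a synthetic graph in which every state has
     * `branching` successors down to depth `maxDepth`. Successors are generated
     * on demand, the way a linguist expands states lazily.
     *
     * @return the number of states visited, or -1 if more than `budget`
     *         states would have to be held in memory
     */
    static long countStates(int branching, int maxDepth, long budget) {
        Deque<long[]> queue = new ArrayDeque<>(); // entries are {stateId, depth}
        Set<Long> visited = new HashSet<>();      // grows with every new state
        queue.add(new long[] {0L, 0L});
        visited.add(0L);
        while (!queue.isEmpty()) {
            long[] state = queue.remove();
            if (state[1] == maxDepth) {
                continue; // leaf: no successors to expand
            }
            for (int k = 1; k <= branching; k++) {
                long child = state[0] * branching + k; // unique id per state
                // In a general graph the visited check prevents re-expanding states.
                if (visited.add(child)) {
                    if (visited.size() > budget) {
                        return -1; // give up instead of exhausting the heap
                    }
                    queue.add(new long[] {child, state[1] + 1});
                }
            }
        }
        return visited.size();
    }

    public static void main(String[] args) {
        // Small graph, comparable in spirit to a tiny grammar: finishes quickly.
        System.out.println(countStates(2, 3, 1_000));
        // Large graph: billions of states, so the walk hits the budget and stops.
        System.out.println(countStates(40, 6, 100_000));
    }
}
```

In this toy model, raising the budget only delays the point where the walk gives up, which is the same pattern as the heap error taking longer to appear with a larger -Xmx.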
>I am running from subversion but hadn't updated recently. I'm now at r9059. I removed linguistStats from the recognizerMonitor and recompiled everything and it now seems to work
It looks like a bug that would be nice to fix.
What version are you talking about? Are you using a nightly build?
Can you remove linguistStats from the recognizerMonitor's monitors? That is, delete this item from the config:
<item>linguistStats</item>