Hi. I've been trying to get my head around Sphinx4 (recent SVN) for days. It's absolutely bewildering for noobs like me to get simple dictation working, but that's 100% my problem, not Sphinx!
I copied some config.xml options from HelloNGram to WavFile, hoping to increase the vocabulary. I am using the WSJ5K LM and the WSJ acoustic model + dictionary. For now, I'm sticking with the included wav file and stream settings. The classpath, etc. have been updated and it doesn't throw any errors. It doesn't produce any results, though: the process finishes and prints only "RESULT:".
I first tried converting to the WSJ acoustic model while sticking with the original grammar file. It worked. I was able to add words to the grammar and test new sound files, and it was fine. Then I jumped into the NGram stuff and the WSJ5K model, and here we are. I've tried adjusting beamwidths as per previous posts, and switching the LM to HUB4, but I think there's something much more elemental that I'm missing.
Any ideas?
my config file:
<?xml version="1.0" encoding="UTF-8"?>
<!--
Sphinx-4 Configuration file
-->
<!-- ******** -->
<!-- wavFile configuration file (converted to WSJ + NGRAM) -->
<!-- ******** -->
<config>
<!-- value="/HUB4/language_model.arpaformat.DMP"/> -->
</config>
Please provide the recording you are trying to decode.
The audio file is the one included with the demo (1234.WAV). I'm not adding any extra arguments at runtime (except increasing RAM), so it refers to the default file hardcoded in the class.
-ZLP
With the following header:
<property name="absoluteBeamWidth" value="5000"/>
<property name="relativeBeamWidth" value="1E-120"/>
<property name="absoluteWordBeamWidth" value="100"/>
<property name="relativeWordBeamWidth" value="1E-60"/>
<property name="wordInsertionProbability" value="0.7"/>
<property name="languageWeight" value="8.5"/>
<property name="silenceInsertionProbability" value=".1"/>
<property name="skip" value="0"/>
<property name="logLevel" value="WARNING"/>
The released version, sphinx4-beta, works perfectly:
RESULT: one two three four five
Trunk is quite broken, actually. I promise a beer to whoever fixes it :)
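As a side note on the header properties quoted above: in Sphinx-4 configuration files, values declared at the top level of the config act as global properties and are substituted into components via ${name}. A minimal sketch of how the two beam widths might be consumed, assuming an active list component like the one in the HelloNGram demo config. The component name activeList and the logMath reference are assumptions here, since the poster's full file is not shown:

```xml
<!-- Illustrative fragment only, not the poster's actual config. -->
<!-- Global properties declared once near the top of <config>. -->
<property name="absoluteBeamWidth" value="5000"/>
<property name="relativeBeamWidth" value="1E-120"/>

<!-- A component picks the globals up through ${...} substitution. -->
<!-- "activeList" and "logMath" are assumed names from the demo configs. -->
<component name="activeList"
           type="edu.cmu.sphinx.decoder.search.PartitionActiveListFactory">
    <property name="logMath" value="logMath"/>
    <property name="absoluteBeamWidth" value="${absoluteBeamWidth}"/>
    <property name="relativeBeamWidth" value="${relativeBeamWidth}"/>
</component>
```

Changing the global value in one place then affects every component that references it, which is why tuning the beam widths only needs an edit to the header.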
OK, I found the problem: it's the same StartDataSignal issue in the scorer. To make it work properly with trunk, you need to add useStreamSignals=false to the scorer properties:
<component name="threadedScorer"
           type="edu.cmu.sphinx.decoder.scorer.ThreadedAcousticScorer">
    <property name="frontend" value="${frontend}"/>
    <property name="isCpuRelative" value="true"/>
    <property name="numThreads" value="0"/>
    <property name="minScoreablesPerThread" value="10"/>
    <property name="scoreablesKeepFeature" value="true"/>
    <property name="useStreamSignals" value="false"/>
</component>
RESULT: one two three four five
But it looks more like a workaround than a solution.
Thanks! That solved my problem. I put the beamwidth back up to a sane value (5k) and now the WSJ5K LM is running great. I'm grateful that you gave it a look. - ZLP