Hello there,
I'm not sure if this should go in Sphinx-4 Open Discussion or not.
I modified the Transcriber demo source code as follows:
Result result;
while ((result = recognizer.recognize()) != null) {
    String resultText = result.getBestResultNoFiller();
    // Print the timed best result next to the plain transcript
    System.out.println(result.getTimedBestResult(true, true) +
            " -- " + resultText);
    unitTestBuffer.add(result);
}
In short, the output now includes a call to getTimedBestResult(). Using the demo-supplied grammar and .wav, the output looks as you would expect:
<sil>(0.65,0.85) one(0.85,1.07) zero(1.07,1.52) zero(1.52,1.94) zero(1.94,2.23) one(2.23,2.74) -- one zero zero zero one
<sil>(2.74,3.91) nine(3.91,4.17) oh(4.17,4.3) two(4.3,4.5) one(4.5,4.67) oh(4.67,4.96) -- nine oh two one oh
<sil>(4.96,6.24) zero(6.24,6.67) one(6.67,6.88) eight(6.88,7.07) zero(7.07,7.49) three(7.49,8.2) -- zero one eight zero three
When I change the config file to use the HelloNGram set-up and lm file, but with the same Transcriber .wav file, I get the following results:
-- ones there was there was there are one
-- back of two one all
-- still one it's are with three
The recognition itself is about what I expected, and eerily accurate given the circumstances: it nailed "one" and "three" every time, for example, and with a little imagination one can see how "eight zero" might be taken as "it's are" by a system lacking the words "eight" and "zero". Neato!
What isn't intuitive to me is why the timestamp information has suddenly disappeared. Does this have to do with my config file, or might it be a bug? Here is my config:
<config>
    <!-- ******** -->
    <!-- frequently tuned properties -->
    <!-- ******** -->
    <property name="absoluteBeamWidth" value="500"/>
    <property name="relativeBeamWidth" value="1E-80"/>
    <property name="absoluteWordBeamWidth" value="20"/>
    <property name="relativeWordBeamWidth" value="1E-60"/>
    <property name="wordInsertionProbability" value="1E-16"/>
    <property name="languageWeight" value="7.0"/>
    <property name="silenceInsertionProbability" value=".1"/>
    <property name="frontend" value="epFrontEnd"/>
    <property name="recognizer" value="recognizer"/>
    <property name="showCreations" value="false"/>
</config>
You most probably need <property name="keepAllTokens" value="true"/> in the searchManager component.
Perfect. Many thanks! As a rule of thumb then, is the SearchManager the closest we can get in the config file to tweaking what the Result includes? Or at least, is that a good place to start looking if the Result contains something it shouldn't, or doesn't contain something it should? Thanks again.
Well, sort of. The Result is built from the token tree, which is maintained by the search manager.
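To make the token-tree point a bit more concrete, here is a toy sketch in plain Java. It does not use any Sphinx-4 classes: Tok, decode, timedPath and wordPath are hypothetical names made up for illustration, and the idea that frame timing lives only on the intermediate (emitting) tokens is a simplifying assumption about how the search manager behaves, not a copy of its code. Under that assumption, dropping the intermediate tokens leaves the word sequence intact but gives the timed path nothing to work with, which would match the empty timestamps seen above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class TokenTreeSketch {

    /** Simplified stand-in for a search token (NOT the Sphinx-4 Token class). */
    static final class Tok {
        final Tok predecessor;
        final String word; // non-null only for word tokens
        final int frame;   // frame index for emitting tokens, -1 otherwise

        Tok(Tok predecessor, String word, int frame) {
            this.predecessor = predecessor;
            this.word = word;
            this.frame = frame;
        }

        boolean isWord() { return word != null; }
        boolean isEmitting() { return frame >= 0; }
    }

    /** Builds a fake best path for "one three", three frames per word. */
    static Tok decode(boolean keepAllTokens) {
        String[] words = {"one", "three"};
        Tok last = null;
        int frame = 0;
        for (String w : words) {
            last = new Tok(last, w, -1); // the word token is always kept
            for (int i = 0; i < 3; i++, frame++) {
                // Emitting tokens stay in the chain only if requested.
                if (keepAllTokens) last = new Tok(last, null, frame);
            }
        }
        return last;
    }

    /** Walks the predecessor chain and collects the words only. */
    static String wordPath(Tok best) {
        List<String> words = new ArrayList<>();
        for (Tok t = best; t != null; t = t.predecessor) {
            if (t.isWord()) words.add(t.word);
        }
        Collections.reverse(words);
        return String.join(" ", words);
    }

    /** Walks the chain and attaches (startFrame,endFrame) to each word. */
    static String timedPath(Tok best) {
        List<String> parts = new ArrayList<>();
        int start = -1, end = -1;
        for (Tok t = best; t != null; t = t.predecessor) {
            if (t.isEmitting()) {
                if (end < 0) end = t.frame; // first one seen walking back is the last frame
                start = t.frame;
            } else if (t.isWord() && end >= 0) {
                parts.add(t.word + "(" + start + "," + end + ")");
                start = -1;
                end = -1;
            }
        }
        Collections.reverse(parts);
        return String.join(" ", parts);
    }

    public static void main(String[] args) {
        for (boolean keep : new boolean[] {true, false}) {
            Tok best = decode(keep);
            System.out.println("keepAllTokens=" + keep + ": "
                    + timedPath(best) + " -- " + wordPath(best));
        }
    }
}
```

With the emitting tokens kept, the sketch prints timings in frame units alongside the words; without them, only the part after " -- " survives. So if a Result is missing something, checking which tokens the search manager retains seems like a reasonable first place to look.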