A beginner's tutorial

Jhon
2013-09-25
2013-11-03
  • Jhon

    Jhon - 2013-09-25

    Hello! I have crawled the web for 6 or 7 hours trying to find a simple tutorial that shows how to transcribe WAV files. The Transcriber demo does not have code (all the demos use the microphone). I tried to work from http://cmusphinx.sourceforge.net/wiki/tutorialsphinx4:

    BatchSpeechRecognizer recognizer = new BatchSpeechRecognizer(configuration);
    recognizer.startRecognition(new File("speech.wav").toURI().toURL());
    SpeechResult result = recognizer.getResult();
    recognizer.stopRecognition();

    but BatchSpeechRecognizer is nowhere to be found, not even in the sphinx4 javadoc.

     
  • Nickolay V. Shmyrev

    The tutorial describes the new, updated API for sphinx4. You can check out the corresponding code from the branch:

    https://svn.code.sf.net/p/cmusphinx/code/branches/hl-interface

     
    • Ondrej Popp

      Ondrej Popp - 2013-11-03

      I like this new interface, but it seems to have some bugs. I noticed two things when I tried out the Transcriber application (/cmusphinx/hl-interface/src/apps/edu/cmu/sphinx/demo/transcriber) and compared it to the LatticeDemo application in trunk (cmusphinx/cmusphinx-code/sphinx4/src/apps/edu/cmu/sphinx/demo/lattice):

      1. The whole input file is recognized as one utterance instead of breaking it up in multiple utterances.

      2. The time stamps are missing.

      I am now debugging this because I really like the hl-interface approach. I am running two Java debug sessions in the Eclipse IDE, comparing the program flow of the Lattice demo, which I take as the reference, against the flow of the hl-interface Transcriber.

      So far I have found the cause of problem 1: in the hl-interface/src/sphinx4/edu/cmu/sphinx/api/default.config.xml setup, the batchFrontEnd pipeline is missing the speechClassifier, speechMarker and nonSpeechDataFilter components. When I put those in, problem 1 goes away and the Transcriber starts to behave like the Lattice demo in that respect:

      ~~~~~~~~~~~~~~~~~
        <component name="batchFrontEnd" type="edu.cmu.sphinx.frontend.FrontEnd">
        <propertylist name="pipeline">
        <item>dataSource</item>
        <item>dataBlocker</item>
      + <item>speechClassifier </item>
      + <item>speechMarker </item>
      + <item>nonSpeechDataFilter </item>
        <item>preemphasizer</item>
        <item>windower</item>
        <item>fft</item>
        @@ -174,9 +180,12 @@
      ~~~~~~~~~~~~~~~~~
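For readers who would rather script this change than edit the XML by hand, here is a minimal Java sketch using only the JDK's DOM parser. It is not part of sphinx4: the `PipelinePatch` class, its `addStages` helper and the trimmed-down `SAMPLE` config are all hypothetical, and the real default.config.xml contains many more components than shown here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class PipelinePatch {
    // Hypothetical, trimmed-down stand-in for default.config.xml.
    static final String SAMPLE =
          "<config>"
        + "<component name=\"batchFrontEnd\" type=\"edu.cmu.sphinx.frontend.FrontEnd\">"
        + "<propertylist name=\"pipeline\">"
        + "<item>dataSource</item><item>dataBlocker</item>"
        + "<item>preemphasizer</item><item>windower</item><item>fft</item>"
        + "</propertylist></component></config>";

    /** Parses an XML string, wrapping checked exceptions for brevity. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse config", e);
        }
    }

    /** Inserts the stages right after the item named `after`; returns the new pipeline order. */
    static List<String> addStages(Document doc, String frontEnd, String after, String... stages) {
        NodeList lists = doc.getElementsByTagName("propertylist");
        for (int i = 0; i < lists.getLength(); i++) {
            Element list = (Element) lists.item(i);
            Node owner = list.getParentNode();
            boolean match = "pipeline".equals(list.getAttribute("name"))
                    && owner instanceof Element
                    && frontEnd.equals(((Element) owner).getAttribute("name"));
            if (!match) continue;

            // Locate the <item> after which the new stages should go.
            NodeList items = list.getElementsByTagName("item");
            Node anchor = null;
            for (int j = 0; j < items.getLength() && anchor == null; j++) {
                if (after.equals(items.item(j).getTextContent().trim())) anchor = items.item(j);
            }
            if (anchor == null) throw new IllegalArgumentException("no stage named " + after);

            Node next = anchor.getNextSibling(); // null means "append at the end"
            for (String stage : stages) {
                Element item = doc.createElement("item");
                item.setTextContent(stage);
                list.insertBefore(item, next);
            }

            // Re-read the items so the returned order reflects the insertions.
            List<String> pipeline = new ArrayList<>();
            NodeList updated = list.getElementsByTagName("item");
            for (int j = 0; j < updated.getLength(); j++) {
                pipeline.add(updated.item(j).getTextContent().trim());
            }
            return pipeline;
        }
        throw new IllegalArgumentException("no pipeline found for " + frontEnd);
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        System.out.println(addStages(doc, "batchFrontEnd", "dataBlocker",
                "speechClassifier", "speechMarker", "nonSpeechDataFilter"));
    }
}
```

The sketch only adds the pipeline entries. The speechClassifier, speechMarker and nonSpeechDataFilter components themselves still have to be defined elsewhere in the config, as they are in the Lattice demo's config.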

      I have not yet figured out problem 2, the missing time stamps. What I have found so far is that the Token.predecessors remain uninitialized, unlike in the Lattice demo.

      I am posting my findings so far because if someone can fix this faster than I can, I would be more than happy to update from svn. Otherwise I'll report back here when I find more...

      kind regards,
      Ondrej Popp

       

      Last edit: Ondrej Popp 2013-11-03
      • Nickolay V. Shmyrev

        Hi Ondrej

        It's great that you are helping to test the recent changes; this is very much needed.

        Overall, the LatticeDemo XML config is far more reasonable than the current hl-interface common config, and the latter should be replaced with the former.

        As for the times, the keepAllTokens property in the search manager must be set to true; that should solve the time issue temporarily. The proper fix will require reworking the token itself so that it keeps a time reference. I have wanted to do that for a long time, but the change is still pending.
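As a rough illustration of the temporary workaround, the following Java sketch flips a property on a component in a sphinx4-style XML config using the JDK's DOM parser. The `KeepAllTokensPatch` class and its `setProperty` helper are hypothetical, and the component name `wordPruningSearchManager` in the sample is an assumption; check the name your own config uses for the search manager.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class KeepAllTokensPatch {
    // Hypothetical, trimmed-down config; the component name is an assumption.
    static final String SAMPLE =
          "<config>"
        + "<component name=\"wordPruningSearchManager\""
        + " type=\"edu.cmu.sphinx.decoder.search.WordPruningBreadthFirstSearchManager\">"
        + "<property name=\"keepAllTokens\" value=\"false\"/>"
        + "</component></config>";

    /** Parses an XML string, wrapping checked exceptions for brevity. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse config", e);
        }
    }

    /**
     * Sets <property name=... value=.../> on the named component, creating the
     * property if it is absent. Returns the previous value, or null if there was none.
     */
    static String setProperty(Document doc, String component, String name, String value) {
        NodeList components = doc.getElementsByTagName("component");
        for (int i = 0; i < components.getLength(); i++) {
            Element c = (Element) components.item(i);
            if (!component.equals(c.getAttribute("name"))) continue;

            // Look for an existing property with the requested name.
            Element target = null;
            NodeList props = c.getElementsByTagName("property");
            for (int j = 0; j < props.getLength(); j++) {
                Element p = (Element) props.item(j);
                if (name.equals(p.getAttribute("name"))) target = p;
            }

            String previous = null;
            if (target == null) {
                target = doc.createElement("property");
                target.setAttribute("name", name);
                c.appendChild(target);
            } else {
                previous = target.getAttribute("value");
            }
            target.setAttribute("value", value);
            return previous;
        }
        throw new IllegalArgumentException("no component named " + component);
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        String old = setProperty(doc, "wordPruningSearchManager", "keepAllTokens", "true");
        System.out.println("keepAllTokens was: " + old);
    }
}
```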

         
  • Frank

    Frank - 2013-09-27

    What is the best command-line app out of all the Sphinx versions?

     
