
problems running demos in the tests directory

Help
2004-08-27
2012-09-22
  • Andrew MacGinitie

    I was able to build sphinx4 and get some functionality, but I'm seeing these problems:
    1. TIDIGITS: the test prints "say something," correctly recognizes a few digits in an utterance & prints them, then loops back & prints "say something"; after that, it won't recognize any more, has to be "Ctrl-C"ed.
    2. in tests/live, "ant live" causes "all" to build, but will not execute the live demo (same with "ant live-free" etc.).
    I'm new to ant; I can't see anything wrong with tests/live/build.xml, and I don't know how to debug this. There's nothing I recognize as an error message, it says "build successful." Any tips, pointers, or suggestions appreciated.

     
    • Paul Lamere

      Paul Lamere - 2004-08-27

      Hey Andrew,

      Thanks for trying out Sphinx-4. I hope we can help you get up and running on this.

      1) What system are you running on?  We have seen some issues with JavaSound on linux systems.  Knowing your OS and version number will help us figure out what is going wrong.

      2)  What error messages (if any) are you seeing when you run 'ant live'?

      Paul

       
      • Andrew MacGinitie

        1. I'm actually trying to run it on Windows XP Home (2002 edition). java -version gives:
        java version "1.4.0_01-ea"
        and similar build numbers for JRE and HotSpot ending in "-b02"

        ...hmmm!

        I'm using cygwin, which is apparently finding a 1.4.0 Java (in c:/Windows/system32, says "which java"). I set JAVA_HOME to my 1.4.2 version before ant would run... I wonder if there could be a conflict there? Something to try (fixing where cygwin finds java), although there's no indication of a problem like that in the ant output:

        2. "ant live" produces only:
        Buildfile: build.xml

        all:
            [javac] Compiling 3 source files to C:\src\java\sphinx4\sphinx4-0.1alpha\bld\classes

        BUILD SUCCESSFUL
        Total time: 5 seconds

        I've looked in the build.xml file, it definitely contains sections like '<target name="live" ' etc.

        I also get no error messages running the TIDIGITS test. It just appears to stop recognizing after the first utterance (which it has done pretty well with, so something's definitely working).

         
    • Andrew MacGinitie

      I referred to the TIDIGITS test. What I meant was:

      java -jar bin/HelloDigits.jar [from the main sphinx4 dir]

      One example:

      Say any digit(s): e.g. "two oh oh four", "three six five".
      Start speaking. Press Ctrl-C to quit.

      You said: five nine three

      Start speaking. Press Ctrl-C to quit.

      ...& there it stays until I press Ctrl-C to quit. I did in fact say "five nine three," although it hasn't been entirely reliable (sometimes it prints "You said:" and leaves it blank, even though I did speak). It always stops responding after the first "You said:" prints.

       
    • Andrew MacGinitie

      I think the version of java that cygwin found was actually the same as JAVA_HOME was pointing to, after all.
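A minimal sketch of how the "which Java is actually running" question could be checked from inside the JVM, rather than inferred from "which java". The class name JavaEnvReport and its helper method are made up for illustration, and the code assumes Java 7 or later (it uses java.nio.file), so it would not compile on the 1.4 JDK discussed here.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Prints the runtime that is executing next to what JAVA_HOME points at. */
class JavaEnvReport {

    /** True when javaHome is jdkHome itself or a directory inside it (for example its jre/ subdirectory). */
    static boolean isSameOrInside(String javaHome, String jdkHome) {
        Path running = Paths.get(javaHome).toAbsolutePath().normalize();
        Path root = Paths.get(jdkHome).toAbsolutePath().normalize();
        return running.startsWith(root); // compares whole path elements, not string prefixes
    }

    public static void main(String[] args) {
        String running = System.getProperty("java.home");
        String envHome = System.getenv("JAVA_HOME");
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("java.home    = " + running);
        System.out.println("JAVA_HOME    = " + envHome);
        if (envHome == null) {
            System.out.println("JAVA_HOME is not set");
        } else {
            System.out.println("running JVM is under JAVA_HOME: " + isSameOrInside(running, envHome));
        }
    }
}
```

Running it with the same "java" command that the shell resolves shows whether that runtime and JAVA_HOME agree.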

      I modified HelloDigits.java to add debug output. I used the command

      ant clean all

      to rebuild everything. My mods were successfully compiled into HelloDigits.jar. From this I learned that the demo program is calling recognizer.recognize(), which seems to return once, but never returns from the second call (made in the second loop iteration in HelloDigits.java main()).
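One way to confirm which call blocks, without relying only on print statements, is to run each call on a worker thread with a timeout. The sketch below is not Sphinx-4 code: HangProbe, callWithTimeout and the fake recognizer are hypothetical stand-ins, and it assumes Java 8 or later for the lambdas.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

class HangProbe {

    /** Runs a blocking call on a worker thread; returns its result, or null if it has not returned in time. */
    static <T> T callWithTimeout(Callable<T> call, long timeoutMillis) {
        ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "probe");
            t.setDaemon(true); // a stuck call must not keep the JVM alive
            return t;
        });
        Future<T> future = worker.submit(call);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupts the worker; a truly stuck call may ignore it
            return null;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            worker.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for recognizer.recognize(): the first call returns, later calls block.
        AtomicInteger calls = new AtomicInteger();
        Callable<String> fakeRecognize = () -> {
            if (calls.incrementAndGet() > 1) {
                Thread.sleep(60_000);
            }
            return "five nine three";
        };
        for (int i = 1; i <= 2; i++) {
            String result = callWithTimeout(fakeRecognize, 500);
            System.out.println("call " + i + ": " + (result == null ? "timed out" : result));
        }
    }
}
```

In a real test the Callable would wrap the actual recognize call. A timeout only shows that the call is stuck, not why.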

      another "ant" clue: when running ant from the tests/live dir, the following are all equivalent:
      ant
      ant -projecthelp
      ant live
      ant live-ep
      ...etc.

      ...yet "ant clean all" seemed to work from the main dir.

       
      • Paul Lamere

        Paul Lamere - 2004-08-30

        Andrew:

        What version of ant are you running with ('ant -version')?

        Paul

         
        • Andrew MacGinitie

          It seems my ant installation is either messed up... or perhaps incompatible with cygwin? I execute the "ant -version" command and get this:

          Buildfile: build.xml does not exist!
          Build failed

          ...when I run it from the c:\ant\bin directory. I ran "ant --execdebug" and got:

          exec "/cygdrive/c/j2sdk.142/bin/java" -classpath "c:/ant/lib/ant-launcher.jar" -Dant.home="c:/ant" -Dant.library.dir="c:/ant/lib" -Dcygwin.user.home="C:/cygwin/home/Owner" org.apache.tools.ant.launch.Launcher -lib ".;C;c:/Program Files/Borland/InterBase/InterClient/interclient.jar;"C;c:/Program Files/Java/j2re1.4.0_01/lib/ext/QTJava.zip""
          Buildfile: build.xml does not exist!

          I have no idea what QTJava.zip is, BTW.

          Seems like an ant re-install might be advisable. I should at least see if I can find any pointers on this from an ant list or site; not much sense trying to debug any sphinx4 issues that might be caused by an incorrect build.

          Thanks,
          Andrew

           
    • Andrew MacGinitie

      OK, ant version 1.6.2 seems to be working better now. Apparently it didn't handle one or more entries in the CLASSPATH correctly.

      I ran "ant clean all" and was able to get the Live demo running using "ant live"; however, I see a similar result to the HelloDigits issue (recognizer seems to get stuck after the first recognition result). HelloDigits is still exhibiting the same behavior as well.

      Has anyone tried sphinx4 on XP Home Edition before?

       
    • Philip Kwok

      Philip Kwok - 2004-08-31

      Hi Andrew,

      From my experience, JavaSound usually works pretty well on Windows platforms. To help debug this problem, would you mind trying a few things:

      1) Can you try running the live demo using 'ant live-ep'? This turns on the endpointer, but only to figure out the end of speech; the start of speech is taken to be the start of audio.

      2) Can you now try using 'ant live-free'? This just performs endpointing throughout. Please read the README.html on how to run it; it's a little different. In this mode, if you see weird behavior, such as it not detecting any speech or detecting everything as speech, try tuning tidigits/tidigits.config.xml (assuming you're running the connected digits test) according to this page:

      http://cmusphinx.sourceforge.net/sphinx4/javadoc/edu/cmu/sphinx/frontend/doc-files/FrontEndFAQ.html#enable_endpointer

      Hope this helps. Please post again if not.

      philip

       
      • Paul Lamere

        Paul Lamere - 2004-08-31

        Also, some folks who have had similar issues were able to solve it for their system by upgrading to Java 1.5

         
        • Andrew MacGinitie

          [OK, I'll try Java 1.5]

          Phil,

          I monkeyed with the tidigits.config.xml file a bit, although I don't fully
          know what I'm doing yet (more doc reading required). I tried adding the three
          stages mentioned in the file you pointed me to (they weren't in both frontend
          configurations I found, so I added them to the mfcFrontEnd, which didn't have them).
          I also changed "keepLastAudio" to false (under "microphone"), and moved the
          microphone entry below the 3 stages (SpeechClassifier etc.), as suggested in the
          FAQ, in the epFrontEnd section. I detected no change in results, compared to the
          original configuration.

          Here are my results for "live-free" and "live-ep":

          live-ep:
          1. program initializes, prints info in console window
          2. zero appears in the "say" box, "press 'speak' to ..." appears at bottom
          3. click "speak" button
          4. <sil> <sil> appears in "recognized" box
          5. say "zero"; contents of "recognized" changes to: <sil> zero
          (i.e., second <sil> appears to be replaced by "zero")
          6. more information prints in console window
          7. "Speaker turned off" appears at bottom

          after that, nothing gets recognized, no matter what I try;

          press Next --> updates "say" box
          press Speak --> "Wait..." appears at bottom, followed by "OK, ..." etc.
          press Stop --> message "SimpleAcousticScorer: Data is null (agm)" appears in console

          I modified the source so I could tell which of two identical "Data is null" messages
          in SimpleAcousticScorer.java was printing; it's always the first (line 119 in my file).

          live-free is similar.

          console output follows:

          $ ant live-free
          Buildfile: build.xml

          all:
              [javac] Compiling 3 source files to C:\src\java\sphinx4\bld\classes

          live-free:
               [java] Parsing file decoders.list ......done parsing decoders.list
               [java] Initializing first decoder: Isolated Digits ...
               [java] Changing to Isolated Digits recognizer ...
               [java] ... done initializing
               [java] # ----------- linguist stats ------------
               [java] # Total states: 256
               [java] # class edu.cmu.sphinx.linguist.flat.NonEmittingHMMState: 46
               [java] # class edu.cmu.sphinx.linguist.flat.ExtendedUnitState: 46
               [java] # class edu.cmu.sphinx.linguist.flat.PronunciationState: 13
               [java] # class edu.cmu.sphinx.linguist.flat.HMMStateState: 138
               [java] # class edu.cmu.sphinx.linguist.flat.BranchState: 12
               [java] # class edu.cmu.sphinx.linguist.flat.GrammarState: 1
               [java] ... done changing

               [java] REF:       zero
               [java] HYP:       zero

               [java]    Accuracy: 100.000%    Errors: 0  (Sub: 0  Ins: 0  Del: 0)
               [java]    Words: 1   Matches: 1    WER: 0.000%
               [java]    Sentences: 1   Matches: 1   SentenceAcc: 100.000%
               [java]    This  Time Audio: 5.44s  Proc: 5.12s  Speed: 0.94 X real time
               [java]    Total Time Audio: 5.44s  Proc: 5.12s  Speed: 0.94 X real time
               [java]    Mem  Total: 9.47 Mb  Free: 2.39 Mb
               [java]    Used: This: 7.09 Mb  Avg: 7.09 Mb  Max: 7.09 Mb
               [java] SimpleAcousticScorer: Data is null (agm)
               [java] SimpleAcousticScorer: Data is null (agm)

          BUILD SUCCESSFUL
          Total time: 1 minute 0 seconds

          $ ant live-ep
          Buildfile: build.xml

          all:
              [javac] Compiling 3 source files to C:\src\java\sphinx4\bld\classes

          live-ep:
               [java] Parsing file decoders.list ......done parsing decoders.list
               [java] Initializing first decoder: Isolated Digits ...
               [java] ... done initializing
               [java] Changing to Isolated Digits recognizer ...
               [java] # ----------- linguist stats ------------
               [java] # Total states: 256
               [java] # class edu.cmu.sphinx.linguist.flat.BranchState: 12
               [java] # class edu.cmu.sphinx.linguist.flat.GrammarState: 1
               [java] # class edu.cmu.sphinx.linguist.flat.NonEmittingHMMState: 46
               [java] # class edu.cmu.sphinx.linguist.flat.HMMStateState: 138
               [java] # class edu.cmu.sphinx.linguist.flat.PronunciationState: 13
               [java] # class edu.cmu.sphinx.linguist.flat.ExtendedUnitState: 46
               [java] ... done changing

               [java] REF:       zero
               [java] HYP:       zero

               [java]    Accuracy: 100.000%    Errors: 0  (Sub: 0  Ins: 0  Del: 0)
               [java]    Words: 1   Matches: 1    WER: 0.000%
               [java]    Sentences: 1   Matches: 1   SentenceAcc: 100.000%
               [java]    This  Time Audio: 5.40s  Proc: 5.62s  Speed: 1.04 X real time
               [java]    Total Time Audio: 5.40s  Proc: 5.62s  Speed: 1.04 X real time
               [java]    Mem  Total: 9.45 Mb  Free: 2.15 Mb
               [java]    Used: This: 7.30 Mb  Avg: 7.30 Mb  Max: 7.30 Mb

          BUILD SUCCESSFUL
          Total time: 50 seconds
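The "Speed" and "WER" figures in the two logs above are consistent with the usual definitions: processing time divided by audio duration, and (substitutions + insertions + deletions) divided by reference words. The sketch below only reproduces that arithmetic from the logged numbers; it assumes, rather than confirms, that Sphinx-4 computes them this way, and the class name RealTimeFactor is made up.

```java
import java.util.Locale;

class RealTimeFactor {

    /** Processing time divided by audio duration, formatted to two decimals. */
    static String speed(double procSeconds, double audioSeconds) {
        return String.format(Locale.ROOT, "%.2f", procSeconds / audioSeconds);
    }

    /** Word error rate in percent: (substitutions + insertions + deletions) / reference words. */
    static double wer(int sub, int ins, int del, int words) {
        return 100.0 * (sub + ins + del) / words;
    }

    public static void main(String[] args) {
        // Values copied from the console output in this post.
        System.out.println("live-free speed: " + speed(5.12, 5.44) + " X real time"); // 0.94
        System.out.println("live-ep speed:   " + speed(5.62, 5.40) + " X real time"); // 1.04
        System.out.println("WER: " + wer(0, 0, 0, 1) + "%");                          // 0.0%
    }
}
```

A value below 1.0 means the utterance was decoded faster than it took to speak, so in both runs the decoder kept up with the audio for the first utterance.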

          The tidigits.config.xml, as monkeyed with by me (without any apparent effect):

          <?xml version="1.0" encoding="UTF-8"?>

          <!--
             Sphinx-4 Configuration file
          -->

          <!-- ******************************************************** -->
          <!--  tidigits configuration file                             -->
          <!-- ******************************************************** -->

          <config>   
             
              <!-- ******************************************************** -->
              <!-- frequently tuned properties                              -->
              <!-- ******************************************************** -->
             
              <property name="absoluteBeamWidth"         value="200"/>
              <property name="wordInsertionProbability"     value="1E-36"/>
              <property name="silenceInsertionProbability"    value="1"/>
              <property name="languageWeight"     value="7"/>
              <property name="frontend"     value="mfcFrontEnd"/>
             
             
              <!-- ******************************************************** -->
              <!-- The recognizer configuration                             -->
              <!-- ******************************************************** -->
             
              <component name="recognizer" type="edu.cmu.sphinx.recognizer.Recognizer">
                  <property name="decoder" value="digitsDecoder"/>
                  <propertylist name="monitors">
                      <item>accuracyTracker </item>
                      <item>speedTracker </item>
                      <item>memoryTracker </item>
                      <item>recognizerMonitor </item>
                  </propertylist>
              </component>
             
              <!-- ******************************************************** -->
              <!-- The Decoder   configuration                              -->
              <!-- ******************************************************** -->
             
              <component name="digitsDecoder" type="edu.cmu.sphinx.decoder.Decoder">
                  <property name="searchManager" value="searchManager"/>
                  <property name="featureBlockSize" value="50"/>
              </component>
             
              <component name="searchManager"
                  type="edu.cmu.sphinx.decoder.search.SimpleBreadthFirstSearchManager">
                  <property name="logMath" value="logMath"/>
                  <property name="linguist" value="flatLinguist"/>
                  <property name="pruner" value="trivialPruner"/>
                  <property name="scorer" value="simpleScorer"/>
                  <property name="activeListFactory" value="activeList"/>
              </component>
             
             
              <component name="activeList"
                   type="edu.cmu.sphinx.decoder.search.SortingActiveListFactory">
                  <property name="logMath" value="logMath"/>
                  <property name="absoluteBeamWidth" value="${absoluteBeamWidth}"/>
              </component>
             
              <component name="trivialPruner"
                  type="edu.cmu.sphinx.decoder.pruner.SimplePruner"/>
             
              <component name="simpleScorer"
                  type="edu.cmu.sphinx.decoder.scorer.SimpleAcousticScorer">
                  <property name="frontend" value="${frontend}"/>
              </component>
             
             
              <!-- ******************************************************** -->
              <!-- The linguist  configuration                              -->
              <!-- ******************************************************** -->
             
              <component name="flatLinguist"
                  type="edu.cmu.sphinx.linguist.flat.FlatLinguist">
                  <property name="logMath" value="logMath"/>
                  <property name="grammar" value="wordListGrammar"/>
                  <property name="acousticModel" value="acousticModel"/>
                  <property name="wordInsertionProbability"
                      value="${wordInsertionProbability}"/>
                  <property name="silenceInsertionProbability"
                       value="${silenceInsertionProbability}"/>
                  <property name="languageWeight" value="${languageWeight}"/>
                  <property name="unitManager" value="unitManager"/>
              </component>
             
              <component name="wordListGrammar"
                  type="edu.cmu.sphinx.linguist.language.grammar.SimpleWordListGrammar">
                  <property name="path" value="../performance/tidigits/tidigits.wordlist"/>
                  <property name="isLooping" value="true"/>
                  <property name="dictionary" value="dictionary"/>
                  <property name="logMath" value="logMath"/>
              </component>
             
              <component name="acousticModel"
                         type="edu.cmu.sphinx.model.acoustic.TIDIGITS_8gau_13dCep_16k_40mel_130Hz_6800Hz.Model">
                  <property name="loader" value="sphinx3Loader"/>
                  <property name="unitManager" value="unitManager"/>
              </component>

              <component name="sphinx3Loader"
                         type="edu.cmu.sphinx.model.acoustic.TIDIGITS_8gau_13dCep_16k_40mel_130Hz_6800Hz.ModelLoader">
                  <property name="logMath" value="logMath"/>
                  <property name="unitManager" value="unitManager"/>
              </component>

              <component name="dictionary"
                         type="edu.cmu.sphinx.linguist.dictionary.FullDictionary">
                  <property name="dictionaryPath"
               value="resource:/edu.cmu.sphinx.model.acoustic.TIDIGITS_8gau_13dCep_16k_40mel_130Hz_6800Hz.Model!/edu/cmu/sphinx/model/acoustic/TIDIGITS_8gau_13dCep_16k_40mel_130Hz_6800Hz/dictionary"/>
                  <property name="fillerPath"
               value="resource:/edu.cmu.sphinx.model.acoustic.TIDIGITS_8gau_13dCep_16k_40mel_130Hz_6800Hz.Model!/edu/cmu/sphinx/model/acoustic/TIDIGITS_8gau_13dCep_16k_40mel_130Hz_6800Hz/fillerdict"/>
                  <property name="addSilEndingPronunciation" value="false"/>
                  <property name="unitManager" value="unitManager"/>
              </component>
             
             
              <!-- ******************************************************** -->
              <!-- The unit manager configuration                           -->
              <!-- ******************************************************** -->

              <component name="unitManager"
                  type="edu.cmu.sphinx.linguist.acoustic.UnitManager"/>

             
              <!-- ******************************************************** -->
              <!-- The frontend configuration                               -->
              <!-- ******************************************************** -->
             
              <component name="mfcFrontEnd" type="edu.cmu.sphinx.frontend.FrontEnd">
                  <propertylist name="pipeline">
                      <item>speechClassifier </item>
                      <item>speechMarker </item>
                      <item>nonSpeechDataFilter </item>
                      <item>microphone </item>
                      <item>premphasizer </item>
                      <item>windower </item>
                      <item>fft </item>
                      <item>melFilterBank </item>
                      <item>dct </item>
                      <item>liveCMN </item>
                      <item>featureExtraction </item>
                  </propertylist>
              </component>

              <!-- ******************************************************** -->
              <!-- The live frontend configuration                          -->
              <!-- ******************************************************** -->
              <component name="epFrontEnd" type="edu.cmu.sphinx.frontend.FrontEnd">
                  <propertylist name="pipeline">
                      <item>speechClassifier </item>
                      <item>speechMarker </item>
                      <item>nonSpeechDataFilter </item>
                      <item>microphone </item>
                      <item>premphasizer </item>
                      <item>windower </item>
                      <item>fft </item>
                      <item>melFilterBank </item>
                      <item>dct </item>
                      <item>liveCMN </item>
                      <item>featureExtraction </item>
                  </propertylist>
              </component>
             
              <component name="microphone"
                          type="edu.cmu.sphinx.frontend.util.Microphone">
                  <property name="keepLastAudio" value="false"/>
              </component>

             
              <component name="speechClassifier"
                          type="edu.cmu.sphinx.frontend.endpoint.SpeechClassifier">
                  <property name="threshold" value="13"/>
              </component>
             
              <component name="nonSpeechDataFilter"
                          type="edu.cmu.sphinx.frontend.endpoint.NonSpeechDataFilter"/>
             
              <component name="speechMarker"
                          type="edu.cmu.sphinx.frontend.endpoint.SpeechMarker" >
                  <!-- <property name="speechTrailer" value="50"/> -->
              </component>
             

              <component name="premphasizer"
                  type="edu.cmu.sphinx.frontend.filter.Preemphasizer"/>
             
              <component name="windower"
                  type="edu.cmu.sphinx.frontend.window.RaisedCosineWindower">
              </component>
             
              <component name="fft"
                      type="edu.cmu.sphinx.frontend.transform.DiscreteFourierTransform"/>
             
              <component name="melFilterBank"
                      type="edu.cmu.sphinx.frontend.frequencywarp.MelFrequencyFilterBank">
              </component>
             
              <component name="dct"
                      type="edu.cmu.sphinx.frontend.transform.DiscreteCosineTransform"/>
             
              <component name="liveCMN"
                          type="edu.cmu.sphinx.frontend.feature.LiveCMN"/>
             
              <component name="featureExtraction"
                  type="edu.cmu.sphinx.frontend.feature.DeltasFeatureExtractor"/>
             
              <!-- ******************************************************* -->
              <!--  monitors                                               -->
              <!-- ******************************************************* -->
             
              <component name="accuracyTracker"
                  type="edu.cmu.sphinx.instrumentation.AccuracyTracker">
                  <property name="recognizer" value="recognizer"/>
                  <property name="showAlignedResults" value="false"/>
                  <property name="showRawResults" value="false"/>
              </component>
             
              <component name="memoryTracker"
                  type="edu.cmu.sphinx.instrumentation.MemoryTracker">
                  <property name="recognizer" value="recognizer"/>
              </component>
             
              <component name="speedTracker"
                  type="edu.cmu.sphinx.instrumentation.SpeedTracker">
                  <property name="recognizer" value="recognizer"/>
                  <property name="frontend" value="${frontend}"/>
              </component>
             
              <component name="recognizerMonitor"
                  type="edu.cmu.sphinx.instrumentation.RecognizerMonitor">
                  <property name="recognizer" value="recognizer"/>
                  <propertylist name="allocatedMonitors">
                      <item>linguistStats
                      </item>
                  </propertylist>
              </component>
             
              <component name="linguistStats"
                  type="edu.cmu.sphinx.linguist.util.LinguistStats">
                  <property name="linguist" value="flatLinguist"/>
              </component>
             
              <!-- ******************************************************* -->
              <!--  Miscellaneous components                               -->
              <!-- ******************************************************* -->
             
              <component name="logMath" type="edu.cmu.sphinx.util.LogMath">
                  <property name="logBase" value="1.0001"/>
              </component>
             
          </config>

           
    • Philip Kwok

      Philip Kwok - 2004-09-01

      Hi Andrew,

      I should have been more clear about this - sorry about it. I didn't mean that you add the 3 components to the front end, I just meant changing the 'threshold' property of 'edu.cmu.sphinx.frontend.endpoint.SpeechClassifier' to some other value. So, I would recommend using the original tidigits.config.xml, and then just changing the threshold to a smaller or bigger value (in increments of 1).
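Trying several threshold values by hand means editing the XML each time. As a rough sketch, the variants could be generated with the JDK's built-in DOM and transformer classes. ThresholdTuner and setProperty are hypothetical names, the inline XML is a cut-down fragment, and the starting value of 13 is taken from the config posted earlier in this thread rather than from the shipped file.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ThresholdTuner {

    /** Returns configXml with the named property of the named component set to newValue. */
    static String setProperty(String configXml, String component, String property, String newValue) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(configXml)));
            NodeList components = doc.getElementsByTagName("component");
            for (int i = 0; i < components.getLength(); i++) {
                Element c = (Element) components.item(i);
                if (!component.equals(c.getAttribute("name"))) {
                    continue;
                }
                NodeList props = c.getElementsByTagName("property");
                for (int j = 0; j < props.getLength(); j++) {
                    Element p = (Element) props.item(j);
                    if (property.equals(p.getAttribute("name"))) {
                        p.setAttribute("value", newValue);
                    }
                }
            }
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not rewrite config", e);
        }
    }

    public static void main(String[] args) {
        // Cut-down fragment for illustration only; a real run would read the whole config file.
        String original = "<config><component name=\"speechClassifier\""
                + " type=\"edu.cmu.sphinx.frontend.endpoint.SpeechClassifier\">"
                + "<property name=\"threshold\" value=\"13\"/></component></config>";
        for (int threshold = 11; threshold <= 15; threshold++) {
            System.out.println(setProperty(original, "speechClassifier", "threshold",
                    String.valueOf(threshold)));
        }
    }
}
```

Each printed variant differs only in the threshold value, which matches the suggestion to move in increments of 1 and retest.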

      I also found a bug in the latest live test. I'll fix it and let you know. Also, have you tried upgrading to JDK 1.5?

      philip

       
      • Andrew MacGinitie

        Philip,
        I've left off testing since Wednesday last. I did install JDK 1.5, but when I installed that it broke ant; ant now complains that it can't find tools.jar. If you happen to know a quick fix for that, could you let me know? Otherwise I probably will resume my efforts Thursday or Friday.

        Thanks,
        Andrew

         
    • Philip Kwok

      Philip Kwok - 2004-09-01

      Hi Andrew,

      The bug I mentioned above has been fixed, although I don't think it has anything to do with the problem you're seeing. In case you're accessing Sphinx-4 via cvs, you might want to update your code.

      There's another thing you can try: in tidigits/tidigits.config.xml, find the configuration for the 'edu.cmu.sphinx.frontend.util.Microphone'. Then add the following line after that for 'keepLastAudio':

      <property name="closeBetweenUtterances" value="false"/>

      We put this flag in due to some JavaSound problems we see on Linux.

      Also, you should set 'keepLastAudio' to true. Doing so allows you to play back what you just said ('Play' button). Try playing back what you said to see if any recording actually happened.

      philip

       
    • Philip Kwok

      Philip Kwok - 2004-09-07

      Andrew,

      Check two things:

      1) you're using Ant 1.6.0 or better
      2) that you've set your JAVA_HOME environment variable to your JDK 1.5 installation. I think ANT is trying to find $JAVA_HOME/lib/tools.jar

      hope that helps.

      philip

       
      • Andrew MacGinitie

        I am using ant 1.6.2. I downloaded and installed the J2SDK 1.5 Beta in what I thought was the default manner [except I put it in c:/jdk instead of under the "Program Files" directory that it would have put itself in by default]; there is no c:/jdk/lib/tools.jar (as there was in the 1.4.2 JDK installation). I'll try to locate the 1.5 tools.jar, and if it exists, copy it to jdk/lib as a quick fix until I learn how to reconfigure ant.

        Thanks!
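Locating tools.jar can be done with a short directory search instead of browsing by hand. The sketch below is illustrative only: ToolsJarFinder is a made-up name, the search depth of 4 is arbitrary, and it assumes Java 8 or later (Files.walk and streams), so it would have to be run with a newer JVM than the ones in this thread.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ToolsJarFinder {

    /** Returns every regular file named fileName under root, searching at most maxDepth levels deep. */
    static List<Path> find(Path root, String fileName, int maxDepth) {
        if (!Files.isDirectory(root)) {
            return Collections.emptyList(); // nothing to search
        }
        try (Stream<Path> paths = Files.walk(root, maxDepth)) {
            return paths.filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().equalsIgnoreCase(fileName))
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Directory to search: first argument if given, otherwise JAVA_HOME.
        String home = args.length > 0 ? args[0] : System.getenv("JAVA_HOME");
        if (home == null) {
            System.out.println("pass a JDK directory or set JAVA_HOME");
            return;
        }
        Path root = Paths.get(home);
        Path expected = root.resolve("lib").resolve("tools.jar");
        System.out.println("expected " + expected + " exists: " + Files.exists(expected));
        for (Path p : find(root, "tools.jar", 4)) {
            System.out.println("found: " + p);
        }
    }
}
```

If the file turns up somewhere other than lib under the directory JAVA_HOME names, that suggests JAVA_HOME points at the wrong directory (for example a JRE rather than the JDK), in which case correcting JAVA_HOME is likely a cleaner fix than copying the jar.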

         
