Alphabet and Digits

2006-08-13
2012-09-22
  • Anonymous

    Anonymous - 2006-08-13

    Hello,

    I am trying to detect letters and numbers in a telephonic recording, and I'm finding that the configuration is a little complex for me.

    I have installed Ant and the 1.0-beta sources, and I can compile the demos. My idea was to use the WavFile demo as a starting point and modify its configuration file to include a larger dictionary, then to create a grammar including the letters as well as digits.

    My first step was to replace TIDIGITS_8gau_13dCep_16k_40mel_130Hz_6800Hz in the configuration file with WSJ_8gau_13dCep_8kHz_31mel_200Hz_3500Hz. I did notice that the name of the 'dictionary' in each of those .jar archives is different, and I changed it accordingly.

    I then ran ant -buildfile demo.xml

    and on running the newly compiled wavfile.jar I received the following output:

    bash-3.00$ java -jar WavFile.jar 10001-90210-01803.wav
    Loading Recognizer...

    Exception in thread "main" java.lang.NullPointerException
    at edu.cmu.sphinx.model.acoustic.WSJ_8gau_13dCep_8kHz_31mel_200Hz_3500Hz.ModelLoader.loadProperties(ModelLoader.java:372)
    at edu.cmu.sphinx.model.acoustic.WSJ_8gau_13dCep_8kHz_31mel_200Hz_3500Hz.ModelLoader.getIsBinaryDefault(ModelLoader.java:386)
    at edu.cmu.sphinx.model.acoustic.WSJ_8gau_13dCep_8kHz_31mel_200Hz_3500Hz.ModelLoader.newProperties(ModelLoader.java:346)
    at edu.cmu.sphinx.util.props.ConfigurationManager.lookup(ConfigurationManager.java:214)
    at edu.cmu.sphinx.util.props.ValidatingPropertySheet.getComponent(ValidatingPropertySheet.java:403)
    at edu.cmu.sphinx.model.acoustic.WSJ_8gau_13dCep_8kHz_31mel_200Hz_3500Hz.Model.newProperties(Model.java:159)
    at edu.cmu.sphinx.util.props.ConfigurationManager.lookup(ConfigurationManager.java:214)
    at edu.cmu.sphinx.util.props.ValidatingPropertySheet.getComponent(ValidatingPropertySheet.java:403)
    at edu.cmu.sphinx.linguist.flat.FlatLinguist.setupAcousticModel(FlatLinguist.java:299)
    at edu.cmu.sphinx.linguist.flat.FlatLinguist.newProperties(FlatLinguist.java:246)
    at edu.cmu.sphinx.util.props.ConfigurationManager.lookup(ConfigurationManager.java:214)
    at edu.cmu.sphinx.util.props.ValidatingPropertySheet.getComponent(ValidatingPropertySheet.java:403)
    at edu.cmu.sphinx.decoder.search.SimpleBreadthFirstSearchManager.newProperties(SimpleBreadthFirstSearchManager.java:180)
    at edu.cmu.sphinx.util.props.ConfigurationManager.lookup(ConfigurationManager.java:214)
    at edu.cmu.sphinx.util.props.ValidatingPropertySheet.getComponent(ValidatingPropertySheet.java:403)
    at edu.cmu.sphinx.decoder.Decoder.newProperties(Decoder.java:71)
    at edu.cmu.sphinx.util.props.ConfigurationManager.lookup(ConfigurationManager.java:214)
    at edu.cmu.sphinx.util.props.ValidatingPropertySheet.getComponent(ValidatingPropertySheet.java:403)
    at edu.cmu.sphinx.recognizer.Recognizer.newProperties(Recognizer.java:93)
    at edu.cmu.sphinx.util.props.ConfigurationManager.lookup(ConfigurationManager.java:214)
    at demo.sphinx.wavfile.WavFile.main(WavFile.java:62)
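The trace shows the exception inside ModelLoader.loadProperties but not which reference was null. As a general Java fact, Class.getResource returns null rather than throwing when a resource is not on the classpath, so one thing worth ruling out is whether the model class and the files packaged with it are visible at run time. The sketch below is a standalone diagnostic in plain Java. It is not part of Sphinx-4, the class name ClasspathCheck is hypothetical, and whether a missing resource is the actual cause of this trace is an assumption.

```java
import java.net.URL;

/** Hypothetical helper: reports whether a class and a resource beside it are on the classpath. */
class ClasspathCheck {

    /**
     * Returns a one-line report for the given class name and a resource
     * looked up relative to that class's package.
     */
    static String describe(String className, String resourceName) {
        Class<?> cls;
        try {
            cls = Class.forName(className);
        } catch (ClassNotFoundException e) {
            return "class NOT found: " + className;
        }
        // Class.getResource returns null (it does not throw) when nothing matches.
        URL url = cls.getResource(resourceName);
        if (url == null) {
            return "class found, resource NOT found: " + resourceName;
        }
        return "class found, resource found: " + url;
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: java ClasspathCheck <className> <resourceName>");
            return;
        }
        System.out.println(describe(args[0], args[1]));
    }
}
```

To use it, run it with the same classpath as WavFile.jar, passing the ModelLoader class name from the trace and the name of a file known to be packaged beside it in the model .jar.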

    I've included my current config.xml below. I feel I've made a fairly dumb mistake, but I'm afraid this is all a little over my head :( I'd greatly appreciate any assistance, even a smack and a shove towards the bit of the docs which plainly states what it is I am missing or doing wrong.

    Thank you!
    Campbell

    config.xml

    <?xml version="1.0" encoding="UTF-8"?>

    <!--
    Sphinx-4 Configuration file
    -->

    <!-- ******** -->
    <!-- an4 configuration file -->
    <!-- ******** -->

    <config>

    <!-- ******************************************************** -->
    <!-- frequently tuned properties                              -->
    <!-- ******************************************************** -->

    <property name="logLevel" value="WARNING"/>

    <property name="absoluteBeamWidth"  value="-1"/>
    <property name="relativeBeamWidth"  value="1E-80"/>
    <property name="wordInsertionProbability" value="1E-36"/>
    <property name="languageWeight"     value="8"/>

    <property name="frontend" value="epFrontEnd"/>
    <property name="recognizer" value="recognizer"/>
    <property name="showCreations" value="false"/>

    <!-- ******************************************************** -->
    <!-- word recognizer configuration                            -->
    <!-- ******************************************************** -->

    <component name="recognizer" type="edu.cmu.sphinx.recognizer.Recognizer">
        <property name="decoder" value="decoder"/>
        <propertylist name="monitors">
            <item>accuracyTracker </item>
            <item>speedTracker </item>
            <item>memoryTracker </item>
        </propertylist>
    </component>

    <!-- ******************************************************** -->
    <!-- The Decoder   configuration                              -->
    <!-- ******************************************************** -->

    <component name="decoder" type="edu.cmu.sphinx.decoder.Decoder">
        <property name="searchManager" value="searchManager"/>
    </component>

    <component name="searchManager"
        type="edu.cmu.sphinx.decoder.search.SimpleBreadthFirstSearchManager">
        <property name="logMath" value="logMath"/>
        <property name="linguist" value="flatLinguist"/>
        <property name="pruner" value="trivialPruner"/>
        <property name="scorer" value="threadedScorer"/>
        <property name="activeListFactory" value="activeList"/>
    </component>

    <component name="activeList"
             type="edu.cmu.sphinx.decoder.search.PartitionActiveListFactory">
        <property name="logMath" value="logMath"/>
        <property name="absoluteBeamWidth" value="${absoluteBeamWidth}"/>
        <property name="relativeBeamWidth" value="${relativeBeamWidth}"/>
    </component>

    <component name="trivialPruner"
                type="edu.cmu.sphinx.decoder.pruner.SimplePruner"/>

    <component name="threadedScorer"
                type="edu.cmu.sphinx.decoder.scorer.ThreadedAcousticScorer">
        <property name="frontend" value="${frontend}"/>
        <property name="isCpuRelative" value="true"/>
        <property name="numThreads" value="0"/>
        <property name="minScoreablesPerThread" value="10"/>
        <property name="scoreablesKeepFeature" value="true"/>
    </component>

    <!-- ******************************************************** -->
    <!-- The linguist  configuration                              -->
    <!-- ******************************************************** -->

    <component name="flatLinguist"
                type="edu.cmu.sphinx.linguist.flat.FlatLinguist">
        <property name="logMath" value="logMath"/>
        <property name="grammar" value="jsgfGrammar"/>
        <property name="acousticModel" value="wsj"/>
        <property name="wordInsertionProbability"
                value="${wordInsertionProbability}"/>
        <property name="languageWeight" value="${languageWeight}"/>
        <property name="unitManager" value="unitManager"/>
    </component>

    <!-- ******************************************************** -->
    <!-- The Grammar  configuration                               -->
    <!-- ******************************************************** -->

    <component name="jsgfGrammar" type="edu.cmu.sphinx.jsapi.JSGFGrammar">
        <property name="dictionary" value="dictionary"/>
        <property name="grammarLocation"
             value="resource:/demo.sphinx.wavfile.WavFile!/demo/sphinx/wavfile/"/>
        <property name="grammarName" value="digits"/>
        <property name="logMath" value="logMath"/>
    </component>

    <!-- ******************************************************** -->
    <!-- The Dictionary configuration                            -->
    <!-- ******************************************************** -->

    <component name="dictionary"
        type="edu.cmu.sphinx.linguist.dictionary.FastDictionary">
        <property name="dictionaryPath"
     value="resource:/edu.cmu.sphinx.model.acoustic.WSJ_8gau_13dCep_8kHz_31mel_200Hz_3500Hz.Model!/edu/cmu/sphinx/model/acoustic/WSJ_
        <property name="fillerPath"
     value="resource:/edu.cmu.sphinx.model.acoustic.WSJ_8gau_13dCep_8kHz_31mel_200Hz_3500Hz.Model!/edu/cmu/sphinx/model/acoustic/WSJ_
        <property name="addSilEndingPronunciation" value="false"/>
        <property name="allowMissingWords" value="false"/>
        <property name="unitManager" value="unitManager"/>
    </component>

    <!-- ******************************************************** -->
    <!-- The acoustic model configuration                         -->
    <!-- ******************************************************** -->
    <component name="wsj"
      type="edu.cmu.sphinx.model.acoustic.WSJ_8gau_13dCep_8kHz_31mel_200Hz_3500Hz.Model">
        <property name="loader" value="wsjLoader"/>
        <property name="unitManager" value="unitManager"/>
    </component>

    <component name="wsjLoader"
               type="edu.cmu.sphinx.model.acoustic.WSJ_8gau_13dCep_8kHz_31mel_200Hz_3500Hz.ModelLoader">
        <property name="logMath" value="logMath"/>
        <property name="unitManager" value="unitManager"/>
    </component>

    <!-- ******************************************************** -->
    <!-- The unit manager configuration                           -->
    <!-- ******************************************************** -->

    <component name="unitManager"
        type="edu.cmu.sphinx.linguist.acoustic.UnitManager"/>

    <!-- ******************************************************** -->
    <!-- The frontend configuration                               -->
    <!-- ******************************************************** -->

    <component name="frontEnd" type="edu.cmu.sphinx.frontend.FrontEnd">
        <propertylist name="pipeline">
            <item>microphone </item>
            <item>premphasizer </item>
            <item>windower </item>
            <item>fft </item>
            <item>melFilterBank </item>
            <item>dct </item>
            <item>liveCMN </item>
            <item>featureExtraction </item>
        </propertylist>
    </component>

    <!-- ******************************************************** -->
    <!-- The live frontend configuration                          -->
    <!-- ******************************************************** -->
    <component name="epFrontEnd" type="edu.cmu.sphinx.frontend.FrontEnd">
        <propertylist name="pipeline">
            <item>microphone </item>
            <item>speechClassifier </item>
            <item>speechMarker </item>
            <item>nonSpeechDataFilter </item>
            <item>premphasizer </item>
            <item>windower </item>
            <item>fft </item>
            <item>melFilterBank </item>
            <item>dct </item>
            <item>liveCMN </item>
            <item>featureExtraction </item>
        </propertylist>
    </component>

    <!-- ******************************************************** -->
    <!-- The frontend pipelines                                   -->
    <!-- ******************************************************** -->

    <component name="speechClassifier"
               type="edu.cmu.sphinx.frontend.endpoint.SpeechClassifier">
        <property name="threshold" value="13"/>
    </component>

    <component name="nonSpeechDataFilter"
               type="edu.cmu.sphinx.frontend.endpoint.NonSpeechDataFilter"/>

    <component name="speechMarker"
               type="edu.cmu.sphinx.frontend.endpoint.SpeechMarker" >
        <property name="speechTrailer" value="50"/>
    </component>

    <component name="premphasizer"
               type="edu.cmu.sphinx.frontend.filter.Preemphasizer"/>

    <component name="windower"
               type="edu.cmu.sphinx.frontend.window.RaisedCosineWindower">
    </component>

    <component name="fft"
            type="edu.cmu.sphinx.frontend.transform.DiscreteFourierTransform">
    </component>

    <component name="melFilterBank"
        type="edu.cmu.sphinx.frontend.frequencywarp.MelFrequencyFilterBank">
    </component>

    <component name="dct"
            type="edu.cmu.sphinx.frontend.transform.DiscreteCosineTransform"/>

    <component name="liveCMN"
               type="edu.cmu.sphinx.frontend.feature.LiveCMN"/>

    <component name="featureExtraction"
               type="edu.cmu.sphinx.frontend.feature.DeltasFeatureExtractor"/>

    <component name="microphone"
               type="edu.cmu.sphinx.frontend.util.Microphone">
        <property name="closeBetweenUtterances" value="false"/>
    </component>

    <!-- ******************************************************* -->
    <!--  monitors                                               -->
    <!-- ******************************************************* -->

    <component name="accuracyTracker"
                type="edu.cmu.sphinx.instrumentation.AccuracyTracker">
        <property name="recognizer" value="${recognizer}"/>
        <property name="showAlignedResults" value="false"/>
        <property name="showRawResults" value="false"/>
    </component>

    <component name="memoryTracker"
                type="edu.cmu.sphinx.instrumentation.MemoryTracker">
        <property name="recognizer" value="${recognizer}"/>
        <property name="showSummary" value="false"/>
        <property name="showDetails" value="false"/>
    </component>

    <component name="speedTracker"
                type="edu.cmu.sphinx.instrumentation.SpeedTracker">
        <property name="recognizer" value="${recognizer}"/>
        <property name="frontend" value="${frontend}"/>
        <property name="showSummary" value="true"/>
        <property name="showDetails" value="false"/>
    </component>

    <!-- ******************************************************* -->
    <!--  Miscellaneous components                               -->
    <!-- ******************************************************* -->

    <component name="logMath" type="edu.cmu.sphinx.util.LogMath">
        <property name="logBase" value="1.0001"/>
        <property name="useAddTable" value="true"/>
    </component>


    </config>

     
    • Anonymous

      Anonymous - 2006-08-15

      Right. Scratch that, and sorry for the lengthy, clueless post. This one's not a lot better but I've got some questions, at least.

      1) I've set it to use the WSJ 8 kHz data file. But since I only want to detect a-z and 0-9 (I will be able to scan the database of expected values to increase accuracy), how can I reduce the memory requirements? I'm running it with java -Xmx256m, and it's running out of memory.

      2) My grammar file looks like this:

      JSGF V1.0;

      grammar transreport;
      public <alphanumeric> = (oh | zero | one | two | three | four | five | six | seven | eight | nine | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z) * ;

      Does that seem like it is correct?

      3) Is there not an available set of data files for this limited capability? This Java compile/build setup is a lot of complexity for me, and I don't think I'm doing very well.
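The grammar in question 2 accepts any sequence, including an empty one, of the 37 listed tokens (the JSGF specification writes the header line as `#JSGF V1.0;`). The sketch below illustrates that rule in plain Java, together with the idea from question 1 of checking a result against a database of expected values. It does not use any Sphinx-4 API: the class AlphanumericMatcher, its method names, and the choice of edit distance as the similarity measure are all assumptions made for illustration, and the sample values are borrowed from the WAV file name in the first post.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical post-processing helper; not part of Sphinx-4. */
class AlphanumericMatcher {

    // The tokens allowed by the grammar, mapped to the character each one stands for.
    private static final Map<String, Character> TOKENS = new HashMap<>();
    static {
        String[] digits = {"zero", "one", "two", "three", "four",
                           "five", "six", "seven", "eight", "nine"};
        for (int i = 0; i < digits.length; i++) {
            TOKENS.put(digits[i], (char) ('0' + i));
        }
        TOKENS.put("oh", '0');                 // spoken "oh" is treated as the digit 0
        for (char c = 'A'; c <= 'Z'; c++) {
            TOKENS.put(String.valueOf(c), c);  // letters are case-sensitive, as in the grammar
        }
    }

    /**
     * Turns a space-separated token string such as "one oh A" into "10A".
     * Returns null if any token is outside the grammar's alternatives.
     */
    static String normalize(String hypothesis) {
        String trimmed = hypothesis.trim();
        if (trimmed.isEmpty()) {
            return "";                         // the Kleene star also accepts zero tokens
        }
        StringBuilder sb = new StringBuilder();
        for (String token : trimmed.split("\\s+")) {
            Character c = TOKENS.get(token);
            if (c == null) {
                return null;
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /** Levenshtein distance: minimum number of single-character edits from a to b. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Returns the first expected value with the smallest distance, or null for an empty list. */
    static String closest(String recognized, List<String> expected) {
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        for (String candidate : expected) {
            int d = editDistance(recognized, candidate);
            if (d < bestDistance) {
                bestDistance = d;
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> expected = Arrays.asList("10001", "90210", "01803");
        String recognized = normalize("nine oh two one O");  // last token is the letter O
        System.out.println(recognized + " -> " + closest(recognized, expected));
    }
}
```

A nearest-match lookup like this only helps when the expected values differ from one another by more than the typical number of recognition errors; otherwise a wrong neighbour can win.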

      I won't post my config again unless it's required - it's a bit long and seems impolite.

      Thanks in advance for any pointers you can give me.

      Campbell

       
