Help with setting grammar

Help
franchan
2006-03-31
2012-09-22
  • franchan

    franchan - 2006-03-31

    Hi all,

    I am trying to convert English speech (.wav) into text using the Transcriber demo in Sphinx-4.

    In config.xml I have modified the dictionary configuration to use the WSJ model, and I have also changed the path in the manifest to point to the WSJ jar file under /lib.

    Everything compiles and runs, though it only outputs digits.

    I tried to modify the .gram file, but I am not sure how to ignore the grammar rules, or how to make the grammar match anything, so that the output is the text of the words spoken in the .wav.

    Can anyone give me some ideas?

    Thanks in advance.

    -Fran

    Below is the modified config.xml for reference.

    <config>

    <!-- ******************************************************** -->
    <!-- frequently tuned properties                              -->
    <!-- ******************************************************** -->

    <property name="logLevel" value="WARNING"/>

    <property name="absoluteBeamWidth"  value="-1"/>
    <property name="relativeBeamWidth"  value="1E-80"/>
    <property name="wordInsertionProbability" value="1E-36"/>
    <property name="languageWeight"     value="8"/>

    <property name="frontend" value="epFrontEnd"/>
    <property name="recognizer" value="recognizer"/>
    <property name="showCreations" value="false"/>

    <!-- ******************************************************** -->
    <!-- word recognizer configuration                            -->
    <!-- ******************************************************** -->

    <component name="recognizer" type="edu.cmu.sphinx.recognizer.Recognizer">
        <property name="decoder" value="decoder"/>
        <propertylist name="monitors">
            <item>accuracyTracker </item>
            <item>speedTracker </item>
            <item>memoryTracker </item>
        </propertylist>
    </component>

    <!-- ******************************************************** -->
    <!-- The Decoder   configuration                              -->
    <!-- ******************************************************** -->

    <component name="decoder" type="edu.cmu.sphinx.decoder.Decoder">
        <property name="searchManager" value="searchManager"/>
    </component>

    <component name="searchManager" 
        type="edu.cmu.sphinx.decoder.search.SimpleBreadthFirstSearchManager">
        <property name="logMath" value="logMath"/>
        <property name="linguist" value="flatLinguist"/>
        <property name="pruner" value="trivialPruner"/>
        <property name="scorer" value="threadedScorer"/>
        <property name="activeListFactory" value="activeList"/>
    </component>

    <component name="activeList" 
             type="edu.cmu.sphinx.decoder.search.PartitionActiveListFactory">
        <property name="logMath" value="logMath"/>
        <property name="absoluteBeamWidth" value="${absoluteBeamWidth}"/>
        <property name="relativeBeamWidth" value="${relativeBeamWidth}"/>
    </component>

    <component name="trivialPruner" 
                type="edu.cmu.sphinx.decoder.pruner.SimplePruner"/>

    <component name="threadedScorer" 
                type="edu.cmu.sphinx.decoder.scorer.ThreadedAcousticScorer">
        <property name="frontend" value="${frontend}"/>
        <property name="isCpuRelative" value="true"/>
        <property name="numThreads" value="0"/>
        <property name="minScoreablesPerThread" value="10"/>
        <property name="scoreablesKeepFeature" value="true"/>
    </component>

    <!-- ******************************************************** -->
    <!-- The linguist  configuration                              -->
    <!-- ******************************************************** -->

    <component name="flatLinguist" 
                type="edu.cmu.sphinx.linguist.flat.FlatLinguist">
        <property name="logMath" value="logMath"/>
        <property name="grammar" value="jsgfGrammar"/>
        <property name="acousticModel" value="wsj"/>
        <property name="wordInsertionProbability" 
                value="${wordInsertionProbability}"/>
        <property name="languageWeight" value="${languageWeight}"/>
        <property name="unitManager" value="unitManager"/>
    </component>

    <!-- ******************************************************** -->
    <!-- The Grammar  configuration                               -->
    <!-- ******************************************************** -->

    <component name="jsgfGrammar" type="edu.cmu.sphinx.jsapi.JSGFGrammar">
        <property name="dictionary" value="dictionary"/>
        <property name="grammarLocation" 
             value="resource:/demo.sphinx.transcriber.Transcriber!/demo/sphinx/transcriber/"/>
        <property name="grammarName" value="digits"/>
        <property name="logMath" value="logMath"/>
    </component>

    <!-- ******************************************************** -->
    <!-- The Dictionary configuration                            -->
    <!-- ******************************************************** -->

    <component name="dictionary" 
        type="edu.cmu.sphinx.linguist.dictionary.FastDictionary">
        <property name="dictionaryPath" 
     value="resource:/edu.cmu.sphinx.model.acoustic.WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.Model!/edu/cmu/sphinx/model/acoustic/WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz/dict/cmudict.0.6d"/>
        <property name="fillerPath" 
     value="resource:/edu.cmu.sphinx.model.acoustic.WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.Model!/edu/cmu/sphinx/model/acoustic/WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz/dict/fillerdict"/>
        <property name="addSilEndingPronunciation" value="false"/>
        <property name="allowMissingWords" value="false"/>
        <property name="unitManager" value="unitManager"/>
    </component>

    <!-- ******************************************************** -->
    <!-- The acoustic model configuration                         -->
    <!-- ******************************************************** -->
    <component name="wsj"
               type="edu.cmu.sphinx.model.acoustic.WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.Model">
        <property name="loader" value="wsjLoader"/>
        <property name="unitManager" value="unitManager"/>
    </component>

    <component name="wsjLoader" type="edu.cmu.sphinx.model.acoustic.WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.ModelLoader">
        <property name="logMath" value="logMath"/>
        <property name="unitManager" value="unitManager"/>
    </component>

    <!-- ******************************************************** -->
    <!-- The unit manager configuration                           -->
    <!-- ******************************************************** -->

    <component name="unitManager" 
        type="edu.cmu.sphinx.linguist.acoustic.UnitManager"/>

    <!-- ******************************************************** -->
    <!-- The live frontend configuration                          -->
    <!-- ******************************************************** -->
    <component name="epFrontEnd" type="edu.cmu.sphinx.frontend.FrontEnd">
        <propertylist name="pipeline">
            <item>streamDataSource </item>
            <item>speechClassifier </item>
            <item>speechMarker </item>
            <item>nonSpeechDataFilter </item>
            <item>premphasizer </item>
            <item>windower </item>
            <item>fft </item>
            <item>melFilterBank </item>
            <item>dct </item>
            <item>liveCMN </item>
            <item>featureExtraction </item>
        </propertylist>
    </component>

    <!-- ******************************************************** -->
    <!-- The frontend pipelines                                   -->
    <!-- ******************************************************** -->

    <component name="streamDataSource"
                type="edu.cmu.sphinx.frontend.util.StreamDataSource">
        <property name="sampleRate" value="16000"/>
        <property name="bitsPerSample" value="16"/>
        <property name="bigEndianData" value="false"/>
        <property name="signedData" value="true"/>
        <property name="bytesPerRead" value="320"/>
    </component>

    <component name="speechClassifier" 
               type="edu.cmu.sphinx.frontend.endpoint.SpeechClassifier">
        <property name="threshold" value="13"/>
    </component>

    <component name="nonSpeechDataFilter" 
               type="edu.cmu.sphinx.frontend.endpoint.NonSpeechDataFilter"/>

    <component name="speechMarker" 
               type="edu.cmu.sphinx.frontend.endpoint.SpeechMarker" >
        <property name="speechTrailer" value="50"/>
    </component>

    <component name="premphasizer" 
               type="edu.cmu.sphinx.frontend.filter.Preemphasizer"/>

    <component name="windower" 
               type="edu.cmu.sphinx.frontend.window.RaisedCosineWindower">
    </component>

    <component name="fft" 
            type="edu.cmu.sphinx.frontend.transform.DiscreteFourierTransform">
    </component>

    <component name="melFilterBank" 
        type="edu.cmu.sphinx.frontend.frequencywarp.MelFrequencyFilterBank">
    </component>

    <component name="dct" 
            type="edu.cmu.sphinx.frontend.transform.DiscreteCosineTransform"/>

    <component name="liveCMN" 
               type="edu.cmu.sphinx.frontend.feature.LiveCMN"/>

    <component name="featureExtraction" 
               type="edu.cmu.sphinx.frontend.feature.DeltasFeatureExtractor"/>

    <!-- ******************************************************* -->
    <!--  monitors                                               -->
    <!-- ******************************************************* -->

    <component name="accuracyTracker" 
                type="edu.cmu.sphinx.instrumentation.AccuracyTracker">
        <property name="recognizer" value="${recognizer}"/>
        <property name="showAlignedResults" value="false"/>
        <property name="showRawResults" value="false"/>
    </component>

    <component name="memoryTracker" 
                type="edu.cmu.sphinx.instrumentation.MemoryTracker">
        <property name="recognizer" value="${recognizer}"/>
        <property name="showSummary" value="false"/>
        <property name="showDetails" value="false"/>
    </component>

    <component name="speedTracker" 
                type="edu.cmu.sphinx.instrumentation.SpeedTracker">
        <property name="recognizer" value="${recognizer}"/>
        <property name="frontend" value="${frontend}"/>
        <property name="showSummary" value="true"/>
        <property name="showDetails" value="false"/>
    </component>

    <!-- ******************************************************* -->
    <!--  Miscellaneous components                               -->
    <!-- ******************************************************* -->

    <component name="logMath" type="edu.cmu.sphinx.util.LogMath">
        <property name="logBase" value="1.0001"/>
        <property name="useAddTable" value="true"/>
    </component>

    </config>

     
    • Anonymous

      Anonymous - 2006-04-04

      Could the problem be here?

      <property name="grammarName" value="digits"/>

       
    • franchan

      franchan - 2006-04-04

      Hi Chris,

      Thanks for the reply. However, maybe my title, or even my post, was unclear.

      In particular, digits.gram contains the digit grammar, i.e. one | two | .... However, if I want to include every English word in the transcription, do I need to write something like a | aaa | aaberg | ... all the way to z? That is the only (stupid) solution I can think of at the moment.

      Is there an easier way of doing it?

      Thanks a lot.

      -Francis
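
      For illustration, the brute-force idea above (one grammar alternative per dictionary word) could be generated by a script rather than typed by hand. The sketch below is plain Python and not part of Sphinx-4. The function names, the grammar name `words`, and the assumption that each dictionary line starts with the word followed by its phonemes are all hypothetical. Words containing anything other than letters are skipped, which is a simplification. With a full pronunciation dictionary the resulting flat grammar would be very large, so this shows the idea rather than a recommended setup.

```python
import re


def words_from_dictionary(lines):
    """Collect unique words from dictionary lines shaped like "word PH PH ...".

    Assumptions (not taken from the thread): comment lines start with ";;;"
    and alternate pronunciations are marked with a "(2)"-style suffix.
    """
    words = []
    seen = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith(";;;"):
            continue
        token = line.split()[0]
        token = re.sub(r"\(\d+\)$", "", token)  # "read(2)" -> "read"
        if not token.isalpha():
            continue  # simplification: skip tokens with punctuation
        if token not in seen:
            seen.add(token)
            words.append(token)
    return words


def build_jsgf(name, words):
    """Return JSGF text whose public rule matches one or more of the words."""
    alternatives = " | ".join(words)
    return (
        "#JSGF V1.0;\n\n"
        f"grammar {name};\n\n"
        f"public <{name}> = ( {alternatives} )+;\n"
    )


if __name__ == "__main__":
    # Tiny made-up sample standing in for a real dictionary file.
    sample = [
        ";;; sample entries",
        "one  W AH N",
        "two  T UW",
        "read  R EH D",
        "read(2)  R IY D",
    ]
    print(build_jsgf("words", words_from_dictionary(sample)))
```

      The output would be saved as a .gram file next to digits.gram, with the grammarName property changed to match the new file name.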

       
