
sphinxtrain run - running into a problem

Help
Created: 2016-03-20, last updated: 2016-10-03
  • Jens Kallup

    Jens Kallup - 2016-03-20

    I get only one ERROR:
    No word from the language model has pronunciation in the dictionary
    What is wrong, and how can I fix it? (A vocabulary-coverage check is sketched after the LM listing below.)
    Jens

    Console output (snippet):

    MODULE: DECODE Decoding using models previously trained
            Decoding 7 segments starting at 0 (part 1 of 1) 
            0% 
    ERROR: This step had 1 ERROR messages and 0 WARNING messages.  Please check the log file for details.
            Aligning results to find error rate
            SENTENCE ERROR: 0.0% (0/7)   WORD ERROR RATE: 0.0% (0/9)
    

    vocdeu.dic

    A               a
    ACHT            acht
    B               b
    DATEI           datei
    DER             der
    SPEICHERN       speichern
    
    vocdeu.lm
    #############################################################################
    ## Copyright (c) 1996, Carnegie Mellon University, Cambridge University,
    ## Ronald Rosenfeld and Philip Clarkson
    ## Version 3, Copyright (c) 2006, Carnegie Mellon University 
    ## Contributors includes Wen Xu, Ananlada Chotimongkol, 
    ## David Huggins-Daines, Arthur Chan and Alan Black 
    #############################################################################
    =============================================================================
    ===============  This file was produced by the CMU-Cambridge  ===============
    ===============     Statistical Language Modeling Toolkit     ===============
    =============================================================================
    This is a 3-gram language model, based on a vocabulary of 9 words,
      which begins "</s>", "<s>", "A"...
    This is a CLOSED-vocabulary model
      (OOVs eliminated from training data and are forbidden in test data)
    Good-Turing discounting was applied.
    1-gram frequency of frequency : 3 
    2-gram frequency of frequency : 3 0 0 0 0 0 0 
    3-gram frequency of frequency : 3 0 0 0 0 0 0 
    1-gram discounting ratios : 0.33 
    2-gram discounting ratios : 
    3-gram discounting ratios : 
    This file is in the ARPA-standard format introduced by Doug Paul.
    
    p(wd3|wd1,wd2)= if(trigram exists)           p_3(wd1,wd2,wd3)
                    else if(bigram w1,w2 exists) bo_wt_2(w1,w2)*p(wd3|wd2)
                    else                         p(wd3|w2)
    
    p(wd2|wd1)= if(bigram exists) p_2(wd1,wd2)
                else              bo_wt_1(wd1)*p_1(wd2)
    
    All probs and back-off weights (bo_wt) are given in log10 form.
    
    Data formats:
    
    Beginning of data mark: \data\
    ngram 1=nr            # number of 1-grams
    ngram 2=nr            # number of 2-grams
    ngram 3=nr            # number of 3-grams
    
    \1-grams:
    p_1     wd_1 bo_wt_1
    \2-grams:
    p_2     wd_1 wd_2 bo_wt_2
    \3-grams:
    p_3     wd_1 wd_2 wd_3 
    
    end of data mark: \end\
    
    \data\
    ngram 1=9
    ngram 2=3
    ngram 3=3
    
    \1-grams:
    -0.9542 </s>    -0.4260
    -0.9542 <s> -0.4260
    -0.9542 A   -0.4260
    -0.9542 ACHT    0.0000
    -0.9542 B   0.0000
    -0.9542 DATEI   0.0000
    -0.9542 DER 0.0000
    -0.9542 SIL 0.0000
    -0.9542 SPEICHERN   0.0000
    
    \2-grams:
    -0.1761 </s> <s> 0.1761
    -0.1761 <s> A 0.1761
    -0.1761 A ACHT -0.2499
    
    \3-grams:
    -0.3010 </s> <s> A 
    -0.3010 <s> A ACHT 
    -0.3010 A ACHT B 
    
    \end\
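
    The back-off rules quoted in the LM header above can be made concrete with a short sketch. This is an addition, not part of the original post: it hard-codes a few of the log10 values listed in vocdeu.lm and evaluates one trigram that is listed and one that has to back off.

    # Sketch: evaluate the ARPA back-off rule using values copied from vocdeu.lm.
    unigrams = {                       # word -> (log10 p_1, bo_wt_1)
        "A": (-0.9542, -0.4260), "ACHT": (-0.9542, 0.0),
        "B": (-0.9542, 0.0), "DATEI": (-0.9542, 0.0),
    }
    bigrams = {                        # (w1, w2) -> (log10 p_2, bo_wt_2)
        ("<s>", "A"): (-0.1761, 0.1761), ("A", "ACHT"): (-0.1761, -0.2499),
    }
    trigrams = {                       # (w1, w2, w3) -> log10 p_3
        ("<s>", "A", "ACHT"): -0.3010, ("A", "ACHT", "B"): -0.3010,
    }

    def log_p2(w1, w2):
        """log10 p(w2 | w1): use the bigram if listed, otherwise back off."""
        if (w1, w2) in bigrams:
            return bigrams[(w1, w2)][0]
        return unigrams[w1][1] + unigrams[w2][0]          # bo_wt_1(w1) + p_1(w2)

    def log_p3(w1, w2, w3):
        """log10 p(w3 | w1, w2), following the rule quoted in the LM header."""
        if (w1, w2, w3) in trigrams:
            return trigrams[(w1, w2, w3)]
        if (w1, w2) in bigrams:
            return bigrams[(w1, w2)][1] + log_p2(w2, w3)  # bo_wt_2(w1,w2) + p(w3|w2)
        return log_p2(w2, w3)

    print(10 ** log_p3("<s>", "A", "ACHT"))    # listed trigram: 10**-0.3010, about 0.50
    print(10 ** log_p3("A", "ACHT", "DATEI"))  # backs off twice: about 0.0625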
    
     
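    The error itself points at a vocabulary mismatch: apart from the sentence markers <s> and </s>, every word in the LM must have a pronunciation in the dictionary that the run actually uses. A minimal check is sketched below; it is an addition to the post, and it assumes the file names shown here (vocdeu.dic and vocdeu.lm) sit in the current directory.

    # Sketch: list LM words that have no entry in the pronunciation dictionary.

    def dict_words(path):
        """Read a Sphinx .dic file: one 'WORD  phone phone ...' entry per line."""
        words = set()
        with open(path, encoding="utf-8") as f:
            for line in f:
                if line.strip():
                    # Drop alternate-pronunciation markers such as WORD(2).
                    words.add(line.split()[0].split("(")[0])
        return words

    def lm_unigrams(path):
        """Collect the words listed in the \\1-grams: section of an ARPA LM."""
        words, seen_data, in_unigrams = set(), False, False
        with open(path, encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if line == "\\data\\":
                    seen_data = True            # skip the descriptive header above \data\
                elif line == "\\1-grams:" and seen_data:
                    in_unigrams = True
                elif line.startswith("\\"):
                    in_unigrams = False
                elif in_unigrams and line:
                    words.add(line.split()[1])  # entry format: log10_p  WORD  [bo_wt]
        return words

    dic = dict_words("vocdeu.dic")
    missing = sorted(w for w in lm_unigrams("vocdeu.lm")
                     if w not in dic and w not in ("<s>", "</s>"))
    print("LM words without a dictionary entry:", missing or "none")

    Run against the snippets quoted above, this flags only SIL. Whether SIL belongs in the main dictionary, in the filler dictionary, or should be left out of the LM depends on how the training run is configured, and the files on disk may of course differ from what is pasted here.
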
    • Nickolay V. Shmyrev

      It is better to provide the complete log, not just the error message. For now it is not quite clear what happened, but I suspect you used a different dictionary.
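
    A follow-up note (an addition, not part of the reply): whether the decode step really used the vocdeu.dic shown above can usually be checked in the training configuration. In a default sphinxtrain layout, and this is an assumption about the poster's setup, the dictionary paths are set in etc/sphinx_train.cfg via variables such as $CFG_DICTIONARY and $CFG_FILLERDICT, and the per-step logs referenced by the error live under logdir/. A quick way to surface the dictionary settings:

    # Sketch, assuming a default sphinxtrain project layout (etc/sphinx_train.cfg).
    with open("etc/sphinx_train.cfg", encoding="utf-8") as cfg:
        for line in cfg:
            if "DICT" in line.upper():     # e.g. $CFG_DICTIONARY, $CFG_FILLERDICT
                print(line.rstrip())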

       
