
Question about Sphinx4 Lexicon

Help
Anonymous
2004-08-18
2012-09-22
  • Anonymous

    Anonymous - 2004-08-18

    If I am right, for isolated-word recognition Sphinx4 uses a whole word as the unit. For LVCSR, does Sphinx4 use triphones as units? If so, how many triphones does Sphinx4 use?

    Thanks a lot!

    --Larry

     
    • Paul Lamere

      Paul Lamere - 2004-08-19

      Larry:

      Sphinx-4 can use arbitrarily sized contexts in recognition. The current linguists used by Sphinx-4 look at a single left and a single right surrounding context. The type of unit used is defined by the acoustic model. For instance, for TIDIGITS the units are 'phone within a word' units:
      AX_one  
      AY_five  
      AY_nine  
      EH_seven  
      EY_eight  
      E_seven  
      F_five  
      F_four  
      II_three  
      II_zero  
      I_six  
      K_six  
      N_nine  
      N_nine_2  
      N_one  
      N_seven  
      OO_two  
      OW_four  
      OW_oh  
      OW_zero  
      R_four  
      R_three  
      R_zero  
      SIL
      S_seven  
      S_six  
      S_six_2  
      TH_three  
      T_eight  
      T_two  
      V_five  
      V_seven  
      W_one  
      Z_zero  

      There are 35 of these units. The TIDIGITS acoustic model defines about 350 context-dependent units.
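
      To make the naming of the units listed above concrete, here is a minimal Python sketch that splits a unit name into its parts. It assumes, based only on the names themselves, that the pattern is PHONE_word with an optional numeric suffix marking a repeated phone in the same word (as in N_nine_2). The function name split_unit is hypothetical and is not part of Sphinx-4.

```python
def split_unit(name):
    """Split a 'phone within a word' unit name into (phone, word, occurrence).

    Assumed convention (inferred from the names, not from Sphinx-4 docs):
    PHONE_word or PHONE_word_N, where N marks the N-th occurrence of the
    phone in that word. Names without an underscore (e.g. SIL) have no word.
    """
    parts = name.split("_")
    if len(parts) == 1:
        return parts[0], None, 1
    phone, word = parts[0], parts[1]
    occurrence = int(parts[2]) if len(parts) > 2 else 1
    return phone, word, occurrence


if __name__ == "__main__":
    for unit in ["AX_one", "N_nine", "N_nine_2", "SIL"]:
        print(unit, "->", split_unit(unit))
```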

      The acoustic models used for general speech recognition (WSJ, RM1, HUB4) use about 40 phonemes and about 30,000 triphones.   
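
      As a rough illustration of what a triphone is, the sketch below expands a phone sequence into (base, left, right) units using one phone of context on each side. This is not Sphinx-4 code: the function name to_triphones, the use of SIL as the boundary context, and the phone sequence in the example are all assumptions made for illustration.

```python
def to_triphones(phones, boundary="SIL"):
    """Expand a phone sequence into (base, left, right) triples.

    Each phone is paired with its single left and single right neighbour;
    the assumed boundary symbol fills in for the missing neighbour at the
    start and end of the sequence.
    """
    padded = [boundary] + list(phones) + [boundary]
    return [
        (padded[i], padded[i - 1], padded[i + 1])
        for i in range(1, len(padded) - 1)
    ]


if __name__ == "__main__":
    # Hypothetical phone sequence for one short word.
    for triphone in to_triphones(["W", "AH", "N"]):
        print(triphone)
    # Upper bound on distinct triples for a 40-phone inventory.
    print(40 ** 3)
```

      With about 40 phones there are at most 40 ** 3 = 64,000 such triples (ignoring any word-position marking), so a model with about 30,000 triphones covers only part of that space.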

      If you are interested in looking closer at this, unpack one of the acoustic models and take a look at the file whose name ends in ".mdef". This file contains the information about the units, including the context-independent units and the context-dependent (triphone) units.
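
      A quick way to get the counts out of such a file is a small script like the Python sketch below. It assumes the text form of the .mdef file, where each unit is one whitespace-separated row beginning with base, left context, right context, with "-" marking an empty context and "#" starting a comment line. Check your own file before relying on this, since the layout here is an assumption and count_units is a hypothetical helper, not part of Sphinx-4.

```python
def count_units(lines):
    """Count context-independent and context-dependent rows.

    Assumed row layout: base left right position ...
    A row whose left and right contexts are both "-" is counted as
    context independent; any other unit row is counted as context
    dependent. Comment lines and short header lines are skipped.
    """
    ci = 0
    cd = 0
    for line in lines:
        fields = line.split()
        if len(fields) < 4 or fields[0].startswith("#"):
            continue
        left, right = fields[1], fields[2]
        if left == "-" and right == "-":
            ci += 1
        else:
            cd += 1
    return ci, cd


# Tiny made-up sample in the assumed layout, not taken from a real model.
SAMPLE = """0.3
2 n_base
1 n_tri
# base lft rt p attrib tmat
AA - - - n/a 0 0 1 2 N
SIL - - - filler 1 3 4 5 N
AA SIL AA b n/a 0 6 7 8 N
""".splitlines()

if __name__ == "__main__":
    print(count_units(SAMPLE))
```

      For a real model, pass an open file instead of SAMPLE, for example count_units(open(path)) with path pointing at the unpacked .mdef file.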

      paul

       
