
Training models with different topologies

2005-11-09
2012-09-22
  • Tanel Alumäe

    Tanel Alumäe - 2005-11-09

    Hello,

    In my current recognition task, I need to recognize mainly digits and spelled letters (with high accuracy), but also a small, limited set of other words. I was thinking of training special HMMs for each number and letter, plus generic phoneme models for word recognition.

    So I wonder whether it is possible to train models with different numbers of states, e.g. normal 3-state models for phonemes but longer models for letters and digits. I understand that the normal SphinxTrain script workflow does not support this, but could I create the model architecture file by hand (or with a script)? I am wondering whether bw and the other training programs would choke on it, and whether Sphinx3.5 and/or Sphinx4 can handle it, at least in theory. Oh, and I was planning to use only CI (context-independent) models.

    Anybody?

    Thanks in advance

     
    • Tanel Alumäe

      Tanel Alumäe - 2005-11-10

      Replying to myself:

      I solved the problem by creating special intra-word 3-state HMMs for each digit and letter, similar to the TIDIGITS acoustic models.

       
  • osman b

    osman b - 2009-09-15

    Hello,
    I also want to train two HMMs with different topologies. Can you explain your
    solution a bit more? Is it possible to train models with different numbers of
    states? Is it possible to decode with such an HMM topology using the Sphinx3 decoder?

    Thanks for your help

     
  • osman b

    osman b - 2009-09-16

    Does anybody have any idea about training and decoding with models that have
    different numbers of states? Can Sphinx handle it?

    Thanks in advance

     
  • Nickolay V. Shmyrev

    > Does anybody have any idea about training and decoding with models that have
    different numbers of states? Can Sphinx handle it?

    No, sphinxtrain can only build models with a uniform topology. You can easily
    model a nonuniform topology with word-dependent phones. The TIDIGITS example
    mentioned above is a good one. Please take a look at it:

    eight EY_eight T_eight
    five F_five AY_five V_five
    four F_four OW_four R_four
    nine N_nine AY_nine N_nine_2
    oh OW_oh
    one W_one AX_one N_one
    seven S_seven EH_seven V_seven E_seven N_seven
    six S_six I_six K_six S_six_2
    three TH_three R_three II_three
    two T_two OO_two
    zero Z_zero II_zero R_zero OW_zero
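
A dictionary like the one above can be generated by a script instead of being written by hand. The following is only a minimal sketch under stated assumptions, not part of SphinxTrain: the function names are hypothetical, the `SIL` entry in the phone list is an assumption, and the state count assumes 3 emitting states per unit, as in the 3-state models mentioned earlier in the thread. It renames every phone to `PHONE_word` and adds a numeric suffix when the same phone occurs twice in one word (as in `N_nine_2` and `S_six_2`). Since every unit keeps the same 3-state topology, the effective length of a word model is set by how many units the word is split into.

```python
from collections import Counter

STATES_PER_UNIT = 3  # assumption: uniform 3-state topology for every unit


def word_dependent_phones(word, phones):
    """Rename each phone to PHONE_word; repeats within a word get _2, _3, ..."""
    seen = Counter()
    units = []
    for phone in phones:
        seen[phone] += 1
        name = f"{phone}_{word}"
        if seen[phone] > 1:
            name += f"_{seen[phone]}"
        units.append(name)
    return units


def build_dictionary(base_dict):
    """Map every word to its list of word-dependent units."""
    return {w: word_dependent_phones(w, p) for w, p in base_dict.items()}


def phone_list(wd_dict, extra=("SIL",)):
    """Collect the sorted set of units that would need to be trained."""
    units = {u for seq in wd_dict.values() for u in seq}
    return sorted(units | set(extra))


def states_per_word(wd_dict):
    """Effective number of emitting states each word model ends up with."""
    return {w: STATES_PER_UNIT * len(seq) for w, seq in wd_dict.items()}


# Hypothetical input: a plain phonetic dictionary for three of the digits.
base = {
    "nine": ["N", "AY", "N"],
    "oh": ["OW"],
    "six": ["S", "I", "K", "S"],
}

wd = build_dictionary(base)
for word in sorted(wd):
    print(word, " ".join(wd[word]))
```

With this input, "oh" ends up with 3 states and "six" with 12, which is how word-dependent units give different words different effective model lengths without changing the per-unit topology.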

    Please also avoid replying to a 5-year-old post. Start a new topic if
    needed.

     
