How to use allphone mode in sphinx3?

Help
2009-06-03
2012-09-22
  • aaron chang

    aaron chang - 2009-06-03

    Hi~

    I am using sphinx v3.7 on Linux.
    I want to try allphone mode, but the result is always "SIL".

    If I use the default mode, the result is OK.
    But if I add " -mode allphone ", the result is like:

    Backtrace(arctic_0006)
    FV:arctic_0006> WORD SFrm EFrm AScr(UnNorm) LMScore AScr+LScr AScale
    fv:arctic_0006> SIL 0 134 4057135 0 4057135 5829650
    FV:arctic_0006> TOTAL 4057135 0

    Here is how I use sphinx3_decode:

    sphinx3_decode \
    -lm ./my_dic/alarm.sent.arpabo.DMP \
    -dict ./my_dic/alarm.dic \
    -fdict ./my_dic/lm_giga_5k_nvp.sphinx.filler \
    -mdef /usr/src/sphinx/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/mdef \
    -mean /usr/src/sphinx/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/means \
    -var /usr/src/sphinx/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/variances \
    -mixw /usr/src/sphinx/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/mixture_weights \
    -tmat /usr/src/sphinx/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/transition_matrices \
    -ctl a \
    -cepdir /usr/src/sphinx/adapt \
    -mode allphone

    Could someone please tell me what I am missing?
    Thank you very much.
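For checking many utterances at once, the backtrace shown above can be scanned automatically. The following is only a sketch: it assumes the backtrace rows keep the seven-column layout from the header in this post, and the function names are hypothetical helpers, not part of Sphinx.

```python
def parse_backtrace(log_text):
    """Return (word, start_frame, end_frame) for each segment row in a backtrace."""
    segments = []
    for line in log_text.splitlines():
        prefix, sep, rest = line.strip().partition(">")
        # Rows are prefixed "FV:<utterance>>" or "fv:<utterance>>".
        if not sep or not prefix.lower().startswith("fv:"):
            continue
        fields = rest.split()
        # Segment rows have the 7 columns named in the header; the header row
        # and the TOTAL row fail the numeric or column-count checks.
        if len(fields) == 7 and fields[1].isdigit() and fields[2].isdigit():
            segments.append((fields[0], int(fields[1]), int(fields[2])))
    return segments


def only_silence(segments):
    """True when every decoded segment is SIL (or nothing was decoded)."""
    return all(word == "SIL" for word, _, _ in segments)


# Log text copied from the post above.
SAMPLE_LOG = """\
Backtrace(arctic_0006)
FV:arctic_0006> WORD SFrm EFrm AScr(UnNorm) LMScore AScr+LScr AScale
fv:arctic_0006> SIL 0 134 4057135 0 4057135 5829650
FV:arctic_0006> TOTAL 4057135 0
"""

segments = parse_backtrace(SAMPLE_LOG)
print(segments, only_silence(segments))
```

An utterance for which `only_silence` returns True is a hint that the decoder found nothing usable in its search space, which is the symptom described here.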

     
    • Nickolay V. Shmyrev

      In allphone mode, the language model should contain phones:

      Language model created by QuickLM on Fri Jun 5 00:20:32 MSD 2009
      Copyright (c) 1996-2002
      Carnegie Mellon University and Alexander I. Rudnicky

      This model based on a corpus of 105 sentences and 31 words
      The (fixed) discount mass is 0.5

      \data\
      ngram 1=31
      ngram 2=131
      ngram 3=184

      \1-grams:
      -2.1690 AA -0.2526
      -1.3450 AE -0.2537
      -1.1276 AH -0.1877
      -2.3450 AO -0.2948
      -3.1232 AW -0.2813
      -2.5211 AY -0.2813
      -1.0979 B -0.1708
      -1.3523 D -0.1890
      -1.9471 EH -0.2332
      -1.8010 ER -0.1791
      -1.6760 EY -0.1936
      -3.1232 F -0.2941
      -2.8222 G -0.2918
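One way to confirm that an ARPA-format LM really lists phones rather than words is to read its unigram section and compare the tokens against the phone set. This is a minimal sketch: the helper names are hypothetical, the sample LMs are shortened to a few entries taken from this thread, and the phone set is an assumption that should be replaced with the phone list of the acoustic model in use.

```python
def arpa_unigrams(arpa_text):
    """Return the tokens listed in the unigram section of an ARPA-format LM."""
    tokens = []
    in_unigrams = False
    for line in arpa_text.splitlines():
        line = line.strip()
        if line == "\\1-grams:":
            in_unigrams = True
            continue
        if in_unigrams:
            if line.startswith("\\"):  # next section marker ends the unigrams
                break
            fields = line.split()
            if len(fields) >= 2:
                tokens.append(fields[1])  # columns: log-prob, token, [backoff]
    return tokens


def non_phone_tokens(tokens, phones):
    """Return the LM tokens that are neither phones nor sentence markers."""
    markers = {"<s>", "</s>"}
    return [t for t in tokens if t not in phones and t not in markers]


# Assumed phone set, for illustration only.
PHONES = {"AA", "AE", "AH", "AO", "B", "D"}

PHONE_LM = "\n".join(["\\data\\", "ngram 1=2", "", "\\1-grams:",
                      "-2.1690 AA -0.2526", "-1.3450 AE -0.2537", "", "\\end\\"])
WORD_LM = "\n".join(["\\data\\", "ngram 1=2", "", "\\1-grams:",
                     "-1.7160 Because -0.2840", "-1.7160 This -0.2926", "", "\\end\\"])

print(non_phone_tokens(arpa_unigrams(PHONE_LM), PHONES))
print(non_phone_tokens(arpa_unigrams(WORD_LM), PHONES))
```

An empty list means every unigram is a phone; any word that shows up indicates a word-level LM, which is not what allphone mode expects.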

       
      • aaron chang

        aaron chang - 2009-06-08

        Thank you, Nickolay.
        Would you please tell me how to make a LM just like yours?
        I have tried QuickLM http://www.speech.cs.cmu.edu/tools/lm.html
        I entered some sentences and got a result like this:

        Language model created by QuickLM on Sun Jun 7 22:09:00 EDT 2009
        Carnegie Mellon University (c) 1996

        This model based on a corpus of 2 sentences and 26 words
        The (fixed) discount mass is 0.5

        \data\
        ngram 1=22
        ngram 2=23
        ngram 3=22

        \1-grams:
        -1.7160 Because -0.2840
        -1.7160 This -0.2926
        -1.4150 a -0.2840
        -1.7160 based -0.2926
        -1.7160 be -0.2926

        I don't know how to make the LM contain phones.

        I have also tried allphone of Sphinx v3.6. It doesn't need to input LM.
        And it worked just fine.

         
        • Nickolay V. Shmyrev

          > Would you please tell me how to make a LM just like yours?

          Create a text with phonetic transcription:

          <s> W AH N </s>

          Then use QuickLM to generate a language model from it. You can also find one in an4:

          sphinx3/model/lm/an4/an4.tg.phone.arpa

          > I have also tried allphone of Sphinx v3.6. It doesn't need to input LM.
          And it worked just fine.

          This also works just fine if you don't pass -lm at all. But to increase accuracy you need a phone LM. And if you do pass -lm, use a proper LM with phones.
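The phonetic training text for QuickLM can be produced from ordinary sentences with a pronunciation dictionary. The sketch below assumes a dictionary in the usual "WORD PHONE PHONE ..." layout with uppercase words; the helper names and the two-word sample dictionary are made up for illustration.

```python
def load_dict(dict_text):
    """Map each word to its first listed pronunciation (a list of phones)."""
    pron = {}
    for line in dict_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        word = fields[0]
        if "(" in word:  # skip alternate entries such as HELLO(2)
            continue
        pron.setdefault(word, fields[1:])
    return pron


def to_phone_sentence(sentence, pron):
    """Rewrite a word sentence as '<s> PHONE ... </s>' for LM training."""
    phones = []
    for word in sentence.split():
        # A KeyError here means the word is missing from the dictionary.
        phones.extend(pron[word.upper()])
    return "<s> " + " ".join(phones) + " </s>"


# Illustrative dictionary; a real one would be the -dict file.
SAMPLE_DICT = "ONE W AH N\nTWO T UW\n"

pron = load_dict(SAMPLE_DICT)
for sentence in ["one", "one two"]:
    print(to_phone_sentence(sentence, pron))
```

The printed lines can then be saved to a text file and submitted to QuickLM in place of the word-level sentences.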

           
    • aaron chang

      aaron chang - 2009-06-08

      I have tried the LM of sphinx3/model/lm/an4/an4.tg.phone.arpa, and it really works.
      Thanks a lot.

      I want to build a system to check the accuracy of a user's pronunciation.
      For example, when the user says "apple", I want to find out whether he said
      "AE P AH L" or "AA P AH L".

      Is the allphone mode applicable to this job, or I should find another way?
      Thank you.

       
      • Nickolay V. Shmyrev

        > Is the allphone mode applicable to this job, or I should find another way?

        Not quite. It's better to use forced alignment and include all possible pronunciation variants in the dictionary.
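Listing every candidate pronunciation in the dictionary might look like the sketch below. It assumes the "WORD", "WORD(2)", "WORD(3)" naming for alternates that appears elsewhere in this thread; `variant_entries` is a hypothetical helper, and the two pronunciations are the ones from the question.

```python
def variant_entries(word, pronunciations):
    """Format dictionary lines for a word with several pronunciations."""
    lines = []
    for index, phones in enumerate(pronunciations, start=1):
        # The first variant keeps the bare word; later ones get a "(n)" suffix.
        key = word if index == 1 else f"{word}({index})"
        lines.append(f"{key} {' '.join(phones)}")
    return lines


entries = variant_entries("APPLE", [["AE", "P", "AH", "L"],
                                    ["AA", "P", "AH", "L"]])
print("\n".join(entries))
```

Mispronunciations to be detected would be added as further variants in the same way.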

         
    • aaron chang

      aaron chang - 2009-06-11

      Thank you very much.
      I will try forced alignment.
      It would be very nice if you could tell me where I can find detailed information about "forced alignment".

       
      • Nickolay V. Shmyrev

        Use sphinx3_align. It takes a sentence and a dictionary and checks whether the recording matches any of the variants listed in the dictionary. Say you have two variants:

        HELLO H AH L OW
        HELLO(2) H OH L OW

        After sphinx3_align runs, you get the actual result:

        HELLO(2)

        For more information on how to run it and what forced alignment is, please read the docs or search Google.
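Once the aligner reports a token such as HELLO(2), it can be mapped back to the pronunciation it stands for. The sketch below assumes the aligned output uses the same "WORD(n)" tokens as the dictionary, as in the example above; the helper names are hypothetical and the dictionary text is the two-variant example from this post.

```python
import re


def split_variant(token):
    """Split 'HELLO(2)' into ('HELLO', 2); a bare 'HELLO' is variant 1."""
    match = re.fullmatch(r"(.+?)(?:\((\d+)\))?", token)
    return match.group(1), int(match.group(2) or 1)


def load_variants(dict_text):
    """Map every dictionary key, including 'WORD(n)' keys, to its phones."""
    table = {}
    for line in dict_text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            table[fields[0]] = fields[1:]
    return table


# Dictionary lines copied from the example above.
SAMPLE_DICT = "HELLO H AH L OW\nHELLO(2) H OH L OW\n"

table = load_variants(SAMPLE_DICT)
aligned = "HELLO(2)"
word, variant = split_variant(aligned)
print(word, variant, " ".join(table[aligned]))
```

Comparing the reported variant with the one considered correct gives a simple pass or fail for the pronunciation check.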

         
        • aaron chang

          aaron chang - 2009-06-11

          I got it. I'll study that.
          Thank you.

           
