Sphinxtrain: Training for Alphabet

2004-08-25
2012-09-22
  • danial ibrahim

    danial ibrahim - 2004-08-25

    Hi,
    I want to use SphinxTrain on an alphabet data corpus. I trained it before, but the recognition accuracy was 0%. So, before I start the training again, I want to know whether my dictionary, phonelist and transcription are right or wrong. Can anyone here give me some guidance or samples for alphabet data training?

    alpha.dict
    ----------
    A    EY
    B    B IY
    C    S IY
    D    D IY
    E    IY
    F    EH F
    G    JH IY
    H    EY CH
    I    AY
    J    JH EY
    K    K EY
    L    EH L
    M    EH M
    N    EH N
    O    OW
    P    P IY
    Q    K Y UW
    R    AA R
    S    EH S
    T    T IY
    U    Y UW
    V    V IY
    W    D AH B AX L Y UW
    X    EH K S
    Y    W AY
    Z    Z IY

    alpha.phone
    ------------
    AA
    AH
    AX
    AY
    B
    CH
    D
    EH
    EY
    F
    IY
    JH
    K
    L
    M
    N
    OW
    P
    R
    S
    SIL
    T
    UW
    V
    W
    Y
    Z

    alpha.transcription
    ------------------
    <s> A </s> (0AF1SET0)
    <s> A </s> (0AF1SET1)
    <s> A </s> (0AF1SET2)
    <s> A </s> (0AF1SET3)
    <s> A </s> (0AF1SET4)
    <s> A </s> (0AF1SET5)
    .
    .
    .
    <s> Z </s> (0ZF1SET3)
    <s> Z </s> (0ZF1SET4)
    <s> Z </s> (0ZF1SET5)
    <s> Z </s> (0ZF1SET6)
    <s> Z </s> (0ZF1SET7)
    <s> Z </s> (0ZF1SET8)
    <s> Z </s> (0ZF1SET9)

    Any help will be much appreciated.

    thanks.

     
    • The Grand Janitor

      I think that makes sense if you are doing training. -Arthur

       
    • danial ibrahim

      danial ibrahim - 2004-08-26

      Another issue: when I ran verify_all.pl, the message said that a phone (AA, say) occurs in the dictionary but not in the phonelist. But it obviously does occur in my phonelist file. Why can't it find that phone and the others? Is there an error I should fix in the Perl scripts?

      thanks for your time.

       
    • Anonymous

      Anonymous - 2004-09-16

      I've got the same problem: I was training models for digits and two control words (start and end) for the Lithuanian language.

      My files seem to be ok:
      transcription file
      <s> <sil> VIENAS <sil></s> (a0908_01)
      <s> <sil> DU <sil></s> (a0908_02)
      <s> <sil> TRYS <sil></s> (a0908_03)
      etc.

      dictionary
      VIENAS V IE N AX S
      DU    D UH
      TRYS    T R IY S
      etc.

      It seems I have enough data to get some results (each word is repeated in approx. 120 files)

      but recognition results are 0% :(

      I really can't figure out why.
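
A quick consistency check between the transcription and the dictionary can rule out one class of problem before retraining. The sketch below is only an illustration, not part of SphinxTrain: the function name `unknown_tokens` and the default filler set are hypothetical, and it assumes the transcript layout shown in this thread (whitespace-separated words followed by the utterance ID in parentheses). It reports every token that is in neither the dictionary nor the filler set.

```python
import re

# Utterance ID in parentheses at the end of the line, as in the samples above.
UTT_ID = re.compile(r"\(([^()\s]+)\)\s*$")


def unknown_tokens(transcript_lines, dictionary_words,
                   fillers=("<s>", "</s>", "<sil>")):
    """Return (utterance_id, token) pairs for tokens found in neither list.

    `fillers` is an assumed set of non-dictionary symbols; adjust it to
    match whatever your filler dictionary actually contains.
    """
    known = set(dictionary_words) | set(fillers)
    problems = []
    for line in transcript_lines:
        line = line.strip()
        if not line:
            continue
        match = UTT_ID.search(line)
        utt_id = match.group(1) if match else None
        words = line[:match.start()] if match else line
        # Tokens are split on whitespace only, so two symbols written
        # without a space between them count as one unknown token.
        for token in words.split():
            if token not in known:
                problems.append((utt_id, token))
    return problems


lines = [
    "<s> <sil> VIENAS <sil></s> (a0908_01)",
    "<s> <sil> DU <sil> </s> (a0908_02)",
]
print(unknown_tokens(lines, ["VIENAS", "DU", "TRYS"]))
# → [('a0908_01', '<sil></s>')]
```

With whitespace tokenization, `<sil></s>` written without a space is a single token that matches nothing, so it is worth checking how the real files separate these symbols. Whether that has anything to do with the 0% result here cannot be told from the excerpt alone.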

       
    • The Grand Janitor

      Hi Daniel and Thomas,
      I cannot debug either of your problems because the information given is not enough. One thing I am not sure about is whether the problems you have are two different types of problem.

      Daniel's problem seems to relate only to verify_all.pl. It is usually not that difficult to figure out why.

      Thomas' problem is a little bit different. Thomas, do you also have the same problem as Daniel in verify_all.pl? Or do you have other problems? I am not sure.
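
For the "phone occurs in dictionary but not in the phonelist" message, one thing that can be checked independently of the Perl scripts is whether the phonelist file carries invisible characters, such as trailing spaces or Windows line endings. That this is the cause here is only an assumption, since the thread does not show the raw files. The sketch below uses hypothetical helper names (`read_phones`, `missing_phones`) and compares a naive line split with a whitespace-stripped one.

```python
def read_phones(text):
    """Phone list, one phone per line; strip() also drops a trailing '\\r'."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def missing_phones(dict_text, phones):
    """Map each dictionary phone absent from `phones` to the words using it."""
    missing = {}
    for line in dict_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed entries
        word, pronunciation = fields[0], fields[1:]
        for phone in pronunciation:
            if phone not in phones:
                missing.setdefault(phone, []).append(word)
    return missing


# Illustrative data: a phonelist saved with CRLF endings and a stray space.
phone_text = "AA \r\nR\r\nSIL\r\n"
dict_text = "R    AA R\n"

# Naive comparison: entries keep their '\r' and trailing space.
naive = set(phone_text.split("\n"))
print(missing_phones(dict_text, naive))
# → {'AA': ['R'], 'R': ['R']}

# After stripping whitespace, every dictionary phone is found.
print(missing_phones(dict_text, read_phones(phone_text)))
# → {}
```

If the stripped comparison finds every phone while the tool still complains, printing `repr()` of the first few phonelist lines makes any hidden characters visible.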

       
