
Sphinxtrain stops without an error after generating mfc files

Help
wunna
2015-08-18
2015-08-21
  • wunna

    wunna - 2015-08-18

    I am trying to create a new acoustic model for my language. I am using the latest versions of sphinxtrain, sphinxbase and pocketsphinx: sphinxtrain-5prealpha, sphinxbase-5prealpha and pocketsphinx-5prealpha. When I train my model, sphinxtrain stops after phase 6, "Checking that all the words in the transcript are in the dictionary". My dictionary has a lot of duplicated phones because of the nature of our language, Myanmar. I have shared my logdir folder:
    https://www.dropbox.com/s/6qcl9ndl6wpqcuy/logdir.rar?dl=0
    That folder contains only the 000.comp_feat folder. Please advise and help me.

     
  • wunna

    wunna - 2015-08-18

    This is my logdir folder.

     
    • Nickolay V. Shmyrev

      Phones must be unique. You should also fix the other warnings reported.

      Nothing about Myanmar is different from other languages here; you can still select a proper phoneset based on the Wikipedia page:

      https://en.wikipedia.org/wiki/Burmese_language

      If you need help with the database setup, you need to share your etc folder. The logdir folder is not enough, since all your mistakes are in the etc folder.
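
      A quick way to confirm that a phone list has no repeated entries is sketched below. It assumes the phone file is plain text with one phone per line (including `SIL`); the function name `duplicate_phones` and the sample list are hypothetical and only illustrate the check.

```python
from collections import Counter


def duplicate_phones(lines):
    """Return the phones that occur more than once in a one-per-line phone list."""
    # Ignore blank lines and surrounding whitespace.
    phones = [line.strip() for line in lines if line.strip()]
    counts = Counter(phones)
    return sorted(phone for phone, n in counts.items() if n > 1)


if __name__ == "__main__":
    # Hypothetical phone list with "K" and "A" listed twice.
    sample = ["SIL", "K", "A", "K", "I", "A"]
    print(duplicate_phones(sample))
```

      To run it on a real file, the lines could be read with `open(path, encoding="utf-8")` and passed in; any phone it reports should be listed only once.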

       
  • wunna

    wunna - 2015-08-18

    Some words and syllables have two different phones, and I get a lot of warnings because of this.
    This is my etc folder.

     
    • Nickolay V. Shmyrev

      Please open the Wikipedia page linked above and read about the phonemes of your language. Please do not use syllables as phones.

       
  • wunna

    wunna - 2015-08-18

    There is no phonetic dictionary for Myanmar. Therefore, I tried to create a mapping of one syllable to one phone. The MLC (Myanmar Language Commission) dictionary has over 200000 words, and I cannot write all of those words into a phonetic dictionary manually. I also used the Phonetisaurus tools to create a g2p mapping, but the results were wrong. So I created a phonetic dictionary for Myanmar syllables; there are over 2000 syllables in Myanmar. Please guide me in creating a phonetic dictionary.

     
    • Nickolay V. Shmyrev

      Your dictionary should look like this:

         T I'
         H N I'
         TH O U N:
         L E I:
         NG A:
         CH A U'
         KH U N
         SH I'
         K O U:
         TH O U N N J A.
      က   K A.
      ကာ  K A
      ကား K A:
      ကိ  K I.
      ကီ  K I
      ကီး K I:
      ကု  K U.
      ကူ  K U
      ကူး K U:
      ေက  K E I
      ကဲ  K E:
      ကဲ့ K E.
      ေကာ K O:
      ေကာ့    K O.
      ေကာ္    K O
      ကံ  K A N
      ကံ့ K A N.
      က့ံ K A N.
      ကို K O U
      ကိုး    K O U:
      ကက္ K E'
      ကုက္    K O U'
      ေကာက္   K A U'
      ကိုက္   K A I'
      ကင္ K I N
      ကင္း    K I N:
      ေကာင္   K A U N
      

      Phones must be separated by spaces. Syllables are not phones.
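
      The format above (a word, then phones separated by spaces) can be checked mechanically. The following is a minimal sketch, assuming each dictionary line is `word phone phone ...` and that the phoneset is available as a set of strings. The function name `check_dictionary`, the placeholder words `w1` and `w2`, and the small phoneset are hypothetical.

```python
def check_dictionary(dict_lines, phoneset):
    """Return (line_number, word, problem) tuples for entries that look wrong."""
    problems = []
    for lineno, line in enumerate(dict_lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        word, phones = fields[0], fields[1:]
        if not phones:
            problems.append((lineno, word, "no pronunciation"))
            continue
        # A syllable written as one token (e.g. "KA") is not in the phoneset.
        unknown = [p for p in phones if p not in phoneset]
        if unknown:
            problems.append((lineno, word, "unknown phones: " + " ".join(unknown)))
    return problems


if __name__ == "__main__":
    phoneset = {"SIL", "K", "A", "A.", "A:"}  # hypothetical subset of a phone list
    entries = ["က K A.", "ကာ K A", "w1 KA", "w2"]
    for problem in check_dictionary(entries, phoneset):
        print(problem)
```

      An entry such as `w1 KA` is flagged because `KA` is a whole syllable rather than the two phones `K A`.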

       
  • wunna

    wunna - 2015-08-21

    I have now created a new phonetic dictionary and phone file. I set up my model in Ubuntu 12.04 on VMware. But when I train my model, sphinxtrain still stops after phase 6, "Checking that all the words in the transcript are in the dictionary". At this stage I get many warnings. Can those warnings stop the training process?
    I have posted my etc folder and logdir folder.

     
    • Nickolay V. Shmyrev

      All the words from the training transcription must be in the dictionary.
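
      One way to find the offending words before rerunning training is sketched below. It assumes transcription lines of the form `<s> word word </s> (utterance_id)` and dictionary lines of the form `word phone phone ...`, with optional alternate-pronunciation entries such as `word(2)`. The function names and sample data are hypothetical, and filler words kept in a separate filler dictionary are not handled.

```python
import re


def dictionary_words(dict_lines):
    """Collect the words defined in a dictionary, ignoring "(2)"-style suffixes."""
    words = set()
    for line in dict_lines:
        fields = line.split()
        if fields:
            words.add(re.sub(r"\(\d+\)$", "", fields[0]))
    return words


def missing_words(transcript_lines, dict_lines):
    """Return the sorted transcript words that have no dictionary entry."""
    known = dictionary_words(dict_lines)
    missing = set()
    for line in transcript_lines:
        # Drop the trailing "(utterance_id)" before splitting into words.
        text = re.sub(r"\([^()]*\)\s*$", "", line)
        for token in text.split():
            if token in ("<s>", "</s>"):
                continue  # sentence markers are not dictionary words
            if token not in known:
                missing.add(token)
    return sorted(missing)


if __name__ == "__main__":
    transcript = ["<s> hello world </s> (utt_001)", "<s> hello there </s> (utt_002)"]
    dictionary = ["hello HH AH L OW", "world W ER L D", "world(2) W ER L"]
    print(missing_words(transcript, dictionary))
```

      Every word it reports needs either a dictionary entry or removal from the transcription.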

       
  • wunna

    wunna - 2015-08-21

    This is my logdir folder.

     
