Sphinx 3 force-alignment problem for few words

Help
2016-10-26
2016-11-01
  • nayan kalita

    nayan kalita - 2016-10-26

    Hi Everyone,

    I am seeing wrong alignments and low phone scores from Sphinx3 forced alignment for a small set of words. The alignment is fine for all other words.

    The affected words include Father (F AA DH ER), Polish (P AA L IH SH), Swiss (S W IH S), and Miss (M IH1 S).
    The main problem is with the word "Father": the alignment is incorrect and the score of F is low in every occurrence. An example is given below (file name: 91_kash_androidOctober_1477365933528_91).

     SFrm  EFrm   SegAScr Phone
        0    80  -3574250 SIL
       81   101   -173969 SIL
      102   104 -247371495 F SIL AA b
      105   107   -262382 AA F DH i
      108   110   -224049 DH AA ER i
      111   113   -140673 ER DH SIL e
      114   130    -65163 SIL
    

    Total score: -251811981
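
The segment scores in the listing above sum exactly to the reported total, and the F segment alone accounts for more than 98% of it. A minimal Python sketch for checking this on other utterances, assuming the segmentation output keeps the column layout shown above (start frame, end frame, score, phone plus context); the helper names `parse_phseg` and `per_frame` are made up for illustration:

```python
# Phone segmentation listing copied from the post above.
PHSEG = """\
 SFrm  EFrm   SegAScr Phone
    0    80  -3574250 SIL
   81   101   -173969 SIL
  102   104 -247371495 F SIL AA b
  105   107   -262382 AA F DH i
  108   110   -224049 DH AA ER i
  111   113   -140673 ER DH SIL e
  114   130    -65163 SIL
"""


def parse_phseg(text):
    """Return a list of (start, end, score, phone) tuples."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header and any blank or non-numeric lines.
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        start, end, score = int(fields[0]), int(fields[1]), int(fields[2])
        rows.append((start, end, score, fields[3]))
    return rows


def per_frame(rows):
    """Divide each score by the segment length in frames so that
    segments of different duration can be compared."""
    return [(phone, score / (end - start + 1))
            for start, end, score, phone in rows]


rows = parse_phseg(PHSEG)
total = sum(score for _, _, score, _ in rows)
print("total:", total)
for phone, avg in per_frame(rows):
    print(f"{phone:4s} {avg:16.1f}")
```

Sorting the per-frame values makes outliers such as the F segment easy to spot across many files.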

    The pronunciation of the word Father (F AA DH ER) seems correct to me.

    For your reference, I have uploaded a few wave files and the acoustic model used to the link below.

    https://drive.google.com/drive/folders/0B00sD6w9Ioe9VFNtOC1rWlR5XzQ?usp=sharing

    Please help me improve the alignment.

     
    • Nickolay V. Shmyrev

      You need to provide the command line, sphinx3 and sphinxbase version, the dictionary and all other data files to reproduce your problem.

       
  • nayan kalita

    nayan kalita - 2016-10-27

    Hi Nickolay,

    Thanks for your reply. I am using sphinx3 version 3.08 and sphinxbase version 0.6.

    I am using sphinx_fe with the following parameters to generate the MFCC file.

    sphinx_fe -i Data/91_kash_androidOctober_1477365933528_91.wav -argfile etc/feat.params -ei wav -mswav yes -eo mfc -o feat/91_kash_androidOctober_1477365933528_91.mfc

    I then run the following command for forced alignment.

    sphinx3_align -hmm models/ -dict etc/voxeduEnglish.dic -fdict etc/words.filler -ctl temp/91_kash_androidOctober_1477365933528_91.ctl -insent temp/91_kash_androidOctober_1477365933528_91.sent -cepdir feat -phsegdir temp -phlabdir temp -wdsegdir temp -outsent 91_kash_androidOctober_1477365933528_91.out
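
The two steps above can be scripted per utterance. A minimal Python sketch, assuming the directory layout used in the commands in this post; the helper names `fe_command` and `align_command` are hypothetical, and only flags that already appear in those commands are used:

```python
import shlex


def fe_command(utt):
    """Build the sphinx_fe call that turns Data/<utt>.wav into feat/<utt>.mfc."""
    return [
        "sphinx_fe",
        "-i", f"Data/{utt}.wav",
        "-argfile", "etc/feat.params",
        "-ei", "wav", "-mswav", "yes",
        "-eo", "mfc",
        "-o", f"feat/{utt}.mfc",
    ]


def align_command(utt, hmm_dir="models/"):
    """Build the sphinx3_align call; hmm_dir is a parameter so that a
    different acoustic model directory can be tried with the same data."""
    return [
        "sphinx3_align",
        "-hmm", hmm_dir,
        "-dict", "etc/voxeduEnglish.dic",
        "-fdict", "etc/words.filler",
        "-ctl", f"temp/{utt}.ctl",
        "-insent", f"temp/{utt}.sent",
        "-cepdir", "feat",
        "-phsegdir", "temp",
        "-phlabdir", "temp",
        "-wdsegdir", "temp",
        "-outsent", f"{utt}.out",
    ]


utt = "91_kash_androidOctober_1477365933528_91"
for cmd in (fe_command(utt), align_command(utt)):
    # Print the command line; nothing is executed here.
    print(shlex.join(cmd))
```

To actually execute a command, each list can be passed to `subprocess.run(cmd, check=True)`.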

    The detailed log of the command-line output is in the attached .zip file, under the file name Log.txt.

     
    • Nickolay V. Shmyrev

      The same command with the default en-us model aligns fine, so the issue should be with your acoustic model, which was not trained appropriately.

       
  • nayan kalita

    nayan kalita - 2016-10-31

    Hi Nickolay,

    Thank you for your reply.

    1) Please suggest the best acoustic model for US English for sphinx3.

    2) Are any open-source databases available to train an acoustic model?

     
    • Nickolay V. Shmyrev

      1) Please suggest the best acoustic model for US English for sphinx3.

      The default models available in downloads are OK.

      2) Are any open-source databases available to train an acoustic model?

      TED-LIUM, LibriSpeech

       
  • nayan kalita

    nayan kalita - 2016-10-31

    Hi Nickolay,

    Thank you for your suggestions.

    1) By the default models available in downloads, I take it you mean the latest CMU acoustic model, cmusphinx-en-us-5.2.

    2) I found the download link for the LibriSpeech database: http://www.openslr.org/12/ . Please share the download page link for the TED-LIUM database.

     
    • Nickolay V. Shmyrev

      1) By the default models available in downloads, I take it you mean the latest CMU acoustic model, cmusphinx-en-us-5.2.

      Yes

      2) I found the download link for the LibriSpeech database: http://www.openslr.org/12/ . Please share the download page link for the TED-LIUM database.

      http://www.openslr.org/19/

       
  • nayan kalita

    nayan kalita - 2016-11-01

    Hi Nickolay,

    Thank you for your help.

     
