Sphinx 3 force-alignment problem for a few words

Forum: Help

Creator: nayan kalita

Created: 2016-10-26

Updated: 2016-11-01

nayan kalita - 2016-10-26

Hi Everyone,

I am seeing wrong alignments and low phone scores with Sphinx3 forced alignment for a small set of words. The alignment is fine for all other words.

The affected words include Father (F AA DH ER), Polish (P AA L IH SH), Swiss (S W IH S), Miss (M IH1 S), etc.
The main problem is with the word "Father": the alignment is incorrect and the score of F is low in every occurrence. An example is given below (file name: 91_kash_androidOctober_1477365933528_91).

SFrm  EFrm     SegAScr  Phone
   0    80    -3574250  SIL
  81   101     -173969  SIL
 102   104  -247371495  F SIL AA b
 105   107     -262382  AA F DH i
 108   110     -224049  DH AA ER i
 111   113     -140673  ER DH SIL e
 114   130      -65163  SIL

Total score: -251811981
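
To make the low score of F easier to see, the segmentation above can be normalized by segment length. The following is a minimal Python sketch, not part of Sphinx: the names `Segment`, `parse_phseg`, and `PHSEG` are hypothetical, and it assumes each data row has the form "start frame, end frame, acoustic score, phone" followed by optional context fields, as in the output shown here.

```python
from dataclasses import dataclass
from typing import List

# Phone segmentation copied from the post above.
PHSEG = """\
SFrm EFrm SegAScr Phone
0 80 -3574250 SIL
81 101 -173969 SIL
102 104 -247371495 F SIL AA b
105 107 -262382 AA F DH i
108 110 -224049 DH AA ER i
111 113 -140673 ER DH SIL e
114 130 -65163 SIL
Total score: -251811981
"""


@dataclass
class Segment:
    start: int   # first frame of the segment (inclusive)
    end: int     # last frame of the segment (inclusive)
    score: int   # acoustic score of the whole segment
    phone: str   # base phone label

    @property
    def frames(self) -> int:
        return self.end - self.start + 1

    @property
    def per_frame(self) -> float:
        return self.score / self.frames


def parse_phseg(text: str) -> List[Segment]:
    """Parse data rows; skip the header and the 'Total score' line."""
    segments = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 4 or not parts[0].isdigit():
            continue
        segments.append(
            Segment(int(parts[0]), int(parts[1]), int(parts[2]), parts[3])
        )
    return segments


segments = parse_phseg(PHSEG)
for seg in segments:
    print(f"{seg.phone:>4} {seg.frames:3d} frames {seg.per_frame:14.1f} per frame")

# The segment with the lowest per-frame score is the most suspicious one.
worst = min(segments, key=lambda s: s.per_frame)
print("lowest per-frame score:", worst.phone)
```

On this data the F segment scores about -82 million per frame, while every other segment stays above about -90 thousand per frame. Note also that F, AA, DH, and ER each span only three frames, which would be the minimum duration if the model uses three-state phone HMMs (an assumption; the topology is not stated in the post).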

The pronunciation of the word Father (F AA DH ER) seems correct to me.

For your reference, I have uploaded a few wave files and the acoustic model used at the link below.

https://drive.google.com/drive/folders/0B00sD6w9Ioe9VFNtOC1rWlR5XzQ?usp=sharing

Please help me improve the alignment.
- Nickolay V. Shmyrev - 2016-10-26
  
  You need to provide the command line, sphinx3 and sphinxbase version, the dictionary and all other data files to reproduce your problem.
  

nayan kalita - 2016-10-27

Hi Nickolay,

Thanks for your reply. I am using sphinx3 version 3.08 and sphinxbase version 0.6.

I am using sphinx_fe with the following parameters to generate the MFCC file.

sphinx_fe -i Data/91_kash_androidOctober_1477365933528_91.wav -argfile etc/feat.params -ei wav -mswav yes -eo mfc -o feat/91_kash_androidOctober_1477365933528_91.mfc

I then run the following command for forced alignment.

sphinx3_align -hmm models/ -dict etc/voxeduEnglish.dic -fdict etc/words.filler -ctl temp/91_kash_androidOctober_1477365933528_91.ctl -insent temp/91_kash_androidOctober_1477365933528_91.sent -cepdir feat -phsegdir temp -phlabdir temp -wdsegdir temp -outsent 91_kash_androidOctober_1477365933528_91.out
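
For running the same two steps on other recordings, the commands above can be generated from the utterance ID. This is only a sketch in Python: the helper name `build_commands` is hypothetical, every flag and path is copied from the two commands in this post, and it assumes the `.ctl` and `.sent` files for the utterance already exist under `temp/`.

```python
import shlex
from typing import List, Tuple


def build_commands(utt_id: str) -> Tuple[List[str], List[str]]:
    """Return the sphinx_fe and sphinx3_align argument lists for one utterance."""
    fe = [
        "sphinx_fe",
        "-i", f"Data/{utt_id}.wav",
        "-argfile", "etc/feat.params",
        "-ei", "wav",
        "-mswav", "yes",
        "-eo", "mfc",
        "-o", f"feat/{utt_id}.mfc",
    ]
    align = [
        "sphinx3_align",
        "-hmm", "models/",
        "-dict", "etc/voxeduEnglish.dic",
        "-fdict", "etc/words.filler",
        "-ctl", f"temp/{utt_id}.ctl",
        "-insent", f"temp/{utt_id}.sent",
        "-cepdir", "feat",
        "-phsegdir", "temp",
        "-phlabdir", "temp",
        "-wdsegdir", "temp",
        "-outsent", f"{utt_id}.out",
    ]
    return fe, align


fe_cmd, align_cmd = build_commands("91_kash_androidOctober_1477365933528_91")
# Print the commands instead of executing them, so they can be inspected first.
print(shlex.join(fe_cmd))
print(shlex.join(align_cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the printed command lines look right.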

The detailed log of the command-line output is in the attached .zip file, under the file name Log.txt.

productionReleaseTemp.zip

- Nickolay V. Shmyrev - 2016-10-30
  
  The same command with the default en-us model aligns fine, so the issue is most likely that your acoustic model was not trained properly.
  

nayan kalita - 2016-10-31

Hi Nickolay,

Thank you for your reply.

1) Please suggest the best acoustic model for US English for sphinx3.

2) Are any open-source databases available to train an acoustic model?

- Nickolay V. Shmyrev - 2016-10-31
  
  > 1) Please suggest the best acoustic model for US English for sphinx3.
  
  The default models available in the downloads are OK.
  
  > 2) Are any open-source databases available to train an acoustic model?
  
  TED-LIUM and LibriSpeech.
  

nayan kalita - 2016-10-31

Hi Nickolay,

Thank you for your suggestions.

1) By "default models available in downloads", are you referring to the latest CMU acoustic model, cmusphinx-en-us-5.2?

2) I was able to find the download link for the LibriSpeech database: http://www.openslr.org/12/ . Please share the download page link for the TED-LIUM database.

- Nickolay V. Shmyrev - 2016-10-31
  
  > 1) By "default models available in downloads", are you referring to the latest CMU acoustic model, cmusphinx-en-us-5.2?
  
  Yes.
  
  > 2) I was able to find the download link for the LibriSpeech database: http://www.openslr.org/12/ . Please share the download page link for the TED-LIUM database.
  
  http://www.openslr.org/19/
  

nayan kalita - 2016-11-01

Hi Nickolay,

Thank you for your help.
