About content of model file mdef

Speech Recognition Toolkit

Forum: Help

Created: 2018-11-28

Updated: 2019-05-06

Willy - 2018-11-28

Hi, I've used the pretrained model in pocketsphinx/model/en-us/en-us, which ships with the package, for keyword spotting (KWS), and it worked very well in my application.
Out of interest in the model's contents, I opened the 'mdef' file using the 'pocketsphinx_mdef_convert' command, and it showed me this information:

0.3
42 n_base
137053 n_tri
548380 n_state_map
5126 n_tied_state
126 n_tied_ci_state
42 n_tied_tmat

This should mean there are 5126 senones in the acoustic model.
However, when I opened the 'means' and 'variances' files (with printp), they showed means and variances only for the states of the 42 monophones, not for 5126 senones.
So I would like to know where the Gaussian parameters of the remaining senones are.
Also, I'm curious about which database was used to train this model.
Thanks.
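
As a quick sanity check on the header quoted above, the counts can be cross-checked against each other. The sketch below is only illustrative: the parsing helper `parse_mdef_header` is a hypothetical name, and the reading of the numbers (three emitting states plus one non-emitting state per phone) is an assumption based on the usual Sphinx HMM topology, not something stated in the thread.

```python
# Header text copied from the post above.
HEADER = """\
0.3
42 n_base
137053 n_tri
548380 n_state_map
5126 n_tied_state
126 n_tied_ci_state
42 n_tied_tmat
"""


def parse_mdef_header(text):
    """Hypothetical helper: split the header into a version string and named counts."""
    lines = text.strip().splitlines()
    version = lines[0].strip()
    counts = {}
    for line in lines[1:]:
        value, name = line.split()
        counts[name] = int(value)
    return version, counts


version, counts = parse_mdef_header(HEADER)

# Total number of phone models: base phones plus triphones.
n_phones = counts["n_base"] + counts["n_tri"]

# State-map entries per phone (assumed: emitting states plus one final non-emitting state).
states_per_phone = counts["n_state_map"] // n_phones

# Context-independent senones per base phone.
ci_states_per_base = counts["n_tied_ci_state"] // counts["n_base"]

# Senones left over for context-dependent (triphone) states.
n_cd_senones = counts["n_tied_state"] - counts["n_tied_ci_state"]

print(n_phones, states_per_phone, ci_states_per_base, n_cd_senones)
```

Under these assumptions the header is self-consistent: the 5126 tied states consist of the 126 context-independent states plus the states tied across triphones.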

- Nickolay V. Shmyrev - 2019-05-06
  
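  The model is a PTM (phonetically tied mixture) model, so the Gaussians are shared across senones of the same base phone. Only the mixture weights differ. That is why the 'means' and 'variances' files contain one set of Gaussians per phone rather than one per senone.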
  
  The database for the model is not public.
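
To make the sharing concrete, here is a minimal toy sketch of the idea described in the reply. It is not PocketSphinx code: the phone names, the one-dimensional Gaussians, and all names such as `codebooks`, `senone_phone`, and `senone_likelihood` are made up for illustration. Real models use multi-dimensional Gaussians and far more densities per codebook.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


# One shared codebook of (mean, variance) pairs per base phone (toy values).
codebooks = {
    "AA": [(0.0, 1.0), (2.0, 1.0)],
    "B": [(-1.0, 0.5), (3.0, 2.0)],
}

# Each senone belongs to one base phone and therefore reuses that phone's codebook.
senone_phone = {0: "AA", 1: "AA", 2: "B"}

# Only the mixture weights are stored per senone.
mixture_weights = {
    0: [0.9, 0.1],
    1: [0.2, 0.8],
    2: [0.5, 0.5],
}


def senone_likelihood(senone, x):
    """Weighted sum over the Gaussians of the senone's base-phone codebook."""
    book = codebooks[senone_phone[senone]]
    weights = mixture_weights[senone]
    return sum(w * gaussian_pdf(x, m, v) for w, (m, v) in zip(weights, book))


# Senones 0 and 1 share the same Gaussians but score the same input differently.
score_0 = senone_likelihood(0, 0.0)
score_1 = senone_likelihood(1, 0.0)
print(score_0, score_1)
```

In this toy setup there are three senones but only two sets of Gaussians, which mirrors the situation in the question: the number of Gaussian sets follows the number of phones, while the number of weight vectors follows the number of senones.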