Could somebody please explain in words what each line means, the state IDs in particular?
During training (the bw program), if there is a training word with the phonemes AA, AE, AH, will the parameter files (means, variances, mixture weights) be updated against these state IDs (152, 156, 192, 193, 205 and 221) by bw?
I am going through the mdef.txt file to understand its components and I have the following questions.
Q1] How are the combinations of phones (base, left and right) in the mdef.txt file created? Is any linguistic knowledge applied?
> For example, the mdef of cmusphinx-en-in-5.2 has
> 46 n_base
> 176918 n_tri
> 707856 n_state_map
> 5138 n_tied_state
>
> whereas the mdef of cmusphinx-en-us-5.2 has
> 46 n_base
> 137053 n_tri
> 548396 n_state_map
> 5138 n_tied_state
My question is, after selecting 46 base phones, how are you determining how many trigrams have to be created?
Q2] I have the same question about n_tied_state = 5138. These are the senones, right? How is this number determined?
> Additionally, in the forum, I found one more:
> 42 n_base
> 137053 n_tri
> 548380 n_state_map
> 5126 n_tied_state
This has exactly 4 fewer phones, which means the number of senones = 5126 (5138 - [4 * 3]).
So, is there an arithmetic relation between the number of CI phones and the number of senones?
Q3] Is there a mapping between base/lft/rt phones to stateids 1/2/3?
Thank you.
Balaji.
Triphones, not trigrams. Sphinxtrain lists all possible triphones from the dictionary.
5000 senones from sphinxtrain configuration + 42 * 3 for base phones = 5126
I don't get this question.
Last edit: Nickolay V. Shmyrev 2019-10-13
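The counts quoted in this thread are consistent with simple arithmetic, sketched below in Python. The helper names are made up for illustration. The assumption of 3 emitting states per phone, plus one non-emitting exit state counted in n_state_map, is inferred from the header numbers rather than stated in the thread.

```python
# Assumptions (not stated in the thread): 3 emitting states per HMM,
# plus one non-emitting exit state counted in n_state_map.
STATES_PER_HMM = 3


def expected_tied_states(n_cd_senones, n_base, states_per_hmm=STATES_PER_HMM):
    """Configured context-dependent senones plus one senone per CI state."""
    return n_cd_senones + n_base * states_per_hmm


def expected_state_map(n_base, n_tri, states_per_hmm=STATES_PER_HMM):
    """Every phone (CI or triphone) lists its emitting states plus one exit state."""
    return (n_base + n_tri) * (states_per_hmm + 1)


headers = [
    # (n_base, n_tri, n_state_map, n_tied_state) as quoted in the thread
    (46, 176918, 707856, 5138),
    (46, 137053, 548396, 5138),
    (42, 137053, 548380, 5126),
]
for n_base, n_tri, n_state_map, n_tied in headers:
    assert expected_tied_states(5000, n_base) == n_tied
    assert expected_state_map(n_base, n_tri) == n_state_map
```

All three headers pass both checks, so under these assumptions the difference of 12 senones between the 46-phone and 42-phone models is just 4 phones times 3 states.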
> Sphinxtrain lists all possible triphones from the dictionary.
Each word is picked from the dictionary, and the triphones are created from its transcription. For example,
abductor AE B D AH K T ER
We will create 7 triphones like this:
~~~
<sil> AE B
AE B D
B D AH
D AH K
AH K T
K T ER
T ER <sil>
~~~
Is my understanding correct?
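The per-word enumeration described in this post can be sketched in Python as follows. It only illustrates the procedure as the post describes it and is not SphinxTrain's actual code: the function name `word_triphones` is hypothetical, and padding the word edges with `<sil>` is the post's own assumption, so the sketch ignores contexts that span word boundaries.

```python
def word_triphones(phones, boundary="<sil>"):
    """Return (left, base, right) triples for one dictionary transcription."""
    # Pad both ends so the first and last phones also get a context.
    padded = [boundary] + list(phones) + [boundary]
    return [tuple(padded[i - 1:i + 2]) for i in range(1, len(padded) - 1)]


entry = "abductor AE B D AH K T ER"
word, *phones = entry.split()
for left, base, right in word_triphones(phones):
    print(left, base, right)
```

For a transcription of 7 phones this yields 7 triples, one per base phone, in the same order as the list in the post.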
I tried searching the sphinxtrain configuration for the 5000 senones. I got the following from sphinxtrain\etc\sphinx_train.cfg :
Can you please point out where the 5000 senones are specified?
Hello,
I am trying to understand the contents of mdef.txt and need some help.
Sample lines in mdef.txt are like this:
Thank you.
Last edit: Balaji 2020-11-07
I haven't got any response, so I am making the question a little more specific:
1. How is this mdef file generated? Is it edited manually, or does a program generate it after reading the contents of the dict file?
2. In the three model definition lines above, all three have AA AA as the left and base phones. Shouldn't state ID (1) then be 152, 152 and 152? Why is it 156 on the third line?
Sorry if these are very basic questions. As I understand it, these state IDs are the keys into acoustic model files such as means and variances, so I need to understand how they are assigned to triphones.
Please help.
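To make the role of the state IDs concrete, here is a minimal Python sketch that splits one model definition row into its fields. It assumes the column layout given in the mdef header comment (base, lft, rt, p, attrib, tmat, state IDs, then a terminating `N`). The function name and the sample row, including its ID values, are invented for illustration and are not taken from any real model.

```python
from typing import NamedTuple, Tuple


class MdefRow(NamedTuple):
    base: str
    left: str
    right: str
    position: str
    attrib: str
    tmat: int
    state_ids: Tuple[int, ...]


def parse_mdef_row(line):
    """Split one whitespace-separated mdef row; 'N' marks the non-emitting state."""
    fields = line.split()
    base, left, right, position, attrib, tmat = fields[:6]
    state_ids = tuple(int(f) for f in fields[6:] if f != "N")
    return MdefRow(base, left, right, position, attrib, int(tmat), state_ids)


# Hypothetical row: the IDs are placeholders, not values from a real mdef.
row = parse_mdef_row("AA AA AE i n/a 1 100 101 102 N")
for index, senone in enumerate(row.state_ids):
    # Each ID is the tied-state (senone) index used to look up parameters.
    print(f"state {index} of {row.base}({row.left},{row.right}) -> senone {senone}")
```

Parsing several rows this way and comparing their `state_ids` tuples column by column is one way to see which triphones share a given senone.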