The training went OK.
But I have some questions about the files that result from the training.
Do I need to keep all of them to run with Sphinx2?
The files that I keep:
sendump
phone
map
*.chmm
*.var
*.vec
But what about the .xcod, .ccode, .d2code and .p3code files?
Do I need to keep these files?
To reduce the size of my sendump file, I want to build an 8-bit senone dump file, but I don't understand the manual, and I can't find the pdf32to8 program.
Any other ideas for reducing the size of my acoustic models?
thx
Anonymous
-
2001-11-01
Hi there,
I am using the Sphinx 3 trainer to train a limited vocabulary (say 50 words). I got the impression from the trainer manual (scriptman1) that I only need to do CI training for them.
What I did:
- Built a .raw file (audio training data) with the Sphinx 2 "addrec" executable.
- Built a set of .mfc files from the set of .raw files using the Sphinx 3 "wave2feat".
- Built the ci_model_def file with "mk_mdef_gen", where the inputs are the phone list, the main dictionary and the filler dictionary.
- I guessed that the content of the "transcript" file is nothing but the training data written out in English. Suppose I wanted to train the words "TRAIN" and "WORD"; then the content of the transcript file would be:
TRAIN WORD
Am I right?
- Then I followed the training manual to do the flat initialization of the transition matrices, mixture weights, global means and global variances of the vectors.
- Then I followed the instructions to do the "CI Training" with Baum-Welch iterations (the executable is bw).
- Now I have the tmatfn, mixwtfn, meanfn and varfn files as the output.
Could you please let me know how you get the *.chmm files (for me, "TRAIN.chmm" and "WORD.chmm")?
And also, how do you get the *.vec and *.var files for the different vectors?
And how do I use these files as inputs to the Sphinx decoder, so that we can test whether the words "TRAIN" and "WORD" are recognized?
Any input and guidance from you will be highly appreciated.
Many thanks in advance,
Palash,
Hi,
I have a question about the training step. When I train the CI model, I get a crash in the forward step.
The BW error comes from init_gau. When you initialize the Gaussians, you have two choices. One is to initialize the means and variances from the phone model (produced by mk_model_def) and aligned speech data. The other is to generate a global mean/var, which needs neither. But then the dimensions are a problem: the result is a 1*4*1 mean/var matrix. It does not fit BW, because the forward algorithm relies on the mapping between Gaussian mixtures and phones. The trouble is that the help manual recommends the second way in init_gau. I cannot use the first way either, because I cannot run forced alignment to get alignment data: the turtle acoustic model distributed with Sphinx cannot align real data effectively. This is a dilemma for me. Can you give me some suggestions?
if (mdef)
{
    acmod_set = mdef->acmod_set;
    n_ts = mdef->n_tied_state;
}
else
{
    acmod_set = NULL;
    n_ts = 1; /* Global mean/var - 628580 */
}
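To make the dimension mismatch concrete: with no mdef, n_ts is 1, so only a single global block is produced, while the forward pass looks up one block per tied state. A minimal sketch of the idea of copying that one block to every tied state follows. It is not SphinxTrain code; replicate_global, block_len and the flattened layout are hypothetical names and assumptions chosen only for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of SphinxTrain.
 * Copies one global parameter block (all feature streams and densities,
 * flattened into block_len floats) into each of n_ts tied states.
 * Returns a malloc'd array of n_ts * block_len floats that the caller
 * must free, or NULL if the allocation fails. */
float *replicate_global(const float *global, size_t block_len, size_t n_ts)
{
    assert(global != NULL && block_len > 0 && n_ts > 0);

    float *out = malloc(n_ts * block_len * sizeof *out);
    if (out == NULL)
        return NULL;

    /* State s occupies out[s * block_len .. (s + 1) * block_len - 1]. */
    for (size_t s = 0; s < n_ts; s++)
        memcpy(out + s * block_len, global, block_len * sizeof *out);

    return out;
}
```

With a layout like this, every tied state starts from the same global mean (or variance), so the array has one entry per state instead of a single 1*4*1 block.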