I have built a triphone model:
0.3
71 n_base
37800 n_tri
151484 n_state_map
3213 n_tied_state
213 n_tied_ci_state
71 n_tied_tmat
Since the model is huge, it takes a long time to initialize.
1. Is it possible to reduce the number of senones?
2. How can I identify the triphones which are not properly trained?
The number of senones is small; it's only 3213. Your problem is the huge number of
triphones in the triphone-to-senone mapping.
The acoustic model doesn't contain triphones, it contains senones. Most likely all
your senones are properly trained, since there are only about 3k of them.
If your problem is the size of the mdef file, you can do the following to reduce it:
1) Convert it to binary format with pocketsphinx_mdef_convert to speed up
loading and reduce its size.
2) Drop rare triphones from the mdef file. You can strip rare triphones from the
mapping according to their frequency in the training database. You need to
write an application to calculate the frequencies yourself. Note that this can reduce
accuracy when you decode rare triphones.
This is not a triphone model. It's a tied-state acoustic model.
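For option 2 above, the frequency counting could look roughly like the sketch below. It is only an illustration under stated assumptions: the lexicon is a plain dict from word to phone list, utterance edges are padded with a filler phone called SIL, and word positions are marked b/i/e/s (begin, internal, end, single). None of the function names come from sphinxtrain; they are made up for this example, and the result still has to be matched against the triphone lines of your own mdef.txt by hand.

```python
from collections import Counter


def utterance_triphones(words, lexicon, sil="SIL"):
    """Yield (base, left, right, position) for every phone of one utterance."""
    phones = []  # list of (phone, word-position marker)
    for word in words:
        pron = lexicon[word]
        for i, phone in enumerate(pron):
            if len(pron) == 1:
                pos = "s"          # single-phone word
            elif i == 0:
                pos = "b"          # word-initial
            elif i == len(pron) - 1:
                pos = "e"          # word-final
            else:
                pos = "i"          # word-internal
            phones.append((phone, pos))
    # Pad with the assumed filler phone so edge phones get a context.
    padded = [sil] + [p for p, _ in phones] + [sil]
    for i, (phone, pos) in enumerate(phones):
        yield (phone, padded[i], padded[i + 2], pos)


def count_triphones(transcripts, lexicon):
    """Count triphone occurrences over a list of word lists."""
    counts = Counter()
    for words in transcripts:
        counts.update(utterance_triphones(words, lexicon))
    return counts


def rare_triphones(counts, min_count):
    """Return the triphones seen fewer than min_count times."""
    return {tri for tri, n in counts.items() if n < min_count}
```

Triphones returned by rare_triphones are the candidates for removal; anything removed falls back to a less specific model at decoding time, which is where the accuracy loss mentioned above comes from.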
I converted the mdef from text format to binary using pocketsphinx_mdef_convert,
but I get the following error.
Where do I change the parameter to indicate that it is a binary file?
live:
Parsing file decoders.list ...done parsing decoders.list
Initializing first decoder: Speaker Independent Model ...
... done initializing
Changing to Speaker Independent Model recognizer ...
Warning: Can't allocate recognizer java.io.StreamCorruptedException:
error matching expected string '0.3' in line: 'BMDF' at line 1 in file null
... done changing
BMDF means you are still trying to load a binary mdef.
Yes, I am trying to load a binary mdef.
Is it not possible to load a binary mdef in Sphinx4? I get the following error:
Warning: Can't allocate recognizer java.io.StreamCorruptedException:
error matching expected string '0.3' in line: 'BMDF' at line 1 in file null
Binary mdef is not supported in sphinx4
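A quick way to tell which kind of mdef a model folder contains is to look at the first bytes of the file. The sketch below assumes only what the error message above shows: a binary mdef starts with the characters BMDF, and anything else is treated as text. The function name is made up for this example.

```python
def mdef_format(path):
    """Return "binary" if the file starts with the BMDF marker, else "text"."""
    with open(path, "rb") as handle:
        magic = handle.read(4)
    # Assumption: only the BMDF marker seen in the error message is checked.
    return "binary" if magic == b"BMDF" else "text"
```

If this reports "binary" for the file named mdef in the model folder, sphinx4 will fail with the StreamCorruptedException shown above.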
I have adapted en-us following the tutorial at http://cmusphinx.sourceforge.net/wiki/tutorialadapt
I have both files, mdef and mdef.txt, but I still get the error:
error matching expected string '0.3' in line: 'BMDF' at line 1 in file null
What should I do?
Also, the mdef files in the folders en-us and en-us-adapt are identical. Does this mean that I have not done the adaptation properly?
Please provide more details on what you are running.
No, the mdef file is expected to be the same; only the means should change.
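To see which files actually changed between the original and the adapted folder, a byte-for-byte comparison is enough. This is a minimal sketch using Python's standard filecmp module; the function name and the folder names in the usage note are just examples.

```python
import filecmp
from pathlib import Path


def changed_files(original_dir, adapted_dir):
    """Return names of files present in both folders whose contents differ."""
    original, adapted = Path(original_dir), Path(adapted_dir)
    changed = []
    for path in sorted(original.iterdir()):
        twin = adapted / path.name
        if path.is_file() and twin.is_file():
            # shallow=False forces a real content comparison, not just stat().
            if not filecmp.cmp(path, twin, shallow=False):
                changed.append(path.name)
    return changed
```

Calling changed_files("en-us", "en-us-adapt") should list means and leave out mdef. An empty list would mean the adaptation step did not write anything new.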
Actually, I am trying to adapt the model to Indian-accented English for use in sphinx4.
I am following the tutorial at http://cmusphinx.sourceforge.net/wiki/tutorialadapt
I have a good source of Indian-accent wav files from CMU Arctic at the following link:
http://festvox.org/cmu_arctic/cmu_arctic/packed/. The cmu_us_ksp_arctic-0.95-release.tar.bz2 file at that link contains the Indian accent.
I have also found ready-made files which are needed for adaptation, arcticAll.fileids and arcticAll.transcription, at the link:
https://github.com/romanows/Sphinx-4-Acoustic-Model-Adaptation-Data
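Since the fileids and transcription files here come from a different source than the recordings, it may be worth checking that they line up before running bw. The sketch below assumes the usual layout from the adaptation tutorial: one file id per line in the fileids file, and each transcription line ending with the matching id in parentheses, for example "<s> some words </s> (arctic_0001)". The function name is made up for this example.

```python
import re
from itertools import zip_longest

# Matches a trailing "(some_id)" at the end of a transcription line.
_TRAILING_ID = re.compile(r"\(([^()\s]+)\)\s*$")


def mismatched_ids(fileids_lines, transcription_lines):
    """Return (line_number, file_id, transcript_id) for every pair that disagrees."""
    ids = [line.strip() for line in fileids_lines if line.strip()]
    trans = [line.strip() for line in transcription_lines if line.strip()]
    problems = []
    for number, (file_id, line) in enumerate(zip_longest(ids, trans), 1):
        match = _TRAILING_ID.search(line) if line is not None else None
        transcript_id = match.group(1) if match else None
        # A file id may include a directory part; compare the base name only.
        base = file_id.split("/")[-1] if file_id is not None else None
        if transcript_id is None or transcript_id != base:
            problems.append((number, file_id, transcript_id))
    return problems
```

An empty result means every recording is paired with a transcript in the same order. Anything else points at the first lines to inspect.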
I now understand why the error "error matching expected string '0.3' in line: 'BMDF' at line 1 in file null" was occurring.
I renamed mdef.txt to mdef after backing up the original mdef, which was in binary format (mdef.txt is in text format).
The program for converting speech (from a wav file) to text is working fine. But even after following the adaptation steps I am still seeing the same results as before. For example, "Hello, how are you?" is recognized as "oh! how are you are". I think it is because of my Indian accent. Even after following the adaptation tutorial, the output is exactly the same. No change at all.
The commands I have run so far are:
$ cp -a /usr/local/share/pocketsphinx/model/en-us/en-us .
$ cp -a /usr/local/share/pocketsphinx/model/en-us/cmudict-en-us.dict .
$ cp -a /usr/local/share/pocketsphinx/model/en-us/en-us.lm.bin .
$ sphinx_fe -argfile en-us/feat.params \
    -samprate 16000 -c arctic20.fileids \
    -di . -do . -ei wav -eo mfc -mswav yes
$ pocketsphinx_mdef_convert -text en-us/mdef en-us/mdef.txt
$ cp /usr/lib/sphinxtrain/sphinxtrain/bw .
$ cp /usr/lib/sphinxtrain/sphinxtrain/map_adapt .
$ cp /usr/lib/sphinxtrain/sphinxtrain/mk_s2sendump .
$ python /usr/lib/sphinxtrain/python/cmusphinx/sendump.py en-us/sendump en-us/mixture_weights
$ ./bw \
    -hmmdir en-us \
    -moddeffn en-us/mdef.txt \
    -ts2cbfn .ptm. \
    -feat 1s_c_d_dd \
    -svspec 0-12/13-25/26-38 \
    -cmn current \
    -agc none \
    -dictfn cmudict-en-us.dict \
    -ctlfn arcticAll.fileids \
    -lsnfn arcticAll.transcription \
    -accumdir .
$ cp /usr/lib/sphinxtrain/sphinxtrain/mllr_solve .
$ ./mllr_solve \
    -meanfn en-us/means \
    -varfn en-us/variances \
    -outmllrfn mllr_matrix -accumdir .
$ cp -a en-us en-us-adapt
$ ./map_adapt \
    -meanfn en-us/means \
    -varfn en-us/variances \
    -mixwfn en-us/mixture_weights \
    -tmatfn en-us/transition_matrices \
    -accumdir . \
    -mapmeanfn en-us-adapt/means \
    -mapvarfn en-us-adapt/variances \
    -mapmixwfn en-us-adapt/mixture_weights \
    -maptmatfn en-us-adapt/transition_matrices
$ ./mk_s2sendump \
    -pocketsphinx yes \
    -moddeffn en-us-adapt/mdef.txt \
    -mixwfn en-us-adapt/mixture_weights \
    -sendumpfn en-us-adapt/sendump
The important parts of the output are as follows.
./mllr_solve prints the following for the 0th to the 12th regression, three times over:
INFO: mllr.c(182): Computing both multiplicative and additive part of MLLR
INFO: mllr.c(186): Estimation of 0 th regression in MLLR failed
./map_adapt has the following output in the end :
INFO: s3mixw_io.c(116): Read ./mixw_counts [5126x3x128 array]
INFO: s3tmat_io.c(115): Read ./tmat_counts [42x3x4 array]
INFO: main.c(77): Estimating tau hyperparameter from variances and observations
Segmentation fault (core dumped)
I have also read somewhere else that map_adapt is not required for training sphinx4. Does the segmentation fault still matter?
The map adaptation command in the tutorial is different from the map_adapt command you are using. You need to use the proper command from the tutorial.
It reports an error that two arguments are unknown:
-moddeffn en-us/mdef.txt and -ts2cbfn .ptm.
and the command doesn't execute at all.
So I removed those two arguments from it.
Last edit: Tejesh Raut 2016-06-29
You should have updated sphinxtrain instead.
Thank you for answering my queries so far.
What do you mean by "update sphinxtrain instead"?
I tried upgrading sphinxtrain from the terminal. It says that sphinxtrain is already the newest version:
Reading package lists... Done
Building dependency tree
Reading state information... Done
sphinxtrain is already the newest version.
You need to install sphinxtrain from source or ask your package maintainer to update sphinxtrain in your distribution.
Thanks for your support.
I have completed the adaptation without any errors.
I installed sphinxbase and sphinxtrain from the GitHub repositories using the makefiles.
Everything is working fine. The means files in the adapted folder are different from the original. But there is still no change in the output for the wav files which I had recorded.
My main aim is to adapt sphinx to my accent. Will I need something other than adaptation to handle my accent?
To get help with accuracy issues, please provide all the data: adaptation files, test files, models, and logs.
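When reporting accuracy, a number is easier to compare than individual sentences. One common measure is word error rate: the word-level edit distance between the reference text and the decoder output, divided by the number of reference words. The sketch below is a plain implementation of that idea, not a CMUSphinx tool; it assumes punctuation has already been removed from both strings, and the function names are made up for this example.

```python
def word_errors(reference, hypothesis):
    """Word-level edit distance (substitutions + insertions + deletions)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    previous = list(range(len(hyp) + 1))  # distance from empty reference
    for i, ref_word in enumerate(ref, 1):
        current = [i]
        for j, hyp_word in enumerate(hyp, 1):
            current.append(min(
                previous[j] + 1,                           # deletion
                current[j - 1] + 1,                        # insertion
                previous[j - 1] + (ref_word != hyp_word),  # match or substitution
            ))
        previous = current
    return previous[-1]


def word_error_rate(reference, hypothesis):
    """Edit distance divided by the number of reference words."""
    n_ref = len(reference.split())
    if n_ref == 0:
        raise ValueError("reference must contain at least one word")
    return word_errors(reference, hypothesis) / n_ref
```

For the example from this thread, word_error_rate("hello how are you", "oh how are you are") gives 0.5: one substitution and one insertion against four reference words. Running the same test files through the original and the adapted model and comparing the two rates shows whether adaptation changed anything.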