Hello!
First of all, thanks to the team for providing such a great toolkit!
I have tried the LDA/MLLT step in my training (only one iteration, lda_dim = 29). It seems to work, but:
1/ I "only" get an absolute gain of 1.5% (WER = 51.5% with LDA/MLLT, WER = 53% without).
2/ I get two similar errors in my log files (for the LDA and MLLT steps):
./../sphinxtrain_cont8000_numgau32_ldamllt/logdir/01.lda_train/callsurf.N-1.bw.log:ERROR: "s3gau_full_io.c", line 129: Failed to read full covariance file
However, the LDA and MLLT steps complete, and I get, as expected, the 'training_xxx.lda' file (39x39 matrix) and the 'training_xxx.mllt' file (29x39 matrix). But I don't understand the meaning of those errors, and I don't know whether they can be resolved.
3/ On the wiki page, the author says: "The reason is that it's necessary to do some parts of training several times over. ... This has to be done for each feature transformation (currently there are two of them as they have been found to have additive effects)."
I don't really understand this paragraph. Do I have to re-run the LDA step with a new bootstrap acoustic model (from the preceding training)? If so, for the LDA & MLLT step, the 'bw' options to modify are:
Thanks in advance for your response!
Stephan
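For readers trying to picture how the 39x39 '.lda' matrix and the 29x39 '.mllt' matrix from point 2/ above fit together, here is a minimal numpy sketch. The matrix contents are random placeholders, and treating the 29x39 file as the composed MLLT times reduced-LDA transform is an assumption for illustration, not something taken from the SphinxTrain documentation:

import numpy as np

# Stand-ins for the trained transforms; only the shapes match the files named above.
lda = np.random.randn(39, 39)      # full LDA eigenvector matrix, rows sorted by eigenvalue
mllt = np.random.randn(29, 29)     # square MLLT rotation in the reduced space (assumed)

lda_reduced = lda[:29, :]          # keep the lda_dim = 29 most discriminative rows -> 29x39
combined = mllt @ lda_reduced      # composed transform -> 29x39, like 'training_xxx.mllt'

frame = np.random.randn(39)        # one 39-dimensional feature frame
print((combined @ frame).shape)    # (29,): the decoder would see 29-dimensional features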
Why sphinx3 and not Sphinx4?

> 1/ I "only" get an absolute gain of 1.5% (WER = 51.5% with LDA/MLLT, WER = 53% without).

It means that the quality of your acoustic model doesn't really contribute to the WER. Most likely your model is overtrained, or your language model / language weight isn't properly tuned.

> ./../sphinxtrain_cont8000_numgau32_ldamllt/logdir/01.lda_train/callsurf.N-1.bw.log:ERROR: "s3gau_full_io.c", line 129: Failed to read full covariance file
> But I don't understand the meaning of those errors, and I don't know whether they can be resolved.

This error is expected and you can ignore it. Moreover, in recent trunk it is not shown at all.

> Do I have to re-run the LDA step with a new bootstrap acoustic model (from the preceding training)?

No, the scripts already take care of that. Training done in stage 01 is repeated in stage 02 and repeated again in stage 20.
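On the language model / language weight point above: here is a throwaway Python sketch of the metric being compared and of what the reported gain means. It is purely illustrative and has nothing to do with the SphinxTrain scripts; in practice, tuning just means re-decoding a development set with a few different language weights (the decoder's -lw value) and keeping the one with the lowest WER.

# Word error rate via edit distance (substitutions + insertions + deletions).
def wer(ref_words, hyp_words):
    d = [[0] * (len(hyp_words) + 1) for _ in range(len(ref_words) + 1)]
    for i in range(len(ref_words) + 1):
        d[i][0] = i
    for j in range(len(hyp_words) + 1):
        d[0][j] = j
    for i in range(1, len(ref_words) + 1):
        for j in range(1, len(hyp_words) + 1):
            cost = 0 if ref_words[i - 1] == hyp_words[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution / match
    return d[-1][-1] / len(ref_words)

print(wer("the cat sat".split(), "the cat sat down".split()))  # 0.333... (one insertion)
# The reported gain: going from 53.0% to 51.5% is 1.5% absolute, i.e. roughly
print(round((53.0 - 51.5) / 53.0 * 100, 1))                    # 2.8% relative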
> It means that the quality of your acoustic model doesn't really contribute to the WER. Most likely your model is overtrained, or your language model / language weight isn't properly tuned.

OK for the language model / language weight tuning, but do you mean that an overtrained acoustic model could also explain that?

> Why sphinx3 and not Sphinx4?

Mostly because I'm not familiar with Java code... but I'll try it, of course.

> This error is expected and you can ignore it. Moreover, in recent trunk it is not shown at all.

OK.

> No, the scripts already take care of that. Training done in stage 01 is repeated in stage 02 and repeated again in stage 20.

OK.

Thanks again for your help!
Stephan
I suggest you try pocketsphinx instead of sphinx3, for the reasons outlined on the website.

By "overtrained" he means with respect to the data the model has already seen. For example, one speaker could account for a disproportionate percentage of the audio, so having even more audio from that person won't help adaptation.
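A quick way to check for that kind of imbalance is to tally how much of the training set each speaker accounts for. A rough Python sketch, assuming the utterance IDs in your fileids list carry a speaker prefix like "speaker01/utt_0001"; that naming scheme and the file path are guesses about your setup, not SphinxTrain requirements:

from collections import Counter

counts = Counter()
with open("etc/training_xxx.fileids") as f:       # path guessed from the file names in this thread
    for line in f:
        utt = line.strip()
        if utt:
            counts[utt.split("/")[0]] += 1        # first path component taken as the speaker (assumed)

total = sum(counts.values())
for speaker, n in counts.most_common():
    print(f"{speaker}: {n} utterances ({100.0 * n / total:.1f}% of the data)")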
> I suggest you try pocketsphinx instead of sphinx3, for the reasons outlined on the website.

OK, but I understood that pocketsphinx was dedicated to mobile devices and light models.
My goal here is to build an acoustic model for conversational speech: large vocabulary, noisy environment, 2 speakers (overlapping speech). I know it's hard, but it's just an attempt.
So I think I'll switch to Sphinx4 in order to evaluate things that are not available with Sphinx3: lattice rescoring, PLP extraction, unsupervised and online acoustic adaptation (as soon as it's available)...
Thanks again,
Stephan

> By "overtrained" he means with respect to the data the model has already seen. For example, one speaker could account for a disproportionate percentage of the audio, so having even more audio from that person won't help adaptation.

OK.