Hello Nicolay,
During one of the discussions on VoxForge you mentioned that incremental
training is possible in SphinxTrain (at least that is how I understood the
conclusion). By incremental training I mean training in which a previous model
is given and new audio and transcription files are provided to extend it
(not adaptation, but full training on these extra files only). Is it really
possible? Could you sketch an algorithm for using the SphinxTrain modules to
get this result? Or any suggestions on where to look in order to get such
functionality?
Hello Marek,
There is nothing tricky here. SphinxTrain is a collection of binary utilities
for working with models. The main ones are 'bw', which accumulates counts, and
'norm', which reestimates the model from those counts. So in your scripts you
can just use those two commands to update your existing model with new data.
A simple case would be to replace the fileids and transcription files and run
a reestimation iteration using the SphinxTrain Perl scripts.
Very outdated docs are here:
http://www.speech.cs.cmu.edu/sphinxman/fr4.html
But it's probably easier to cook something up yourself.
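For illustration, one such reestimation pass over the new data could be sketched in Python as below. This is only a sketch under assumptions, not the way the SphinxTrain scripts do it: the helper names (bw_command, norm_command) are made up, the model file names (means, variances, mixture_weights, transition_matrices) assume the usual model_parameters layout, and the flag names are the ones commonly passed to 'bw' and 'norm'. Check them against the usage text of your own build, since some versions expect extra options.

```python
from pathlib import Path

# Assumed file names inside a SphinxTrain model_parameters/<model> directory.
PARAM_FILES = {
    "-mixwfn": "mixture_weights",
    "-tmatfn": "transition_matrices",
    "-meanfn": "means",
    "-varfn": "variances",
}


def bw_command(old_model, mdef, dictionary, filler_dict, ctl, transcript,
               cepdir, feat, accum_dir, ts2cb=".semi."):
    """Build the argv for one 'bw' run: read the OLD model, accumulate
    counts from the NEW fileids/transcription into accum_dir."""
    old_model = Path(old_model)
    cmd = ["bw", "-moddeffn", str(mdef), "-ts2cbfn", ts2cb]
    for flag, name in PARAM_FILES.items():
        cmd += [flag, str(old_model / name)]
    cmd += [
        "-dictfn", str(dictionary),
        "-fdictfn", str(filler_dict),
        "-ctlfn", str(ctl),            # new fileids
        "-lsnfn", str(transcript),     # new transcription
        "-cepdir", str(cepdir),        # features extracted from the new audio
        "-feat", feat,                 # must match the feature type of the old model
        "-accumdir", str(accum_dir),
    ]
    return cmd


def norm_command(accum_dir, new_model):
    """Build the argv for 'norm': turn the accumulated counts into
    reestimated parameter files written to new_model."""
    new_model = Path(new_model)
    cmd = ["norm", "-accumdir", str(accum_dir)]
    for flag, name in PARAM_FILES.items():
        cmd += [flag, str(new_model / name)]
    return cmd
```

To actually run a pass, create the accumulator and output directories, then hand each list to subprocess.run(cmd, check=True), first bw and then norm. For another iteration, use the freshly written model directory as old_model.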
Dear sir,
I tried an incremental training experiment and got an error. I used the Perl
scripts and trained in semi mode.
First, I trained a semi model on a large training database. When it finished,
I had a cd_semi_5000 model.
Second, I replaced the large training database with a small one to train
some special models, then ran the script 50.cd_hmm_tied/slave_convg.pl from
iteration 2 (to skip the initial step), so that the last trained model is
kept.
Nothing unusual happened until it finished.
But when I checked the model parameters, I found an all-zero transition
matrix in the transition_matrices file.
Here is my transition_matrices file:
http://dl.dropbox.com/u/5137777/transition_matrices
Is this an error? Could you tell me why it happens?
Thanks in advance.
No idea. You definitely need to provide way more information, like logs,
files, and the training folder.
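As a first diagnostic, it can help to list exactly which rows of which matrices are broken. A minimal sketch, assuming the transition matrices have already been dumped to text (SphinxTrain's printp tool with -tmatfn is one way) and parsed into nested Python lists; the parsing is left out because the dump format is not shown in this thread, and the function name is made up for illustration:

```python
def find_degenerate_rows(tmats, tol=1e-3):
    """Return (matrix_index, row_index) for every row whose transition
    probabilities do not sum to 1 within tol (an all-zero row is the
    extreme case)."""
    bad = []
    for m, matrix in enumerate(tmats):
        for r, row in enumerate(matrix):
            if abs(sum(row) - 1.0) > tol:
                bad.append((m, r))
    return bad
```

Comparing the indices it reports with the phones that occur in the small training set would show whether the broken matrices are the ones that received no data.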
Here is my project folder:
http://dl.dropbox.com/u/5137777/dec082010_fanctrl.tar.gz
I changed line 58 of script_pl/50.cd_hmm_tied/slave_convg.pl from
my $iter = 1; to my $iter = 2;
I copied all files in the "model_structure" and "model_parameters"
directories of the previous training run into this project and ran the script:
script_pl/50.cd_hmm_tied/slave_convg.pl
I'm afraid the current scripts are not ready for incremental training, so the
$iter variable has nothing to do with the concept I described in the
initial post. I believe the *_slave scripts of most of the training stages
would have to be modified in order to get the expected functionality.
Marek