So I'm trying to train a new model for mi language. For now I'm just using 5.6 hours of data to debug the process (I intend to scale up to more data once it's debugged).
Using the same configuration (except for binary paths) and the exact same data (the 5.6 hours) on my Mac, this process runs to completion and I get a model with 18% WER. However, when I try to run the exact same setup inside a Docker container, Sphinxtrain stops at the Phase 2: Flat initialize step, waiting for the mixture weights file - and doesn't continue from there no matter how long I wait.
FWIW it gets stuck at this loop in 20.ci_hmm/baum_welch.pl:
    Log("Baum welch starting for $n_gau Gaussian(s), iteration: $iter ($part of $npart)",
        'result');
    # Sometimes NFS causes the mixture weight file to not be visible yet (?!!)
    until (-f $mixwfn) {
        print "Waiting for $mixwfn\n";
        sleep 1;
    }
This is ubuntu:17.10, with sphinxtrain and sphinxbase installed from the GitHub source - per the Dockerfile, which is also included in the attachment. I have included the logdir, etc and other directories (though obviously not all the training data) in the attachment.
Thanks in advance for any help. And if it's Nickolay who reads this, let me say in advance thanks so much for all your help - I've been getting a lot of help along the way from reading your other comments but seem to be stuck on this one!
https://www.dropbox.com/sh/njy6wqaiz3tsena/AAAaJJ_p5a-zvsicoiMRGt80a?dl=0
PS: For anyone who reads this in the future, the link above will only work for a limited time, sorry.
Most likely it still fails to find the shared library. You need to run the sphinxtrain binaries in libexec from the command line inside Docker and see what happens.
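One way to run that check, as a minimal sketch: `check_tool` below is a hypothetical helper (it is not part of SphinxTrain), and whichever directory you pass to it is your own assumption about where the install put the binaries. It reports a missing file, and otherwise asks `ldd` whether any shared library is unresolved.

```shell
#!/bin/sh
# Hypothetical helper, not part of SphinxTrain.
# Usage: check_tool <bin_dir> <tool_name>
check_tool() {
    tool="$1/$2"
    # A training step cannot run a binary that is absent or not executable.
    if [ ! -f "$tool" ] || [ ! -x "$tool" ]; then
        echo "MISSING: $tool is not an executable file"
        return 1
    fi
    # ldd prints "not found" next to each shared library it cannot resolve.
    if ldd "$tool" 2>/dev/null | grep -q "not found"; then
        echo "UNRESOLVED LIBS: $tool"
        ldd "$tool" 2>/dev/null | grep "not found"
        return 2
    fi
    echo "OK: $tool"
}
```

Inside the container you would call it with the directory your config points at and a tool name from the thread, for example `check_tool "$YOUR_BIN_DIR" mk_mdef_gen`, where `YOUR_BIN_DIR` is a placeholder.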
Thank you for trying cmusphinx.
Ah! Yes it was failing to find the relevant file. I've now fixed it by changing my config (in etc/sphinx_train.cfg) from
to
This sorted it out - I think. At least we're past that step successfully.
For others who come after: I found this, sort of per your suggestion, by adding some debugging to the RunTool method in /usr/local/lib/sphinxtrain/scripts/lib/SphinxTrain/Util.pm - I eventually realised that it was failing to find the binary at all (mk_mdef_gen in this case).
It might be helpful if RunTool in Util.pm output the path of the binary (fcmd) that it's looking for - especially when it doesn't look like it's there as a file - instead of just running cmd as a bare command and hoping it's on the path.
I also think I got into this problem by running the "sphinxtrain -t an4 setup" step (which is referenced on https://cmusphinx.github.io/wiki/tutorialam/) too early. I had not yet run make install at that stage and just ran it directly in the place I had installed it - resulting in wrong paths in the template for the config.
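A quick way to catch stale paths like that before starting a training run, as a rough sketch. Both functions are hypothetical, and the parsing assumes the config contains a literal Perl assignment of the form `$CFG_BIN_DIR = "/some/dir";`. If your config builds the value from other variables, this sketch will not expand them.

```shell
#!/bin/sh
# Hypothetical helpers, not part of SphinxTrain.
# Assumes a literal assignment such as:  $CFG_BIN_DIR = "/some/dir";
# Perl variables inside the quoted value are NOT expanded.

# Usage: cfg_bin_dir <path/to/sphinx_train.cfg>
cfg_bin_dir() {
    sed -n 's/^[[:space:]]*\$CFG_BIN_DIR[[:space:]]*=[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Usage: check_cfg_bin_dir <path/to/sphinx_train.cfg>
check_cfg_bin_dir() {
    dir=$(cfg_bin_dir "$1")
    if [ -z "$dir" ]; then
        echo "No literal CFG_BIN_DIR assignment found in $1"
        return 1
    fi
    if [ ! -d "$dir" ]; then
        echo "CFG_BIN_DIR points at $dir, which does not exist"
        return 2
    fi
    echo "CFG_BIN_DIR=$dir"
}
```

Running `check_cfg_bin_dir etc/sphinx_train.cfg` from the training directory would then flag a binary directory that was written into the template before the install existed.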
Anyway I'm good now. Thanks for your help!
For instance, RunTool could just add a warning to the log: "Cannot find /usr/lib/sphinxtrain/mk_mdef_gen. Did you set CFG_BIN_DIR correctly? Trying to exec 'mk_mdef_gen' to see if it is in the path". That would save a ton of time for some people.
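To make the proposal concrete, here is a small sketch of that lookup-then-warn behaviour. It is illustrative only: `resolve_tool` is a made-up name, and this is not the actual RunTool code from Util.pm. It just models "look in the configured directory first, warn loudly, then fall back to the PATH".

```shell
#!/bin/sh
# Illustrative only: this is not the RunTool code from Util.pm.
# Usage: resolve_tool <bin_dir> <cmd>
# Prints the path it would execute; warnings go to stderr.
resolve_tool() {
    fcmd="$1/$2"
    if [ -f "$fcmd" ] && [ -x "$fcmd" ]; then
        echo "$fcmd"
        return 0
    fi
    echo "WARNING: cannot find $fcmd. Did you set CFG_BIN_DIR correctly?" >&2
    echo "WARNING: trying '$2' on the PATH instead" >&2
    if command -v "$2" >/dev/null 2>&1; then
        command -v "$2"
        return 0
    fi
    echo "ERROR: '$2' is not on the PATH either" >&2
    return 1
}
```

With a warning like this in the log, a wrong binary directory shows up at the first tool invocation instead of as a later step waiting on a file that never gets written.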