Hello,
I wanted to create a new Model from scratch with my own .wav samples.
At http://sourceforge.net/p/cmusphinx/discussion/help/thread/b58daa42
Nickolay helped me create a trial model, which still works great, and I can rerun the training steps successfully.
Now I want to extend the model with more .wav files and words/phones.
So I restarted the whole thing with the two folders:
BaseFolder
  etc
  wav
All files are in place (without the new files, just to test it first), and I get many errors saying it can't find .mfc files that have the same names as the .wav files.
I ran the training setup first and fixed the paths for Windows.
Where do the .mfc files come from?
The tutorial never says anything about creating files besides the ones in \ETC and \WAV.
(When I put the .mfc files back from the model created by Nickolay, it all works fine again.)
This is what I work with: https://onedrive.live.com/redir?resid=53DF68CA92747BA6%2164709
Last edit: Toine db 2014-11-09
http://cmusphinx.sourceforge.net/wiki/tutorialam
Last edit: Nickolay V. Shmyrev 2014-11-09
The .mfc files are created by the sphinx_fe binary in stage 000.comp_feats, which is the first stage of execution.
If the files were not created, it means there was an error in this stage. You can find additional details in the logs in the logdir folder.
You can share your training folder in order to get help on this issue.
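To narrow down which recordings failed in the feature extraction stage described above, a small script can list every .wav file that has no matching .mfc file. This is only a minimal sketch: the function name `find_missing_features` and the folder names `BaseFolder/wav` and `BaseFolder/feat` are assumptions for illustration, so point it at wherever your own setup keeps the audio and the generated feature files.

```python
from pathlib import Path


def find_missing_features(wav_dir, feat_dir, wav_ext=".wav", feat_ext=".mfc"):
    """Return relative .wav paths that have no matching feature file.

    A feature file "matches" when it has the same relative path and
    base name as the .wav file, but with the feature extension.
    """
    wav_root = Path(wav_dir)
    feat_root = Path(feat_dir)
    missing = []
    for wav_path in sorted(wav_root.rglob("*" + wav_ext)):
        rel = wav_path.relative_to(wav_root)
        if not (feat_root / rel).with_suffix(feat_ext).exists():
            missing.append(rel.as_posix())
    return missing


if __name__ == "__main__":
    # Hypothetical folder names; adjust them to your own training setup.
    for name in find_missing_features("BaseFolder/wav", "BaseFolder/feat"):
        print("no feature file for", name)
```

Any file it reports is a good starting point when searching the logs for the corresponding error message.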
Nickolay,
I took the latest version of trunk and built new versions of SphinxBase, SphinxTrain and PocketSphinx.
Now the .mfc files are being built, but I still get errors at this stage.
Here is the exact situation I'm trying to train: https://onedrive.live.com/redir?resid=53DF68CA92747BA6%2168006
(exactly the same as half a year ago when you did it for me)
Hope to hear from you.
And many thanks in advance.
PS: As an extra note, the training says it doesn't have enough data. When training your version half a year ago, it said there was not much data, but it continued. Maybe this is triggered by the errors in the previous stage?
Last edit: Toine db 2014-11-14
You can ignore this error, I suppose, if everything else works.
We currently check the amount of data and require at least 30 minutes of it for training.
Can this check be disabled? (hopefully)
Because I will never get that amount of data, but the previous training and testing proved Sphinx works with my data.
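I found this in verify_all.pl:
    if ($total_training_hours < 0.5) {
        $status = 'FAILED';
        $ret_value = -5;
        LogWarning("Not enough data for the training");
    } else {
Can I remove this check without consequences?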
Last edit: Toine db 2014-11-17
Yes, you can remove this check. However, you still need quite a lot of training data if you want to train your model properly.
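For anyone who wants to know how far their data is from the 30-minute requirement mentioned above, the total audio length can be estimated from the .wav headers. This is a rough sketch that assumes uncompressed PCM .wav files readable by Python's standard `wave` module. The names `wav_seconds`, `total_hours` and `MIN_TRAINING_HOURS` are made up for this example, and the training scripts may well compute their own figure differently.

```python
import wave
from pathlib import Path

# 30 minutes expressed in hours, matching the requirement mentioned above.
MIN_TRAINING_HOURS = 0.5


def wav_seconds(path):
    """Duration of one PCM .wav file in seconds, read from its header."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def total_hours(wav_dir):
    """Sum the durations of all .wav files under wav_dir, in hours."""
    seconds = sum(wav_seconds(p) for p in Path(wav_dir).rglob("*.wav"))
    return seconds / 3600.0


if __name__ == "__main__":
    # Hypothetical folder name; adjust it to your own training setup.
    hours = total_hours("BaseFolder/wav")
    print("total audio: %.3f hours" % hours)
    if hours < MIN_TRAINING_HOURS:
        print("below the 30-minute threshold")
```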
Thanks for the comment.
It seems to work properly, but maybe I'm misusing Sphinx a little bit for my concept, so it's not really speech recognition.
I'll let you know how it ends with testing and fine tuning.
Thanks again for the help.