My directory structure is as below:
Top level:
/root/project/SphinxTrain/scripts_pl/Test
Structure of etc dir:
/root/project/SphinxTrain/scripts_pl/Test/etc/sphinx_train.cfg
/root/project/SphinxTrain/scripts_pl/Test/etc/Test.dic
/root/project/SphinxTrain/scripts_pl/Test/etc/Test.fileids
/root/project/SphinxTrain/scripts_pl/Test/etc/Test.filler
/root/project/SphinxTrain/scripts_pl/Test/etc/Test.phone
/root/project/SphinxTrain/scripts_pl/Test/etc/Test.transcription
/root/project/SphinxTrain/scripts_pl/Test/etc/word.known
/root/project/SphinxTrain/scripts_pl/Test/etc/word.unknown
Structure of feat dir:
/root/project/SphinxTrain/scripts_pl/Test/feat/test0001.feat
Structure of wav dir:
/root/project/SphinxTrain/scripts_pl/Test/wav/test0001.raw
Contents of /root/project/SphinxTrain/scripts_pl/Test/etc/Test.fileids:
test0001
Contents of /root/project/SphinxTrain/scripts_pl/Test/etc/Test.transcription:
APRIL AUGUST DECEMBER EIGHT EIGHTEEN EIGHTEENTH EIGHTH EIGHTY ELEVEN ELEVENTH ENTER ERASE FEBRUARY (test0001)
First, I used sox to convert test0001.wav to test0001.raw.
Then I used bin/make_feats to generate the feat file, after changing the command in make_feats to:
bin/wave2feat -verbose -c -raw -di wav -ei raw -do feat -eo feat
I ran bin/make_feats etc/Test.fileids, and test0001.feat was generated.
Thanks in advance.
Regards, CL
Anonymous
-
2004-02-12
The structure of your directories appears to be correct, and the method you used to compute the .feat file is also good. The contents of your one-line Test.fileids file *appears* to be correct, yet the script warns that it was unable to parse that single line.
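For anyone who wants to repeat that check mechanically, here is a minimal sketch in Python. It is not part of SphinxTrain; the function name `check_training_set` is hypothetical, and it assumes the layout shown in the question (etc/NAME.fileids, etc/NAME.transcription, feat/ID.feat) and that both files list the same utterance IDs in the same order.

```python
import re
from pathlib import Path


def check_training_set(root, name, feat_ext="feat"):
    """Return a list of human-readable problems found under `root`.

    Assumed layout (taken from the question, not from SphinxTrain itself):
      root/etc/<name>.fileids, root/etc/<name>.transcription, root/feat/<id>.<feat_ext>
    """
    root = Path(root)
    problems = []

    fileids_path = root / "etc" / f"{name}.fileids"
    trans_path = root / "etc" / f"{name}.transcription"

    # One utterance ID per line; blank lines are ignored.
    fileids = [ln.strip() for ln in fileids_path.read_text().splitlines() if ln.strip()]

    # Each transcript line is expected to end with "(utterance_id)".
    trans_ids = []
    for n, line in enumerate(trans_path.read_text().splitlines(), start=1):
        if not line.strip():
            continue
        m = re.search(r"\(([^()\s]+)\)\s*$", line)
        if m:
            trans_ids.append(m.group(1))
        else:
            problems.append(f"transcription line {n}: no trailing (utterance_id)")

    # Every utterance ID needs a feature file.
    for uid in fileids:
        if not (root / "feat" / f"{uid}.{feat_ext}").is_file():
            problems.append(f"{uid}: missing feature file")

    # Assumption: both files should list the same IDs in the same order.
    if fileids != trans_ids:
        problems.append("utterance IDs in .fileids and .transcription differ")

    return problems
```

Calling `check_training_set("/root/project/SphinxTrain/scripts_pl/Test", "Test")` on a consistent directory should return an empty list. Note that this sketch strips whitespace before comparing, so it will not catch the parsing warning itself.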
In the script Test/scripts_pl/00.verify/verify_all.pl, the regular expression that parses the lines of the .fileids file is on line 195. It requires a string of non-whitespace characters, followed by one whitespace character (<space>, \r, \n, \t, or \f), followed by zero or more other characters. I therefore conclude that the line in your Test.fileids file is missing its end-of-line character after "test0001". That's the only thing I can think of that would cause the parse of that line to fail. I hope that this is your problem!
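To illustrate the failure mode, here is a small Python sketch. The pattern is reconstructed from the description above and is only an assumption; the actual expression in verify_all.pl may differ. The helper names (`parses`, `ensure_trailing_newline`) are hypothetical.

```python
import re

# Reconstructed from the description: non-whitespace characters, then exactly
# one whitespace character, then zero or more other characters.
# This is an assumption, not a copy of the expression in verify_all.pl.
FILEID_LINE = re.compile(r"(\S+)[ \r\n\t\f](.*)")


def parses(line):
    """True if `line` matches the reconstructed pattern from its start."""
    return FILEID_LINE.match(line) is not None


def ensure_trailing_newline(path):
    """Append a newline to a non-empty file if its last byte is not one.

    Returns True if the file was modified.
    """
    with open(path, "rb") as f:
        data = f.read()
    if not data or data.endswith(b"\n"):
        return False
    with open(path, "ab") as f:
        f.write(b"\n")
    return True


if __name__ == "__main__":
    print(parses("test0001\n"))  # True: the newline supplies the whitespace
    print(parses("test0001"))    # False: nothing follows the ID
```

A file whose only line has no terminating newline looks fine in most editors, which is why it is easy to miss. Running `ensure_trailing_newline("etc/Test.fileids")` would add the missing character if that is indeed the cause.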
Let me also comment that the example you have shown is much too little data for training of any kind of acoustic model. I believe that some of the later steps (such as the Baum-Welch computations in step 02) are likely to fail because there's so little data, and the results will be meaningless. If you intend to use this example in order to verify the process of running scripts, OK, but please understand that much more data is needed to train a meaningful acoustic model.
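To get a feel for how much audio a training set actually contains, the total duration can be estimated from the sizes of the .raw files. The sketch below is only an estimate under stated assumptions: headerless PCM, 16-bit samples, mono, 16 kHz. The thread does not say which format sox was asked to write, so adjust the parameters to match your data. The function name `raw_audio_hours` is hypothetical.

```python
from pathlib import Path


def raw_audio_hours(wav_dir, ext="raw", sample_rate=16000,
                    bytes_per_sample=2, channels=1):
    """Estimate hours of audio in headerless PCM files under `wav_dir`.

    The defaults (16 kHz, 16-bit, mono) are assumptions, not values
    taken from the thread.
    """
    total_bytes = sum(p.stat().st_size for p in Path(wav_dir).glob(f"*.{ext}"))
    seconds = total_bytes / (sample_rate * bytes_per_sample * channels)
    return seconds / 3600.0
```

Under those assumptions one second of speech occupies 32000 bytes, so a single short utterance amounts to a tiny fraction of an hour.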
cheers,
jerry
Anonymous
-
2004-02-16
Hi Jerry,
First of all, thanks for your reply and guidelines regarding my problems.
I checked test0001.fileids and there is no empty line or empty space after test0001. I also don't know why I get this error.
But I managed to run all the script files. I am now trying to prepare more data for the training process. Anyway, can I use less than 8 hours of training data? I ask because, within the scope of my research, I don't have 8 hours of data.
Thanks.
Regards, Chee Leong
Anonymous
-
2004-02-20
Someone on this forum has proposed a figure of 8 hours of speech as a minimum amount of training data. If the training data is unrelated to your application and you wish to make an acoustic model that can generalize to unseen contexts, then I agree that about 8 hours may be the minimum needed (and more would be better).
But if your training data is closely related to your application, and even contains all of the same words and word-pairs as your application, then I believe that you may be able to use much less than 8 hours of training data. You would be training some triphones well and others not at all; that may be OK if your application is limited to the well-trained triphones only.
How much less? I cannot say, as I have no experience with doing that.
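One rough way to see which triphones a small training set covers is to expand the transcripts into phones and count each (left, center, right) context. The Python sketch below is a simplification, not what SphinxTrain does internally: it treats every utterance as bounded by a silence phone, ignores word-position distinctions, and all function names are hypothetical. The lexicon in the usage note is illustrative only and is not the poster's Test.dic.

```python
from collections import Counter


def utterance_triphones(words, lexicon, boundary="SIL"):
    """List the (left, center, right) phone triples for one utterance."""
    phones = [boundary]
    for w in words:
        phones.extend(lexicon[w])
    phones.append(boundary)
    # Every phone except the two boundary markers is a triphone center.
    return [tuple(phones[i - 1:i + 2]) for i in range(1, len(phones) - 1)]


def triphone_counts(utterances, lexicon):
    """Count how often each triphone occurs in the training utterances."""
    counts = Counter()
    for words in utterances:
        counts.update(utterance_triphones(words, lexicon))
    return counts


def coverage(app_utterances, train_counts, lexicon, min_count=1):
    """Fraction of application triphones seen at least `min_count` times
    in training, plus the sorted list of triphones that fall short."""
    needed = set()
    for words in app_utterances:
        needed.update(utterance_triphones(words, lexicon))
    if not needed:
        return 1.0, []
    covered = {t for t in needed if train_counts[t] >= min_count}
    return len(covered) / len(needed), sorted(needed - covered)
```

For example, with an illustrative lexicon `{"EIGHT": ["EY", "T"], "ENTER": ["EH", "N", "T", "ER"]}` and the single training utterance `["ENTER", "EIGHT"]`, an application that only ever says "ENTER EIGHT" is fully covered, while "EIGHT ENTER" needs cross-word triphones that were never seen. Raising `min_count` gives a stricter notion of "well-trained".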
cheers,
jerry