The tutorial says that one hour of recordings is needed for single-speaker command and control.
So suppose we can record 40 commands in 1 minute, and these 40 commands are all we need. Our system is a command-and-control system for a single user, so we should repeat each command 60 times to get one hour of recordings.
For example, I should record myself saying "go forward" or "how are you", etc., 60 times each.
Am I right?
You are correct.
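For anyone checking the numbers, here is a minimal sketch of the arithmetic. The counts come from the question above; the per-utterance duration is only derived from "40 commands in 1 minute" and is not a measured value.

```python
# Figures taken from the question above; nothing here is measured.
COMMANDS = 40          # distinct commands, all spoken within one minute
TARGET_MINUTES = 60    # goal: one hour of audio in total

seconds_per_utterance = 60 / COMMANDS        # average length of one command
repetitions = TARGET_MINUTES                 # one pass over all commands per minute
total_utterances = COMMANDS * repetitions    # number of wav files to record
total_seconds = total_utterances * seconds_per_utterance

print(total_utterances, "recordings,", total_seconds / 3600, "hour(s)")
```

So 60 repetitions of each of the 40 commands gives 2400 wav files, and each of them needs its own line in both the .fileids and the .transcription file.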
Thank you very much, Nikolay!
What should be the structure of the "train.transcription" file?
For example, if wav files 1 to 60 are recordings of "hello world", would the train.transcription file look as below?
Last edit: rezaee 2016-10-22
You need a space after <s> and before </s>.
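To illustrate the spacing, here is a small sketch of a helper that formats one transcription line. The function name `transcription_line` is made up for this example, and the file name "1" simply follows the numbering used in the question.

```python
def transcription_line(text, file_id):
    """Format one line as '<s> words </s> (file_id)' with the required spaces."""
    words = " ".join(text.split())   # collapse stray whitespace inside the text
    return f"<s> {words} </s> ({file_id})"

print(transcription_line("hello world", "1"))
# prints: <s> hello world </s> (1)
```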
What should the structure of the train.fileids file be?
struct1:
1
2
3
or
struct2:
1 (60 times)
2 (60 times)
3 (60 times)
I think I read somewhere in the tutorial that the number of lines in the transcription and fileids files should be equal. So should I repeat each number (the name of the recorded wav file) 60 times, as in structure 2?
Last edit: rezaee 2016-10-23
Could you please answer the last question?
It would be useful if you reviewed the tutorial once again: http://cmusphinx.sourceforge.net/wiki/tutorialam#data_preparation
Each line in ".fileids" should have a corresponding line in ".transcription".
The transcription file should have a text line followed by the file name inside parentheses. The exact structure depends on how you name your files.
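The line-by-line correspondence can be checked with a short script before training. This is only a sketch: `check_alignment` is a hypothetical helper, not part of the CMUSphinx tools, and it assumes the name in parentheses is the last path component of the matching .fileids entry.

```python
import posixpath
import re

# Matches "<s> some words </s> (file_id)"; the spaces around the words are required.
LINE_RE = re.compile(r"^<s> (.+) </s> \((\S+)\)$")

def check_alignment(fileids, transcriptions):
    """Return a list of problems; an empty list means the two files line up."""
    problems = []
    if len(fileids) != len(transcriptions):
        problems.append(
            f"line count differs: {len(fileids)} vs {len(transcriptions)}"
        )
    pairs = zip(fileids, transcriptions)
    for n, (fid, line) in enumerate(pairs, start=1):
        match = LINE_RE.match(line.strip())
        if match is None:
            problems.append(f"line {n}: malformed transcription")
        elif match.group(2) != posixpath.basename(fid.strip()):
            problems.append(
                f"line {n}: {match.group(2)!r} does not match {fid.strip()!r}"
            )
    return problems

print(check_alignment(["1", "2"],
                      ["<s> go forward </s> (1)", "<s> go forward </s> (2)"]))
# prints: []
```

To run it on real files, read each file with `open(path).read().splitlines()` and pass the two lists in.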
Thank you!
Suppose I have 2 commands and 10 recorded files for each command (1 to 10 for command 1 and 11 to 20 for command 2).
Should the structure of the .fileids and .transcription files be as below?
.fileids:
.transcription:
Yes, the example you provided seems OK.
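The example from the question is not shown here, so the following is only a sketch of how such a pair of lists could be generated for 2 commands with 10 recordings each. The command texts are borrowed from earlier in the thread, the wav files are assumed to be named 1 to 20, and `build_lists` is a hypothetical helper.

```python
# Assumed command texts; files 1-10 hold the first, 11-20 the second.
COMMANDS = ["go forward", "how are you"]
RECORDINGS_PER_COMMAND = 10

def build_lists(commands, per_command):
    """Return (.fileids lines, .transcription lines), one entry per recording."""
    fileids, transcriptions = [], []
    file_number = 1
    for text in commands:
        for _ in range(per_command):
            fileids.append(str(file_number))
            transcriptions.append(f"<s> {text} </s> ({file_number})")
            file_number += 1
    return fileids, transcriptions

fileids, transcriptions = build_lists(COMMANDS, RECORDINGS_PER_COMMAND)
print(fileids[0], "|", transcriptions[0])
print(fileids[10], "|", transcriptions[10])
# prints:
# 1 | <s> go forward </s> (1)
# 11 | <s> how are you </s> (11)
```

Each file name appears exactly once in .fileids, and the same command text is repeated on every .transcription line that belongs to a recording of that command.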