I have the following sentences used for training the acoustic model
I repeated every word about 10 times on average, and the sentences were recorded by 4 people.
I get a 3% word error rate and a 10% sentence error rate.
Is that sufficient, or should I record more audio?
Can I send you the results so you can take an overall look at them (I mean the model
parameters, the model architecture, the wav files, and all the database files)?
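For reference, the two figures quoted above can be sketched as follows. This is a minimal illustration, not the scoring tool used in the thread: word error rate is taken as the word-level edit distance divided by the number of reference words, and sentence error rate as the fraction of sentences with at least one error. The function names and the example sentences are hypothetical.

```python
def edit_distance(ref_words, hyp_words):
    """Levenshtein distance between two word lists
    (substitutions, insertions and deletions each cost 1)."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def error_rates(pairs):
    """Return (word error rate, sentence error rate) for a list of
    (reference, hypothesis) transcript pairs."""
    total_words = 0
    word_errors = 0
    sentence_errors = 0
    for reference, hypothesis in pairs:
        ref_words = reference.split()
        hyp_words = hypothesis.split()
        distance = edit_distance(ref_words, hyp_words)
        word_errors += distance
        total_words += len(ref_words)
        if distance > 0:
            sentence_errors += 1
    return word_errors / total_words, sentence_errors / len(pairs)


# Hypothetical transcripts, only to show the call.
example_pairs = [
    ("open the door", "open the door"),
    ("close the window", "close a window"),
]
wer, ser = error_rates(example_pairs)
print(f"WER {wer:.1%}, SER {ser:.1%}")
```

Because one wrong word makes a whole sentence count as wrong, the sentence error rate is normally several times the word error rate, which is consistent with the 3% and 10% figures.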
Sufficient for what? Your requirements are not clear.
Before sending something to look at, you need to describe the problem.
I mean I have 5 speakers and each speaker recorded all 160 sentences; the
error rate then became 3% for words and 10% for sentences.
But the documentation says that we should record 20 hours. Do you think that
this is sufficient?
It seems you didn't follow the instructions about the test set, and you are
testing on the same audio you trained on. If you test properly, the error rate
will be higher.
The documentation is correct, if that is what you are asking.
Sufficiency is defined only by your requirements, which I do not know.
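One way to avoid testing on the training audio is to hold out all recordings of one speaker as the test set. The sketch below assumes this speaker-held-out scheme, which the reply does not prescribe; the dictionary keys, speaker IDs, and function name are hypothetical, and the counts (5 speakers, 160 sentences each) come from the post above.

```python
def split_by_speaker(utterances, test_speakers):
    """Split utterances so that no test speaker appears in training."""
    train, test = [], []
    for utt in utterances:
        if utt["speaker"] in test_speakers:
            test.append(utt)
        else:
            train.append(utt)
    return train, test


# Hypothetical index of the recordings: 5 speakers x 160 sentences.
speakers = ["spk1", "spk2", "spk3", "spk4", "spk5"]
utterances = [
    {"speaker": spk, "sentence_id": sent}
    for spk in speakers
    for sent in range(160)
]

# Hold out one whole speaker for testing.
train, test = split_by_speaker(utterances, test_speakers={"spk5"})
print(len(train), "training utterances,", len(test), "test utterances")
```

With this split the model is scored on a voice it has never heard, which is closer to the multi-speaker use case. The test sentences are still the same 160 sentences used in training, so the measured error rate may still be optimistic.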
My requirement is this: I have a graduation project on speech recognition and I
want to support an 86-word vocabulary for multiple speakers. I notice that the rm1
database has about 1200 words with only 1600 sentences and only 80 speakers, which
does not match the documentation. Can you explain why they did that?
Can I get samples of the rm1 wav files to see what recording quality
they used?
In 1989, when RM1 was designed, it was not easy to record more than they
did. RM1 is not sufficient for real dictation, though.
The quality is similar to that of the an4 database.