Hello,
I'm preparing data for an acoustic model to recognize the speech of kids reading primary school stories. My question is about this sentence in the tutorial:
"the database should have recording of enough speakers, a variety of recording conditions, enough acoustic variations and all possible linguistic sentences"
the database should have recording of enough speakers:
How many speakers are enough? We thought that 10 hours of speech would be enough for us, but we can't decide how to distribute it across speakers. Is it okay to have many speakers read the same sentences, or is it better to find a few speakers and have them read many sentences?
a variety of recording conditions, enough acoustic variations:
What's the difference between them? I understand these are about recording inside a studio, outdoors, in a noisy environment, etc. Do I have that right?
all possible linguistic sentences:
Doesn't that mean infinitely many sentences? How can we cover all the possible sentences in a language?
thank you
burak
The tutorial says at least 200 speakers.
Modern databases are 50+ hours, ideally 500+ hours
You understand correctly
No, it does not mean an infinite number of sentences. You can download any example database, such as LibriSpeech, and follow its example.
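To put the figures in this thread side by side, here is a rough sketch of how much each speaker would record if the total is split evenly. The function name `minutes_per_speaker` is made up for illustration, and the even split is only an assumption; the 10, 50 and 500 hour totals and the 200 speaker count are the numbers mentioned above.

```python
def minutes_per_speaker(total_hours: float, num_speakers: int) -> float:
    """Average recording time per speaker, in minutes, assuming an even split."""
    if num_speakers <= 0:
        raise ValueError("num_speakers must be positive")
    return total_hours * 60.0 / num_speakers


if __name__ == "__main__":
    # Totals discussed in the thread: the planned 10 hours, and the
    # 50+ / 500+ hour sizes mentioned for modern databases.
    for hours in (10, 50, 500):
        per_speaker = minutes_per_speaker(hours, 200)
        print(f"{hours} h over 200 speakers: {per_speaker:.1f} min each")
```

With 10 hours and 200 speakers this works out to about 3 minutes per speaker, which may help when deciding how many sentences each child should read.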