Hi
I am trying to build an acoustic model for large vocabulary corpus (for Arabic
language )
Corpus description :
3000 hr of audio file (wav file 16000)
73,000 words (22,000 unique words)
Number of speakers 150 speakers
Each speaker speaks the 22,000 word (or so)
Scenario 1) The system will be trained for some reader,. Then, the MLLR
adaptation algorithm is applied to the remaining readers
Scenario 2) to train the system with all readers at once
Which one is better
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
You try to compare both 1) and 2) on smaller database (100 hours) and report
the results here. Then you apply the best of 1) and 2) to your large database.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Hi
I am trying to build an acoustic model for large vocabulary corpus (for Arabic
language )
Corpus description :
3000 hr of audio file (wav file 16000)
73,000 words (22,000 unique words)
Number of speakers 150 speakers
Each speaker speaks the 22,000 word (or so)
Scenario 1) The system will be trained for some reader,. Then, the MLLR
adaptation algorithm is applied to the remaining readers
Scenario 2) to train the system with all readers at once
Which one is better
Best scenario is 3)
You try to compare both 1) and 2) on smaller database (100 hours) and report
the results here. Then you apply the best of 1) and 2) to your large database.
thank you Nickolay
i will do, and i will let you know