Hi,
I am very new, so please bear with me. I have gone through the tutorial, and as a newbie I sort of get the idea, but I don't know exactly where to begin. I have tried the video tutorials too, but most of them are either broken or only cover English. I want to create a model for my own language. I have read that I need about 20 hours of recordings. But how exactly do I use these recordings, and what exactly is the method itself? Should the recordings be compared with English words?
I have spent weeks trying to get this working, so I hope someone can provide a link or some tips on how exactly to begin with a new language. Or is there a sub-topic or forum where these things can be discussed? My eventual goal is to create an Android app.
Regards,
Pannam
Last edit: pannam 2016-05-10
This is pretty straightforward: you just need to follow the documentation step by step. You also need some knowledge of a scripting language, which will help you cut down the manual work in some steps.
Read the CMUSphinx Tutorial introduction to become familiar with the concepts of speech recognition: features, acoustic models, language models, etc.
Try CMUSphinx with the US English model to understand how things work. Try training with the sample US English AN4 database.
Read about your language on Wikipedia.
Collect a set of transcribed recordings for your language: interviews, audiobooks, or recordings you make yourself.
Based on the data you collected, create a list of words and a phonetic dictionary. Most phonetic dictionaries can be created from simple rules with a small script in your favorite scripting language, such as Python. See Generating a dictionary for details.
Segment the audio into short sentences, either manually or with the sphinx4 aligner, and create a database with the required files as described in the training tutorial, Training Acoustic Model For CMUSphinx.
Integrate the new model into your application and design a data collection process to improve your model.
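To make the dictionary and database steps more concrete, here is a rough Python sketch. It is only an illustration under assumptions: the letter-to-phone rules in G2P_RULES and the sample words are made up and must be replaced with the real rules for your language, and the function names are hypothetical. The output lines follow the layout I understand the training tutorial to expect (dictionary lines as "word PHONE PHONE ...", a fileids list of paths without extension, and transcription lines as "<s> text </s> (file_id)"), so check them against the tutorial before training.

```python
# Hypothetical spelling-to-phone rules for an imaginary language.
# Replace them with the real rules for your own language.
G2P_RULES = {
    "sh": "SH", "ch": "CH",
    "a": "AA", "e": "EH", "i": "IY", "o": "OW", "u": "UW",
    "h": "HH", "k": "K", "l": "L", "m": "M",
    "n": "N", "p": "P", "s": "S", "t": "T",
}


def word_to_phones(word, rules=G2P_RULES):
    """Convert a spelled word to phones using greedy longest-match rules."""
    max_len = max(len(grapheme) for grapheme in rules)
    phones = []
    i = 0
    while i < len(word):
        # Try the longest grapheme first, so "sh" wins over "s" + "h".
        for size in range(min(max_len, len(word) - i), 0, -1):
            chunk = word[i:i + size]
            if chunk in rules:
                phones.append(rules[chunk])
                i += size
                break
        else:
            raise ValueError(f"no rule for {word[i]!r} in {word!r}")
    return phones


def build_dictionary(transcripts, rules=G2P_RULES):
    """Return sorted dictionary lines for every word in the transcripts."""
    words = sorted({w for line in transcripts for w in line.lower().split()})
    return [f"{w} {' '.join(word_to_phones(w, rules))}" for w in words]


def training_lists(utterances):
    """Build fileids and transcription lines from (file_id, text) pairs.

    file_id is the audio path without extension, e.g. "speaker_1/file_1".
    """
    fileids = [file_id for file_id, _ in utterances]
    transcription = [
        f"<s> {text.lower()} </s> ({file_id.split('/')[-1]})"
        for file_id, text in utterances
    ]
    return fileids, transcription


if __name__ == "__main__":
    utterances = [
        ("speaker_1/file_1", "Shanti mana"),
        ("speaker_1/file_2", "mana tika"),
    ]
    for line in build_dictionary(text for _, text in utterances):
        print(line)
    fileids, transcription = training_lists(utterances)
    print("\n".join(fileids))
    print("\n".join(transcription))
```

A rule table like this only works well for languages with fairly regular spelling. Words that break the rules can be added to the dictionary by hand afterwards.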
If you have questions, feel free to ask.