
What exactly to begin with for a new language

pannam
2016-05-10
  • pannam

    pannam - 2016-05-10

    Hi,
    I am very new, so please bear with me. I have gone through the tutorial and, as a newbie, I sort of get the idea, but I don't know exactly where to begin. I tried the other video tutorials, but most of them are either broken or only cover English. I want to create a model for my own language. I have read that I need about 20 hours of recordings. But how exactly do I use these recordings, and what is the method itself? Should the recordings be compared with English words?
    It's been weeks since I started trying to get this working, so I hope someone can provide a link or some tips on how exactly to begin with a new language, or point me to a subtopic or forum where these things can be discussed. My eventual goal is to create an Android app.
    Regards,
    Pannam

     

    Last edit: pannam 2016-05-10
    • Nickolay V. Shmyrev

      This is pretty straightforward: you just need to follow the documentation and it will get you there. You also need to know a scripting language, which will help you cut down the manual work on some steps.

      • Read the CMUSphinx Tutorial introduction to become familiar with the concepts of speech recognition: features, acoustic models, language models, etc.
      • Try CMUSphinx with the US English model to understand how things work. Try training with the sample US English AN4 database.
      • Read about your language on Wikipedia.
      • Collect a set of transcribed recordings for your language: interviews, audiobooks, or recordings you make yourself.
      • Based on the data you collected, create a list of words and a phonetic dictionary. Most phonetic dictionaries can be created from simple rules with a small script in your favorite scripting language, such as Python. See Generating a dictionary for details.
      • Segment the audio into short sentences, manually or with the sphinx4 aligner, and create a database with the required files as described in the training tutorial, Training Acoustic Model For CMUSphinx.
      • Integrate the new model into your application and design a data collection process to improve your model.
        If you have questions, feel free to ask.
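
The word list and dictionary step above could look roughly like the sketch below. It assumes a language with fairly regular spelling; the rule table, the phone names and the sample sentences are all made up for illustration and have to be replaced with the real letter-to-sound rules of your language. The output is one word per line followed by its phones separated by spaces.

```python
import re

# Hypothetical letter-to-phone rules for an imaginary language with regular
# spelling. These are NOT taken from any real language or from CMUSphinx;
# replace them with the rules of your own language.
G2P_RULES = {
    "sh": "SH",
    "ch": "CH",
    "a": "AA",
    "e": "EH",
    "i": "IY",
    "o": "OW",
    "u": "UW",
    "k": "K",
    "l": "L",
    "m": "M",
    "n": "N",
    "s": "S",
    "t": "T",
}


def collect_words(transcripts):
    """Return the sorted list of unique lowercase words in the transcripts."""
    words = set()
    for line in transcripts:
        # [^\W\d_]+ matches runs of letters, including non-ASCII letters.
        words.update(re.findall(r"[^\W\d_]+", line.lower()))
    return sorted(words)


def to_phones(word, rules=G2P_RULES):
    """Convert one word to a list of phones, trying longer spellings first."""
    graphemes = sorted(rules, key=len, reverse=True)
    phones = []
    i = 0
    while i < len(word):
        for g in graphemes:
            if word.startswith(g, i):
                phones.append(rules[g])
                i += len(g)
                break
        else:
            # No rule matched: better to fail loudly than to emit a bad entry.
            raise ValueError(f"no rule for {word[i]!r} in {word!r}")
    return phones


def build_dictionary(transcripts):
    """Build dictionary lines of the form 'word PH1 PH2 ...'."""
    return [f"{w} {' '.join(to_phones(w))}" for w in collect_words(transcripts)]


if __name__ == "__main__":
    # Made-up sample transcripts, for illustration only.
    sample = ["Shona mata ke", "mata chelo"]
    for entry in build_dictionary(sample):
        print(entry)
```

Words that the rules cannot handle raise an error instead of being skipped, so you can collect them and add their pronunciations by hand.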
       
