Resampling to properly train acoustic model

Ernest
2018-11-10
2018-11-12
  • Ernest

    Ernest - 2018-11-10

    Dear All,

    PROBLEM:
    I want to train an acoustic model for the Polish language, and for that I first have to collect recordings.
    By the way, I have successfully trained a model to recognize my own speech. Since I primarily wanted to use it to recognize speech over a telephone, I used recordings with an 8 kHz sample rate.

    QUESTION:
    Can I collect recordings at a 16 kHz sample rate, so that I can use them to train an acoustic model for desktop applications AND, after downsampling to 8 kHz (e.g. using sox or Audacity), also use them to train an acoustic model for telephone speech recognition?
    I simply want to avoid collecting recordings twice, once for each sample rate.

    REMARKS:
    Here https://cmusphinx.github.io/wiki/tutorialam/#data-preparation I've found this: "Please note that you cannot upsample your audio, that means you can not train 16 kHz model with 8 kHz data." Does it mean that I can do downsampling?
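
As an illustration of what downsampling from 16 kHz to 8 kHz involves: the signal is first low-pass filtered below the new Nyquist frequency (4 kHz) so that higher frequencies do not alias, and then every second sample is kept. The sketch below does this in plain Python on a list of float samples. The function names, the 63-tap filter length, and the 3.6 kHz cutoff are assumptions chosen for illustration, not what sox or Audacity do internally; in practice a dedicated resampler is the better tool.

```python
import math


def lowpass_taps(cutoff_hz, sample_rate_hz, num_taps=63):
    """Design a Hamming-windowed sinc low-pass FIR filter with unit DC gain."""
    fc = cutoff_hz / sample_rate_hz  # normalized cutoff in cycles per sample
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            sinc = 2 * fc
        else:
            sinc = math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(sinc * window)
    total = sum(taps)
    return [t / total for t in taps]


def downsample_by_2(samples, sample_rate_hz=16000):
    """Low-pass below the new Nyquist frequency, then keep every second sample."""
    new_nyquist_hz = sample_rate_hz / 4  # half of the halved sample rate
    taps = lowpass_taps(0.9 * new_nyquist_hz, sample_rate_hz)  # assumed cutoff
    half = len(taps) // 2
    out = []
    # Only the even-indexed outputs are computed, since the rest are discarded.
    for i in range(0, len(samples), 2):
        acc = 0.0
        for k, t in enumerate(taps):
            j = i + k - half
            if 0 <= j < len(samples):
                acc += t * samples[j]
        out.append(acc)
    return out


if __name__ == "__main__":
    # A 1 kHz tone survives; a 6 kHz tone (above 4 kHz) is strongly attenuated.
    for freq in (1000, 6000):
        tone = [math.sin(2 * math.pi * freq * n / 16000) for n in range(1600)]
        y = downsample_by_2(tone)
        core = y[100:700]  # skip the filter's edge effects
        rms = math.sqrt(sum(v * v for v in core) / len(core))
        print(freq, "Hz ->", len(y), "samples, RMS", round(rms, 4))
```

The upsampling direction cannot be recovered this way: content above 4 kHz that was never captured in an 8 kHz recording is simply absent.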

    Thank you in advance,
    Ernest

     

    Last edit: Ernest 2018-11-10
    • Nickolay V. Shmyrev

      Telephone audio is usually quite different from wideband audio because of the different codecs and corruptions involved. That is why a telephone model has to be trained on telephone data. Downsampled wideband data can be used for bootstrapping or for an initial model, but it results in low quality.
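
One of the codec effects mentioned above can be illustrated with mu-law companding, the kind of logarithmic 8-bit quantization that G.711 telephone codecs are based on. The sketch below is a simplified, continuous mu-law round trip (not the bit-exact G.711 codec), and the function names are hypothetical. It only imitates quantization noise; it does not reproduce band-limiting, packet loss, or handset effects, so it is no substitute for real telephone recordings.

```python
import math

MU = 255.0  # companding constant used by mu-law


def mulaw_compress(x):
    """Map a sample in [-1, 1] to [-1, 1] on a logarithmic scale."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mulaw_expand(y):
    """Inverse of mulaw_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def simulate_8bit_mulaw(samples):
    """Compress each sample, quantize it to 256 levels, and expand it again."""
    out = []
    for x in samples:
        x = max(-1.0, min(1.0, x))  # clip to the valid range
        y = mulaw_compress(x)
        code = round((y + 1.0) / 2.0 * 255)  # integer code in 0..255
        y_hat = code / 255 * 2.0 - 1.0
        out.append(mulaw_expand(y_hat))
    return out


if __name__ == "__main__":
    clean = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(80)]
    noisy = simulate_8bit_mulaw(clean)
    worst = max(abs(a - b) for a, b in zip(clean, noisy))
    print("largest round-trip error:", worst)
```

Because the quantization steps are finer near zero, quiet samples are distorted less in absolute terms than loud ones.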

       
      • Ernest

        Ernest - 2018-11-10

        Hello Nickolay,

        thank you for your prompt reply!

        I would be grateful for further suggestions from an expert like you on how I can proceed then:

        1. I have already developed a website (https://naukait.com:7171/) to collect recordings. Somehow I skipped this part of the tutorial: "if you are going to recognize telephone speech it is preferred to use telephone recordings." Can't I still make recordings with my website at a 16 kHz sampling rate, then downsample them to 8 kHz and use some "trick" to make the audio similar to that obtained over a telephone?

        2. Am I right that my 16 kHz recordings should be fine for desktop apps?

        3. What is bootstrapping, and what is an initial model? Are these models that then have to be improved further? If so, how can this be done?

        REMARKS:
        My recording website asks users to record specific sentences. They come from research (not mine), from the so-called "CORPORA dictionary". These specific sentences are supposed to improve the training of an acoustic model for the Polish language.

        Best regards,
        Ernest

         

        Last edit: Ernest 2018-11-10
        • Nickolay V. Shmyrev

          I would be grateful for further suggestions from an expert like you on how I can proceed then:

          To give you a suggestion, I need to understand the goal of your development and your resources.

          Overall, recording specific data is not reasonable these days; you can just get thousands of hours of speech from external sources, not necessarily transcribed. You can check https://github.com/jimregan/wolnelektury-audio-corpus for example.

           
          • Ernest

            Ernest - 2018-11-12

            Hello Nickolay,

            thank you for your valuable remark and the link.
            My initial goal was to develop a telephone service that provides public transportation schedules. I already have a prototype based on Asterisk and UniMRCP with my own acoustic model for Polish, but it can only recognize my own speech. I haven't managed to find a free acoustic model for Polish, so I decided to create my own. That is why I developed this website: to collect more recordings in order to build a complete acoustic model for Polish. I also decided that it would be great if I could use these recordings to create an acoustic model for desktop apps as well. So I thought it might be better to collect recordings at a 16 kHz sample rate, which I can then downsample.

            Best regards,
            Ernest

             
