I am a beginner, so pardon me if this has been discussed here umpteen times. I will be glad to be directed to appropriate threads or sites.
I want to implement a forced phoneme alignment system using Sphinx. I have audio of English speech and the corresponding text. All I want is the timing for the phonemes.
I don't understand how to begin. Will I need to train models, or can I use the existing ones?
Thanks a lot
UP