
pocket sphinx phoneme recognition / alignment for separation

Help
Andy Hu
2016-10-08
2016-10-09
  • Andy Hu

    Andy Hu - 2016-10-08

    Hello,
    I have been using pocketsphinx to do phoneme recognition as documented here:
    http://cmusphinx.sourceforge.net/wiki/phonemerecognition
    By adding the -time argument, I can get the timing of each phoneme, which I use to segment the source file into tiny chunks.
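For readers following along, the segmentation step described above could be sketched roughly as follows. This is only an illustration, not the script actually used in this thread: it assumes the timing output has been saved as text with one row per phoneme in the form `<phone> <start_sec> <end_sec> [score]` (the exact layout may differ between pocketsphinx versions), and the helper names `parse_timings` and `cut_segments` are hypothetical.

```python
import wave


def parse_timings(text):
    """Parse rows of the ASSUMED form '<phone> <start_sec> <end_sec> [score]'."""
    segments = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # blank lines or rows without timing columns
        try:
            start, end = float(parts[1]), float(parts[2])
        except ValueError:
            continue  # not a timing row (e.g. log output)
        if end > start:
            segments.append((parts[0], start, end))
    return segments


def cut_segments(wav_path, segments, out_prefix):
    """Write one WAV file per (phone, start, end) segment; return the paths."""
    paths = []
    with wave.open(wav_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        total = src.getnframes()
        for i, (phone, start, end) in enumerate(segments):
            # Clamp the start position so setpos never goes past the end.
            src.setpos(min(int(start * rate), total))
            frames = src.readframes(int((end - start) * rate))
            path = f"{out_prefix}_{i:04d}_{phone}.wav"
            with wave.open(path, "wb") as dst:
                dst.setparams(params)  # frame count is corrected on close
                dst.writeframes(frames)
            paths.append(path)
    return paths
```

Naming each chunk after its phoneme label makes it easy to sort the resulting files by phoneme afterwards.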

    The software has been easy to set up and use. However, as mentioned on that page, the accuracy of running phoneme recognition directly is not very good. My goal is to build a collection of short sound files sorted by phoneme.

    If I have a transcription for the source audio, is there some way to do "alignment" on the audio to get better segmentation? The audio files are only 5 to 20 s long.

    Another thread mentions that you can get alignment in pocketsphinx by simply using the transcription as the grammar: https://sourceforge.net/p/cmusphinx/discussion/help/thread/dd998add/

    How would I build such a language model for allphone/phonemes? From what I understand, the allphone search only takes ngram as the model? Should I replace everything in my transcription with phonemes and feed it to srilm ngram-count to use as the language model?

    Or is there some better way to get phoneme timings through alignment in pocketsphinx?

    Thank you very much!

    Also, I think that this page has a typo near the bottom? feed this text file into [strilm] srilm
    http://cmusphinx.sourceforge.net/wiki/phonemerecognition

     
    • Nickolay V. Shmyrev

      How would I build such a language model for allphone/phonemes? From what I understand, the allphone search only takes ngram as the model?

      You can still use a grammar together with a dictionary of single-phone words.
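One way to read this suggestion is sketched below, as an illustration only and not an official recipe: each phone becomes a dictionary "word" whose pronunciation is that same phone, and the expected phone sequence for the utterance becomes a one-line JSGF rule. The function name `build_alignment_files` and the grammar name `align` are hypothetical, and the phone labels are assumed to match the phone set of the acoustic model in use.

```python
def build_alignment_files(phones, grammar_name="align"):
    """Return (dictionary_text, jsgf_text) for a fixed phone sequence.

    ASSUMPTION: every label in `phones` is a valid phone name for the
    acoustic model, so the dictionary entry '<phone> <phone>' is legal.
    """
    # One dictionary line per distinct phone: word == pronunciation.
    dict_lines = [f"{p} {p}" for p in sorted(set(phones))]
    # A single public rule that accepts exactly the reference sequence.
    jsgf = (
        "#JSGF V1.0;\n"
        f"grammar {grammar_name};\n"
        f"public <utt> = {' '.join(phones)};\n"
    )
    return "\n".join(dict_lines) + "\n", jsgf


# Example: phones for the word "hello", taken from a pronunciation dictionary.
dictionary_text, jsgf_text = build_alignment_files(["HH", "AH", "L", "OW"])
```

The two strings would then be written to files and passed to the decoder as the dictionary and the grammar; with timing output enabled, each recognized "word" is a single phone, so the word timings are the phone timings.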

      There is also the ps_alignment API; you can find an example of it in the tests. However, it requires a very exact match between the reference phoneme string and the actual audio content.

      Also, I think that this page has a typo near the bottom? feed this text file into [strilm] srilm
      http://cmusphinx.sourceforge.net/wiki/phonemerecognition

      Thank you for pointing it out; it is fixed now.

       
