I'm using pocketsphinx to generate automated transcriptions of audio files, and so far I'm getting on great: I get output for each word in my audio file, just a timestamped begin/end transcription.
My question is that I would also like the phonetic pronunciation included in the transcription. I can see there is a Python wrapper for the https://github.com/cmusphinx/cmudict dictionary at https://github.com/cmusphinx/cmudict-tools
Does pocketsphinx have something like this wrapper built in somewhere? If not, is there either a simple way to implement it myself or another library out there that I can modify?
My current code that gets my transcription is (with my XML file encoding excluded as it's not relevant)
Last edit: Benjamin Gorman 2015-08-07
I came up with a solution to my problem.
My repository at https://github.com/benjgorman/ofxAutomatedCaptions shows how I solved it, for future reference.
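For anyone landing here later, one simple way to do this kind of lookup is to load a CMUdict-style dictionary file into a map and look up each recognised word. The sketch below is only an illustration and is not necessarily what the repository above does. The function names `parse_dict` and `add_phonemes`, the `(word, start, end)` tuple shape, and the sample entries and timings are all made up for the example. It assumes the usual dictionary layout of one `word PH1 PH2 ...` entry per line, `;;;` comment lines, and alternate pronunciations marked with a suffix such as `(2)`.

```python
import re

# Matches the "(2)"-style suffix that marks an alternate pronunciation.
_VARIANT = re.compile(r"\(\d+\)$")


def parse_dict(lines):
    """Build {entry: [phoneme, ...]} from CMUdict-style lines (assumed format)."""
    pron = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith(";;;"):  # skip blanks and comments
            continue
        head, *phones = line.split()
        pron[head.lower()] = phones
    return pron


def add_phonemes(segments, pron):
    """Attach phonemes to hypothetical (word, start, end) word segments."""
    out = []
    for word, start, end in segments:
        key = word.lower()
        # Try the exact entry first, then fall back to the base word.
        # Words not in the dictionary (e.g. silence markers) get an empty list.
        phones = pron.get(key) or pron.get(_VARIANT.sub("", key), [])
        out.append({"word": word, "start": start, "end": end, "phones": phones})
    return out


# Made-up sample data, standing in for a real dictionary file and decoder output.
sample_dict = [
    ";;; sample entries",
    "hello HH AH L OW",
    "hello(2) HH EH L OW",
    "world W ER L D",
]
sample_segments = [("hello(2)", 0.10, 0.52), ("world", 0.60, 1.05), ("<sil>", 1.05, 1.30)]

for entry in add_phonemes(sample_segments, parse_dict(sample_dict)):
    print(entry["start"], entry["end"], entry["word"], " ".join(entry["phones"]))
```

With a real dictionary you would pass the open file object to `parse_dict` instead of the sample list, since it only needs an iterable of lines.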