The problem I need to solve is simpler than full speech recognition. I have a
transcript of the audio, but need to sync it and derive the timing
information. Can pocketsphinx be used for this? What about if the transcript
is mostly correct, but the audio sometimes deviates?
Yes, this issue has been discussed on this forum many times; you can search
the forum for details. You need to build a language model from your reference
transcription and just decode using it. A more advanced technique is to build
a finite-state machine from the reference.
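As a rough illustration of the finite-state idea, here is a minimal sketch that turns a reference transcript into a strictly linear JSGF grammar, a grammar format that pocketsphinx can load. The helper name transcript_to_jsgf, the grammar name "align" and the text normalization (lowercasing, dropping punctuation) are assumptions made for this example, not something prescribed in this thread.

```python
import re


def transcript_to_jsgf(transcript, grammar_name="align"):
    """Build a linear JSGF grammar that accepts exactly the transcript's words.

    Hypothetical helper for illustration only. The normalization below
    (lowercase; keep letters, digits and apostrophes) is an assumption and
    must match the spelling used in your pronunciation dictionary.
    """
    words = re.findall(r"[a-z0-9']+", transcript.lower())
    if not words:
        raise ValueError("transcript contains no words")
    # One public rule listing the words in order: a single-path state machine.
    return (
        "#JSGF V1.0;\n"
        "grammar %s;\n"
        "public <utterance> = %s ;\n" % (grammar_name, " ".join(words))
    )


if __name__ == "__main__":
    print(transcript_to_jsgf("Hello, world! It's me."))
```

The resulting text can be saved to a file and given to the decoder as its grammar; how that is done depends on your pocketsphinx version. Every word has to be present in the pronunciation dictionary. A strictly linear grammar like this one leaves no room for audio that deviates from the transcript, which is where the looser language-model approach is the safer choice.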
http://arxiv.org/abs/cs/0612139
http://www.computer.org/portal/web/csdl/doi/10.1109/ICASSP.2009.4960722
It's always so; I don't see how it should be different.