Hello,
I was wondering whether it is possible to adapt pocketsphinx for speaker-dependent voice recognition: it should recognize words or phrases that were spoken before. Since I'm not familiar with it, could someone please tell me whether this is possible?
Thank you
Hello
There is no such functionality in the pocketsphinx API.
What you can do is use the sphinxbase library to extract MFCC coefficients
first; see the sphinx_fe source for an example of how to do that.
Then you can apply the dynamic time warping (DTW) algorithm to compare the
original recording to the new one. A DTW implementation is very simple,
just about 50 lines of code:
http://en.wikipedia.org/wiki/Dynamic_time_warping
There are a few libraries that implement DTW as well; you can find the
links on the Wikipedia page.
It would be great to see a pocketsphinx patch demonstrating a DTW
implementation.
See also
https://sourceforge.net/p/cmusphinx/discussion/sphinx4/thread/c6f3f2f3/
The DTW method is remarkably effective at finding matches.
A sample implementation is at http://pejaver.com/Temp/dtw.c
Convert a few phrases into .mfc files using sphinx_fe and feed them to the above program.
It works well for the same speaker.
Can you suggest a way to further improve the accuracy?
I tried matching velocity and acceleration values as well (delta and double-delta coefficients, 39 columns in total).
It took longer and the computer got quite warm, but the accuracy did not improve.
Last edit: Rajaram Pejaver 2012-12-18
Thank you for the quick reply, and for the informative thread. I'll look them up.
Ray, can you explain further what you are trying to do? I think I am doing something similar but am using a different approach. It is more complicated.
I am breaking up the phrase into phonemes (using a phoneme dictionary), adding the new phrase to a dictionary, and adjusting the LM. I am currently stuck on the last step.
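For reference, the dictionary part is just plain-text entries mapping each word to its phones; the entries below are illustrative examples using CMU-style phones, not taken from any particular dictionary file:

```
TURN   T ER N
ON     AA N
LIGHT  L AY T
```

If the set of phrases is small and fixed, one possible way around adjusting the statistical LM is a finite grammar (pocketsphinx accepts JSGF grammars), which constrains recognition to exactly those phrases.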
Rajaram, you are welcome to provide more details about where you are stuck so that we can help.