Pocketsphinx and Phoneme extraction

2014-05-17
2014-05-30
  • David Boccabella

    Hi. Please be gentle as I am a programmer but C is not my strong point :) I can work with Python though.

    I have recently downloaded Jasper (a Pocketsphinx-based application) onto my RPi and it may be a solution for a long-term project I am working on. If someone can help me then I am willing to remunerate them.

    I do special effects as a hobby with an interest in animatronics and full body costume work. I will use a werewolf costume for my example.

    The costume has been built so that the head is more akin to a wolf's head than, say, a Lon Chaney style head (what I call a Were-Pug). The muzzle has several servos that can deform the lips and also move the tongue.

    In general movie production these servo movements would be preprogrammed so that the character could talk and seem to enunciate the words. However...

    If the character was interacting ad-hoc with the public then that would not be possible.

    So.. using a microphone the actor inside the suit can talk. As they talk an application listens and then extracts the phonemes from the stream. These phonemes are then sent into a translation table which converts them into servo positioning co-ordinates and then to the servos.
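
    The translation table described above could be sketched in Python roughly as follows. This is only an illustration: the servo names, the viseme groups, and every position value are made-up assumptions, and the phone labels are the CMU-style symbols used in the dictionary file.

```python
# Hypothetical viseme table: each mouth shape maps servo names to positions
# between 0.0 (relaxed) and 1.0 (full travel). All names and values are
# invented for illustration and would need tuning on the real costume.
VISEMES = {
    "rest":   {"jaw": 0.0, "lip_purse": 0.0, "lip_corners": 0.0},
    "pursed": {"jaw": 0.1, "lip_purse": 0.9, "lip_corners": 0.0},
    "open":   {"jaw": 0.6, "lip_purse": 0.0, "lip_corners": 0.2},
    "wide":   {"jaw": 0.3, "lip_purse": 0.0, "lip_corners": 0.8},
    "closed": {"jaw": 0.0, "lip_purse": 0.2, "lip_corners": 0.0},
}

# Many phones share one mouth shape, so the table is deliberately coarse.
# This grouping is an assumption, not a phonetic reference.
PHONE_TO_VISEME = {
    "W": "pursed", "UW": "pursed", "OW": "pursed",
    "AA": "open", "AH": "open", "AO": "open", "EH": "open",
    "L": "wide", "IY": "wide",
    "B": "closed", "M": "closed", "P": "closed",
    "SIL": "rest",
}


def servo_targets(phone):
    """Return a copy of the servo positions for one phone label.

    Unknown phones fall back to the resting mouth shape.
    """
    viseme = PHONE_TO_VISEME.get(phone.upper(), "rest")
    return dict(VISEMES[viseme])


def phones_to_frames(phones):
    """Translate a sequence of phone labels into a list of servo targets."""
    return [servo_targets(p) for p in phones]
```

    Each dictionary returned by `servo_targets` would then be handed to whatever code drives the servos, which is not shown here.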

    The effect being that when the character talks his lips give an approximation of the right kind of movements to pronounce the words.

    Accuracy is not paramount here.. I am not looking to fine tune it to the point where one can lip-read. I just want to get a general look to add realism to the character, and something more than just the jaw opening and closing.

    I just need a stream of phonemes coming from the application. I have looked at the file dictionery.dic so I can see how the words are broken up.

    My previous work on this used SimpleCV to detect blobs placed around the actor's lips. Although the concept worked, fitting a camera plus make-up for the actor proved to be unreliable. With the new version a microphone takes no room at all.

    Many thanks for anyone who has reached this point and can give me some further advice.

    Take Care
    Dave

     
  • Nickolay V. Shmyrev

    Hello Dave

    This seems to be a cool challenge; however, there is no ready-to-use solution in pocketsphinx.

    The simplest phoneme decoder can be built with pocketsphinx if your grammar consists of single-phone words. You can find details on the grammar here:

    http://cmusphinx.sourceforge.net/wiki/tutoriallm

    So you need a grammar

    public <speech> = (AH | AA | AO | B | CH | ....)*;
    

    and a dictionary

    AH AH
    AA AA
    AO AO
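
    Since the dictionary simply repeats each phone as its own "word", both files can be generated from one phone list. The following Python sketch assumes the phone list is completed by hand (only the phones named above are included), adds the standard JSGF header lines that the snippet above leaves out, and uses hypothetical file names.

```python
# Partial phone list taken from the example above; it must be extended
# with the remaining phones of the acoustic model being used.
PHONES = ["AH", "AA", "AO", "B", "CH"]


def build_grammar(phones, name="phones"):
    """Return JSGF text that accepts any sequence of the given phones."""
    alternatives = " | ".join(phones)
    return (
        "#JSGF V1.0;\n"
        f"grammar {name};\n"
        f"public <speech> = ({alternatives})*;\n"
    )


def build_dictionary(phones):
    """Return dictionary text where each phone is a word pronounced as itself."""
    return "".join(f"{p} {p}\n" for p in phones)


def write_files(phones, grammar_path="phones.gram", dict_path="phones.dic"):
    """Write both files; the default paths are hypothetical names."""
    with open(grammar_path, "w") as f:
        f.write(build_grammar(phones))
    with open(dict_path, "w") as f:
        f.write(build_dictionary(phones))
```

    Calling `write_files(PHONES)` produces the two files to pass to the decoder as its grammar and dictionary.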
    

    However, such a decoder wouldn't give you good accuracy. A more accurate decoder is available in sphinx3 but has not been ported to pocketsphinx yet:

    http://cmusphinx.sourceforge.net/wiki/phonemerecognition

    There are more serious problems here. It's almost impossible to recognize phonemes in real time. You will always have a recognition delay of at least 0.1 second, so you will probably need to delay the actor's speech too.
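
    One way to delay the speech is a fixed-length queue of audio chunks, so that playback lags the microphone by roughly the recognition delay. This Python sketch assumes the actor's voice is replayed through a speaker in fixed-size chunks; the class name and parameters are hypothetical, and the actual audio capture and playback are not shown.

```python
from collections import deque


class DelayLine:
    """Hold audio chunks back by a fixed delay before releasing them."""

    def __init__(self, delay_seconds, chunk_seconds, silence=b""):
        # Number of whole chunks that approximates the requested delay.
        self.delay_chunks = max(0, round(delay_seconds / chunk_seconds))
        self.buffer = deque()
        # Returned while the line is still filling up.
        self.silence = silence

    def push(self, chunk):
        """Store a new chunk and return the chunk that is now due for playback."""
        self.buffer.append(chunk)
        if len(self.buffer) > self.delay_chunks:
            return self.buffer.popleft()
        return self.silence
```

    With 50 ms chunks and a 0.1 second delay, each chunk comes back out two pushes after it went in, while the undelayed chunks go straight to the recognizer.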

     
  • David Boccabella

    Hello Nickolay.
    Many thanks for your reply. A 100 ms delay is not that bad. I am not expecting the character to be speaking rapidly, plus as I could not get the RPi to do much on blob recognition I ended up with an ODROID-XU (quad Cortex-A15).

    For example, if the character says "Well", the response I'd like to get is
    1. Pursing of the lips (W)
    2. Partial opening of mouth (E)
    3. Edges of mouth further drawn back (LL)

    As mentioned accuracy right now is not essential. Something approximate will do.

    On the grammar: I think I understand what you are saying. Essentially, don't worry about the actual words but rather build a language of only the phones I am interested in.

    One last possibility (that I'd prefer not to use) is to have a laptop nearby running Sphinx3 and sending the phone feed back to the system in the costume. But if for the time being that will work then that's what I'll try :)

    Looking forward to your suggestions and comments
    Dave

     
  • Nickolay V. Shmyrev

    I am not expecting the character to be speaking rapidly, plus as I could not get the RPi to do much on blob recognition I ended up with an ODROID-XU (quad Cortex-A15)

    The Raspberry Pi is OK for phoneme recognition; it's not an expensive process.

    One last possibility (that I'd prefer not to use) is to have a laptop nearby running Sphinx3 and sending the phone feed back to the system in the costume. But if for the time being that will work then that's what I'll try :)

    It's easier to port the sphinx3 phonetic decoder to pocketsphinx.

     
  • David Boccabella

    Many thanks for the information Nickolay.

    As mentioned above C is not my strong point so I'll prob have to find someone who is willing to take commissions for programming jobs.

    Not sure how difficult it is to port the Sphinx3 phonetic decoder to pocketsphinx.
    Many thanks
    Dave

     
    • Nickolay V. Shmyrev

      As mentioned above C is not my strong point so I'll prob have to find someone who is willing to take commissions for programming jobs.

      OK, great. Let me know if you need help on this. We were willing to port this part from sphinx3 but there are not enough incentives.

      Not sure how difficult it is to port the Sphinx3 phonetic decoder to pocketsphinx.

      It's pretty straightforward.

       
      • Casey Basichis

        Casey Basichis - 2014-05-30

        Hi, I would also be willing to contribute towards a commission for phoneme alignment as I require it for my project as well.

         
        • Nickolay V. Shmyrev

          Dear Casey

          That would be great. Let me know how I can contact you.

           
  • David Boccabella

    Hello Nickolay
    I am having trouble sending you a message. SourceForge keeps thinking I am not logged in, and when I try to log in it tells me I am already logged in.
    I'd like to chat about modifying Pocketsphinx and running it on a Raspberry Pi.
    Can you let me know how I can contribute, and any other details?
    Many thanks
    Dave

     
    • Nickolay V. Shmyrev

      Hi David, SourceForge sometimes has issues.

      In case you need me you can find me in these places:

      gmail: nshmyrev@gmail.com
      skype: nv_shmyrev
      irc: #cmusphinx irc channel on freenode.net

       
