
Sphinx technology for toys.

2014-07-12
2014-07-13
  • Mukund Ghosh

    Mukund Ghosh - 2014-07-12

    Is Sphinx technology (especially Pocketsphinx) suitable for deployment in toys? I've read somewhere that speech recognition in general doesn't work very well for kids, and that special techniques like VTLN (vocal tract length normalization) and others are required to achieve reasonable accuracy.

    Thanks.

     
  • Nickolay V. Shmyrev

    Is Sphinx technology (especially Pocketsphinx) suitable for deployment in toys?

    Yes

    I've read somewhere that speech recognition in general doesn't work very well for kids, and that special techniques like VTLN and others are required to achieve reasonable accuracy.

    This is correct, but it doesn't mean that you can't implement the required features.

     
  • Mukund Ghosh

    Mukund Ghosh - 2014-07-12

    Is it too bad without VTLN (on kids)? Is there a plan to support VTLN in Pocketsphinx? Will existing models be usable with VTLN?

    Thanks.

     
    • Nickolay V. Shmyrev

      Is it too bad without VTLN (on kids)?

      Why don't you try it first and ask if you run into issues?

      Is there a plan to support VTLN in Pocketsphinx?

      No.

      Will existing models be usable with VTLN?

      It's not clear what you mean by "usable".

       
  • Mukund Ghosh

    Mukund Ghosh - 2014-07-13

    Well, the issue is that I have some familiarity with Pocketsphinx/Sprec. I know a group that is planning to use speech recognition in toys and has asked me for advice. They are looking for something compact, since it will be an embedded application, and PS seems to fit them well. They want to do some top-level timeline estimation, so we have to see what exists and what needs to be developed.
    In light of this, my questions boil down to:

    a) Can PS be used "as is", with an existing acoustic model, and give reasonable accuracy with kids?
    b) Would an acoustic model trained on kids help over an existing model (say en-us)?
    c) Would VTLN be absolutely necessary to do something reasonably demoable? If yes, is it possible to use one of the existing models (say en-us) with VTLN applied only in decoding, or do new models need to be trained (VTLN used in both training and decoding)?

    We'll of course test it rigorously later, but for now we just need some ballpark answers for planning.

    Thanks.

     
  • Nickolay V. Shmyrev

    a) Can PS be used "as is", with an existing acoustic model, and give reasonable accuracy with kids?

    No

    b) Would an acoustic model trained on kids help over an existing model (say en-us)?

    Yes

    c) Would VTLN be absolutely necessary to do something reasonably demoable?

    No

    Is it possible to use one of the existing models (say en-us) with VTLN applied only in decoding, or do new models need to be trained (VTLN used in both training and decoding)?

    You need new models for children.
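
    For context on question (c): VTLN rescales the frequency axis of the front end by a per-speaker factor before features are computed, which is why it matters whether the acoustic model was trained with the same warping. The following is only a minimal sketch of a piecewise-linear warp, not Pocketsphinx code. The function name `warp_frequency`, the `cutoff_ratio` default, and the 8000 Hz Nyquist frequency (16 kHz audio) are assumptions made for illustration.

```python
def warp_frequency(freq_hz, alpha, nyquist_hz=8000.0, cutoff_ratio=0.85):
    """Piecewise-linear VTLN warp (illustrative, hypothetical helper).

    Frequencies below a breakpoint are scaled by `alpha`; above it, a second
    linear segment maps the rest of the band so that Nyquist stays at Nyquist.
    """
    if alpha <= 0.0:
        raise ValueError("alpha must be positive")
    if not 0.0 <= freq_hz <= nyquist_hz:
        raise ValueError("freq_hz must lie in [0, nyquist_hz]")

    # Pick the breakpoint so that alpha * breakpoint never exceeds
    # cutoff_ratio * nyquist_hz, which keeps the warp monotonic.
    breakpoint_hz = cutoff_ratio * nyquist_hz / max(1.0, alpha)

    if freq_hz <= breakpoint_hz:
        return alpha * freq_hz

    # Upper segment: straight line from (breakpoint, alpha * breakpoint)
    # to (nyquist, nyquist).
    warped_break = alpha * breakpoint_hz
    slope = (nyquist_hz - warped_break) / (nyquist_hz - breakpoint_hz)
    return warped_break + slope * (freq_hz - breakpoint_hz)


if __name__ == "__main__":
    # alpha < 1 compresses the axis, alpha > 1 stretches it.
    for alpha in (0.9, 1.0, 1.1):
        print(alpha, [round(warp_frequency(f, alpha), 1)
                      for f in (0.0, 1000.0, 4000.0, 8000.0)])
```

    In a typical setup the warp factor is picked per speaker from a small grid of candidates. Whether that is needed at all for a given toy application is exactly what the testing suggested above would show.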

     
