Out Of vocabulary Words


Forum: Speech Recognition Theory

Creator: Madhav Kishore

Created: 2010-03-24

Updated: 2012-09-22

Madhav Kishore - 2010-03-24

Hi,
Are there any methods defined in Sphinx for recognizing OOV (out-of-vocabulary) words?

Nickolay V. Shmyrev - 2010-03-25

The common approach to the OOV problem is to build the dictionary from
smaller units, such as word fragments and phones, that can be combined to
form almost any word. The setup for decoding with word fragments is no
different from the usual setup: it consists of a language model and a
dictionary. For CMU Sphinx decoders it doesn't matter whether the language
model is subword-based or word-based.
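
As a rough illustration of the idea (not something CMU Sphinx ships), here is a minimal Python sketch that rewrites training text so that words missing from the vocabulary are replaced by fragments. The fragment inventory, the greedy longest-match splitting, and the trailing "+" continuation marker are all assumptions made for this example; a real setup would choose fragments with a dedicated tool and would also need a dictionary pronunciation for every fragment.

```python
from typing import List, Set


def segment(word: str, fragments: Set[str]) -> List[str]:
    """Greedy longest-match split of a word into known fragments.

    Single characters act as a fallback so that any word can be covered.
    """
    pieces = []
    i = 0
    while i < len(word):
        # Try the longest candidate first, shrinking down to one character.
        for j in range(len(word), i, -1):
            if word[i:j] in fragments or j == i + 1:
                pieces.append(word[i:j])
                i = j
                break
    return pieces


def to_subword_tokens(sentence: str, vocabulary: Set[str],
                      fragments: Set[str]) -> List[str]:
    """Keep in-vocabulary words whole and split the rest into fragments."""
    tokens = []
    for word in sentence.split():
        if word in vocabulary:
            tokens.append(word)
        else:
            pieces = segment(word, fragments)
            # "+" marks a fragment that continues into the next token
            # (a hypothetical convention chosen for this sketch).
            tokens.extend(piece + "+" for piece in pieces[:-1])
            tokens.append(pieces[-1])
    return tokens


def join_tokens(tokens: List[str]) -> List[str]:
    """Rebuild words from decoder output that uses the "+" marker."""
    words, current = [], ""
    for token in tokens:
        if token.endswith("+"):
            current += token[:-1]
        else:
            words.append(current + token)
            current = ""
    if current:
        words.append(current)
    return words


# Hypothetical vocabulary and fragment inventory.
vocabulary = {"call", "doctor"}
fragments = {"shmy", "rev"}

tokens = to_subword_tokens("call doctor shmyrev", vocabulary, fragments)
print(tokens)
print(join_tokens(tokens))
```

Text converted this way could be used to train the language model, and the decoder output can be joined back into words with join_tokens.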

Building a subword language model requires specialized software. CMU Sphinx
doesn't provide tools for that yet. One frequently used free tool is
Sequitur-G2P.
