Best way of dealing with fillers in LM



Forum: Help

Creator: osman b

Created: 2009-01-15

Updated: 2012-09-22

osman b - 2009-01-15

Hi,

I would like to ask about the best way of dealing with filler models in the LM.

I have trained filler models for sounds such as DTMF, various noises (cough, breath, etc.), and sentence-begin and sentence-end silences. Now I would like to know the best way of incorporating these models into a bigram language model.

I train the language model with the CMU SLM toolkit. This toolkit has an option to define "context cues" when preparing the LM. Should the filler dictionary entries be defined as context cues when training the LM? Or is there a better way of dealing with filler words in the LM, such as leaving them out of the LM vocabulary?

Thank you very much

- Nickolay V. Shmyrev - 2009-01-17
  
  Fillers are inserted automatically by the decoder after each word, so the language model shouldn't include them at all.
  
  The only thing you need to care about in the language model is phrase boundaries, and those are completely different from fillers.
  
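In practice, the advice above amounts to removing filler tokens from the training transcripts before building the LM, while keeping the phrase boundary markers. Below is a minimal sketch of that preprocessing step. The filler names in `FILLERS` and the helper names `strip_fillers` and `clean_corpus` are hypothetical and only for illustration; substitute the entries from your own filler dictionary. It also assumes whitespace-separated transcripts that use `<s>` and `</s>` as boundary markers.

```python
# Hypothetical filler tokens; replace with the entries of your own filler dictionary.
FILLERS = {"<sil>", "++COUGH++", "++BREATH++", "++DTMF++"}

# Phrase boundary markers stay in the LM training text (assumed convention).
BOUNDARIES = ("<s>", "</s>")


def strip_fillers(line, fillers=FILLERS):
    """Drop filler tokens from one transcript line; everything else is kept."""
    return " ".join(tok for tok in line.split() if tok not in fillers)


def clean_corpus(lines, fillers=FILLERS):
    """Strip fillers from every line and skip lines left with no real words."""
    cleaned = []
    for line in lines:
        text = strip_fillers(line, fillers)
        # Keep the line only if something other than boundary markers remains.
        if any(tok not in BOUNDARIES for tok in text.split()):
            cleaned.append(text)
    return cleaned


if __name__ == "__main__":
    corpus = [
        "<s> ++BREATH++ </s>",
        "<s> call ++DTMF++ home </s>",
    ]
    for sentence in clean_corpus(corpus):
        print(sentence)
```

The cleaned text can then be passed to the LM toolkit as usual, with only the boundary markers treated specially.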
