Filler dictionary

Speech Recognition Toolkit

Brought to you by: air, arthchan2003, awb, bhiksha, and 5 others

Forum: Help

Creator: Ognjen

Created: 2015-11-24

Updated: 2015-11-26

Ognjen - 2015-11-24

I noticed that the filler dictionary in the English PTM from February doesn't contain

[BREATH]
[COUGH]
[NOISE]
[SMACK]
[UH]
[UM]

"words", only [NOISE] and [SPEECH].

I also noticed that some of the speech fillers falsely trigger recognition of words the recognizer is listening for. I was wondering:
1. what is the significance of [NOISE] and [SPEECH]?
2. when doing adaptation, what should be used to transcribe fillers (e.g. breath, cough, lip smack, etc.)? I'm guessing it should be [NOISE] for all of those, and [SPEECH] for background speech, correct?

- Nickolay V. Shmyrev - 2015-11-26
  
  Right, this is done to reduce model complexity.
  
  All non-speech noises such as breath, cough, and lip smack should go to [NOISE]. All speech fillers such as uh and um should go to [SPEECH].
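
For anyone preparing adaptation transcripts that still use the older tags, the mapping described in the reply could be applied with a small script. This is only a sketch: the `FILLER_MAP` table and the `normalize_fillers` helper are hypothetical names, the tag list is the one from the question, and the example line assumes the usual `<s> ... </s> (utterance_id)` transcription layout.

```python
import re

# Old filler tag -> tag present in the newer filler dictionary.
# The mapping follows the reply above; the table itself is illustrative.
FILLER_MAP = {
    "[BREATH]": "[NOISE]",
    "[COUGH]": "[NOISE]",
    "[NOISE]": "[NOISE]",
    "[SMACK]": "[NOISE]",
    "[UH]": "[SPEECH]",
    "[UM]": "[SPEECH]",
}

# Matches bracketed tags such as [BREATH] or [um].
_TAG = re.compile(r"\[[A-Za-z_]+\]")


def normalize_fillers(line: str) -> str:
    """Replace known filler tags in one transcript line.

    Tags are matched case-insensitively; tags that are not in
    FILLER_MAP are left untouched so they can be reviewed by hand.
    """
    def _swap(match: re.Match) -> str:
        tag = match.group(0)
        return FILLER_MAP.get(tag.upper(), tag)

    return _TAG.sub(_swap, line)


if __name__ == "__main__":
    example = "<s> hello [BREATH] world [um] </s> (utt_001)"
    print(normalize_fillers(example))
```

Unknown tags are deliberately passed through rather than forced into [NOISE], since whether a given sound counts as speech or non-speech is a judgment call for the transcriber.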