Dictionary ambiguities when training an acoustic model?

Speech Recognition Toolkit




Forum: Help

Creator: Daniel Wolf

Created: 2020-11-09

Updated: 2020-11-09

Daniel Wolf - 2020-11-09

I'm thinking of training a custom acoustic model. The dictionary contains words with multiple pronunciations, like this:

either a ɪ ð ə
either(2) i ð ə

Now let's suppose one of the training samples is the phrase "You say either [i ð ə] and I say either [a ɪ ð ə]". What should the transcript file contain? Is the trainer smart enough to determine the correct pronunciation from context, so that the transcript can be "<s> you say either and i say either </s>"? Or do I need to give it the exact word alternatives, like this: "<s> you say either(2) and i say either </s>"?
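
To make the ambiguity concrete, here is a minimal Python sketch that groups dictionary entries by base word and lists which transcript tokens have more than one pronunciation available. It is only an illustration of the question, not anything the trainer does: the function names `load_variants` and `ambiguous_tokens` are made up for this example, and it assumes the dictionary is plain text with one entry per line, where alternatives are written as `word(N)`.

```python
import re
from collections import defaultdict

# Splits a token such as "either(2)" into its base word ("either")
# and an optional variant index ("2"). Plain words have no index.
VARIANT_RE = re.compile(r"^(?P<base>.+?)(?:\((?P<index>\d+)\))?$")


def load_variants(dictionary_text):
    """Group dictionary entries by base word (hypothetical helper).

    Returns {base_word: [(entry_name, [phones...]), ...]}.
    """
    variants = defaultdict(list)
    for line in dictionary_text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        entry, phones = parts[0], parts[1:]
        base = VARIANT_RE.match(entry).group("base")
        variants[base].append((entry, phones))
    return dict(variants)


def ambiguous_tokens(transcript, variants):
    """Return the transcript tokens whose base word has several pronunciations."""
    found = []
    for token in transcript.split():
        base = VARIANT_RE.match(token).group("base")
        if len(variants.get(base, [])) > 1:
            found.append(token)
    return found


if __name__ == "__main__":
    dictionary_text = "either a ɪ ð ə\neither(2) i ð ə\n"
    variants = load_variants(dictionary_text)
    for transcript in (
        "<s> you say either and i say either </s>",
        "<s> you say either(2) and i say either </s>",
    ):
        print(transcript, "->", ambiguous_tokens(transcript, variants))
```

In both transcript forms the word "either" maps to two dictionary entries. The only difference is whether the transcript itself names the variant, which is exactly what I'm unsure about.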

Last edit: Daniel Wolf 2020-11-09