As far as I know, pronunciation modeling in Sphinx can only be done using a
lookup table (dictionary). Is there another way, such as writing the
grapheme-to-phoneme rules in a function that the trainer or the decoder calls
to get the corresponding phonetic transcription?
Second question: in the dictionary, we can add more than one pronunciation
variant for any word. Is it possible to specify that one of the variants
is only allowed at the beginning of an utterance and another variant only in
the middle of the utterance?
Third question: is it possible to specify the probability of each
pronunciation variant, instead of having an equal probability for all the
variants of a given word?
> Is there another way, such as writing the grapheme-to-phoneme rules in a
> function that the trainer or the decoder calls to get the corresponding
> phonetic transcription?
This feature is not supported. It could be implemented quite easily; for
example, we have one implementation in the long audio aligner branch, but it
hasn't been merged into trunk yet.
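
Until something like that is available, one workaround is to keep the lookup table but generate it from rules offline. Below is a minimal sketch in Python, not the implementation from the aligner branch. The rule table, the phone names and the function names `g2p` and `make_dictionary` are all made up for illustration, and it assumes a language with regular spelling and the usual one-entry-per-line dictionary layout (word followed by its phones).

```python
# Hypothetical letter-to-phone rules for a toy language with regular spelling.
# Multi-letter graphemes are included so they can be matched before single letters.
RULES = {
    "sh": "SH", "ch": "CH",
    "a": "AA", "e": "EH", "i": "IY", "o": "OW", "u": "UW",
    "b": "B", "d": "D", "k": "K", "l": "L",
    "m": "M", "n": "N", "s": "S", "t": "T",
}


def g2p(word):
    """Return the list of phones for `word` using greedy longest-match rules."""
    graphemes = sorted(RULES, key=len, reverse=True)  # longest graphemes first
    lowered = word.lower()
    phones = []
    i = 0
    while i < len(lowered):
        for grapheme in graphemes:
            if lowered.startswith(grapheme, i):
                phones.append(RULES[grapheme])
                i += len(grapheme)
                break
        else:
            # No rule matched at this position: fail loudly instead of guessing.
            raise ValueError(f"no rule for {lowered[i]!r} in {word!r}")
    return phones


def make_dictionary(words):
    """Build dictionary lines of the form 'word PHONE PHONE ...', sorted by word."""
    return [f"{w} {' '.join(g2p(w))}" for w in sorted(set(words))]


if __name__ == "__main__":
    for line in make_dictionary(["shut", "tusk", "chin"]):
        print(line)
```

The generated lines can be written to a file and passed to the trainer and the decoder as an ordinary dictionary, so neither of them needs to call the function at run time.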
> Second question: in the dictionary, we can add more than one pronunciation
> variant for any word. Is it possible to specify that one of the variants
> is only allowed at the beginning of an utterance and another variant only in
> the middle of the utterance?
I would create separate strings for such words, for example word_1 and
word_2, each with its own pronunciation. The language model trainer can then
take care of the position, since each string only appears in the training
text where its variant is allowed.
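
Here is a rough sketch of how the training text could be prepared for this, in Python. It assumes that the `_1` suffix marks the utterance-initial variant and `_2` the variant used everywhere else, and that each line of the training text is one utterance; the function names `tag_positions` and `untag_positions` are hypothetical. The dictionary would then need one entry for word_1 and one for word_2, each with its own pronunciation.

```python
def tag_positions(sentence, positional_words):
    """Rename position-dependent words: '_1' at utterance start, '_2' elsewhere."""
    tagged = []
    for index, token in enumerate(sentence.split()):
        if token in positional_words:
            token += "_1" if index == 0 else "_2"
        tagged.append(token)
    return " ".join(tagged)


def untag_positions(hypothesis, positional_words):
    """Strip the position suffix from a recognition result."""
    restored = []
    for token in hypothesis.split():
        base, sep, suffix = token.rpartition("_")
        # Only strip suffixes that tag_positions could have added.
        if sep and suffix in ("1", "2") and base in positional_words:
            token = base
        restored.append(token)
    return " ".join(restored)


if __name__ == "__main__":
    words = {"and"}  # hypothetical word with position-dependent pronunciation
    tagged = tag_positions("and you and me", words)
    print(tagged)
    print(untag_positions(tagged, words))
```

Because the language model is trained on the tagged text, word_1 is only ever seen right after the sentence start marker, so the decoder should strongly prefer it there. The suffixes are stripped again from the recognition result.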
> Third question: is it possible to specify the probability of each
> pronunciation variant?
This feature is not supported.
I have another question. How can I specify in the dictionary that a specific
word is not pronounced at all? In some languages, there are words that are
written in the text but skipped when reading aloud, so these words are
only written, never pronounced. Should we add these words to the dictionary
without any phonemes?
> I have another question. How can I specify in the dictionary that a specific
> word is not pronounced at all? In some languages, there are words that are
> written in the text but skipped when reading aloud, so these words are
> only written, never pronounced. Should we add these words to the dictionary
> without any phonemes?
You'd better remove those words from the language model training texts and
insert them automatically into the recognition result. Alternatively, you can
join each such word to the next word. A word without a pronunciation will
cause trouble for the recognizer.
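
The joining variant could look roughly like the Python sketch below. The `_` joiner, the placeholder word `zz` and the function names `join_silent` and `split_joined` are assumptions made for the example. Each joined token gets a dictionary entry whose pronunciation is that of the spoken word alone, and the joiner must not occur in ordinary words.

```python
def join_silent(sentence, silent_words, joiner="_"):
    """Attach each unpronounced word to the word that follows it."""
    out = []
    pending = []  # silent words waiting for a following spoken word
    for token in sentence.split():
        if token in silent_words:
            pending.append(token)
        else:
            out.append(joiner.join(pending + [token]))
            pending = []
    if pending and out:
        # Trailing silent words have no next word; attach them to the previous token.
        out[-1] = joiner.join([out[-1]] + pending)
    # A sentence made only of silent words yields an empty string.
    return " ".join(out)


def split_joined(hypothesis, joiner="_"):
    """Restore the written form of a recognition result."""
    return " ".join(hypothesis.replace(joiner, " ").split())


if __name__ == "__main__":
    silent = {"zz"}  # placeholder for a written-but-silent word
    joined = join_silent("zz red zz car", silent)
    print(joined)
    print(split_joined(joined))
```

The joined text is used for language model training, and the recognition result is split again afterwards, so the silent words reappear in the output without the recognizer ever needing a word with an empty pronunciation.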
Many thanks for your reply.