Mosur Ravishankar's thesis (on Sphinx II) describes
fanning out the right context (using parallel HMMs),
dynamic triphone mapping for the left context, and a combination
of the two for single-phone words.
Is triphone modelling implemented the same way in
Pocketsphinx for FSG-based decoding?
Does it use a flat lexicon or a tree-based lexicon in FSG mode?
Are there any accuracy benchmarks available for pocketsphinx
in FSG mode? If not, is there an available database or
set of guidelines that could be used to create one?
Thanks and regards,
Mosur Ravishankar's thesis (on Sphinx II) describes fanning out the
right context (using parallel HMMs), dynamic triphone mapping for the left
context, and a combination of the two for single-phone words. Is triphone
modelling implemented the same way in Pocketsphinx for FSG-based decoding?
It's a little bit simplified. Fan-out is used for multi-phone words.
Single-phone words are modelled by CI (context-independent) phones in pocketsphinx.
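To make the distinction concrete, here is a toy sketch of that expansion. It is not pocketsphinx code: the function name `expand_word`, the `base(left,right)` unit notation, and the idea of enumerating every candidate context at both word edges are assumptions made purely for illustration.

```python
def expand_word(phones, context_phones):
    """Return, for each phone position, the list of units that could model it.

    Hypothetical illustration only: the names and the unit notation are not
    taken from pocketsphinx.
    """
    if len(phones) == 1:
        # Single-phone word: fall back to the context-independent (CI) phone.
        return [[phones[0]]]

    last = len(phones) - 1
    units = []
    for i, base in enumerate(phones):
        if i == 0:
            # Word-initial phone: the left neighbour belongs to the previous
            # word, so one variant is kept per candidate left context.
            units.append([f"{base}({left},{phones[1]})" for left in context_phones])
        elif i == last:
            # Word-final phone: fan out over every candidate right context.
            units.append([f"{base}({phones[i - 1]},{right})" for right in context_phones])
        else:
            # Word-internal phone: both neighbours are known from the lexicon.
            units.append([f"{base}({phones[i - 1]},{phones[i + 1]})"])
    return units


if __name__ == "__main__":
    print(expand_word(["T", "UW"], ["SIL", "W"]))
    print(expand_word(["AH"], ["SIL", "W"]))
```

A multi-phone word gets several parallel variants at its edges, while a single-phone word collapses to one CI unit.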
Does it use a flat lexicon or a tree-based lexicon in FSG mode?
Tree.
Are there any accuracy benchmarks available for pocketsphinx in FSG
mode? If not, is there an available database or set of guidelines that
could be used to create one?
There are no real-life datasets freely available, so there are no benchmarks
either. You could run a benchmark on TIDIGITS, but I wouldn't recommend it: it
doesn't reflect real-life conditions. Maybe the Let's Go data comes closer, but I'm
not sure about its availability.
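If you do put together your own test set, accuracy is normally reported as word error rate (WER): the word-level edit distance between reference and hypothesis transcripts, divided by the number of reference words. Below is a minimal sketch of that calculation. The function names are illustrative and not part of pocketsphinx, and it assumes transcripts are already normalized and whitespace-tokenized.

```python
def edit_distance(ref_words, hyp_words):
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, 1):
        cur = [i]
        for j, hyp_word in enumerate(hyp_words, 1):
            cost = 0 if ref_word == hyp_word else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # match or substitution
        prev = cur
    return prev[-1]


def corpus_wer(pairs):
    """WER over (reference, hypothesis) string pairs, pooled across utterances."""
    errors = 0
    ref_total = 0
    for reference, hypothesis in pairs:
        ref_words = reference.split()
        errors += edit_distance(ref_words, hypothesis.split())
        ref_total += len(ref_words)
    if ref_total == 0:
        raise ValueError("reference transcripts contain no words")
    return errors / ref_total


if __name__ == "__main__":
    print(corpus_wer([("one two three four", "one too three")]))
```

Errors are pooled over the whole test set rather than averaged per utterance, so short utterances do not dominate the score.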
So fan-out is used for both left and right contexts? (That is, no dynamic triphone mapping for the left context as in Sphinx II.)
The Let's Go database seems to be a database of non-native English speakers.
Where can I download the original TIDIGITS database from?
Thanks and regards,
Yes, both left and right contexts are accounted for.
No, it's not. Make sure you are talking about the same database:
http://www.speech.cs.cmu.edu/letsgo
http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC93S10