Some questions about Pocketsphinx in FSG mode.

Speech Recognition Toolkit

Brought to you by: air, arthchan2003, awb, bhiksha, and 5 others

Forum: Help

Creator: creative64

Created: 2012-03-02

Updated: 2012-09-22

creative64 - 2012-03-02

Mosur Ravishankar in his thesis talks about (for Sphinx II)
fanning out of right context (by having parallel hmms),
dynamic triphone mapping for left context and combination
of these two for single phone words.

Is triphone modelling implemented the same way in
Pocketsphinx for FSG-based decoding?

Does it use a flat lexicon or a tree-based lexicon in FSG mode?

Are there any accuracy benchmarks available for Pocketsphinx
in FSG-based mode? If not, is there any available database or
set of guidelines that could be used to create one?

Thanks and regards,

Nickolay V. Shmyrev - 2012-03-02

Mosur Ravishankar in his thesis talks about (for Sphinx II) fanning out of
right context (by having parallel hmms), dynamic triphone mapping for left
context and combination of these two for single phone words. Is the triphone
modelling implemented the same way for Pocketsphinx for FSG based decoding ?

It's a little bit simplified. Fan-out is used for multiphone words.
Single-phone words are modelled by context-independent (CI) phones in
pocketsphinx.
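
To make the distinction concrete, here is a toy sketch of right-context fan-out as described above. It is not pocketsphinx code: the function name `final_phone_models`, the three-phone set, and the pronunciations are all made up for illustration. It only shows the idea that the last phone of a multiphone word gets one parallel model per possible right context, while a single-phone word falls back to its CI phone.

```python
def final_phone_models(pron, phone_set):
    """Return the models that would represent the last phone of a word.

    pron      -- list of phone names for one word (hypothetical toy data)
    phone_set -- all phones that could start the following word

    Multiphone word: one (left, base, right) triphone per right context.
    Single-phone word: just the context-independent phone.
    """
    if not pron:
        raise ValueError("empty pronunciation")
    if len(pron) == 1:
        # Single-phone word: no fan-out, use the CI phone.
        return [pron[0]]
    left, base = pron[-2], pron[-1]
    # Fan out: the right context is unknown until the next word is known,
    # so keep one parallel model per candidate right-context phone.
    return [(left, base, right) for right in phone_set]


TOY_PHONES = ["AH", "T", "S"]  # assumption: a tiny made-up phone set

print(final_phone_models(["K", "AE", "T"], TOY_PHONES))
print(final_phone_models(["AH"], TOY_PHONES))
```

With a realistic phone set the fan-out is roughly as wide as the number of phones, which is why decoders look for ways to share or limit these parallel models.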

Does it use a flat lexicon or a tree-based lexicon in FSG mode?

Tree
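
For readers unfamiliar with the term, a rough sketch of what a tree (prefix-shared) lexicon buys over a flat one. This is a generic illustration, not the pocketsphinx data structure; the three-word dictionary, the `#words` marker key, and the function names are assumptions made up for the example.

```python
def build_lexicon_tree(lexicon):
    """Build a phone prefix tree from {word: [phones]}.

    Words that share a leading phone sequence share the same nodes.
    The special key "#words" lists the words ending at a node.
    """
    root = {}
    for word, phones in lexicon.items():
        node = root
        for phone in phones:
            node = node.setdefault(phone, {})
        node.setdefault("#words", []).append(word)
    return root


def count_phone_nodes(node):
    """Count phone nodes in the tree, ignoring the "#words" markers."""
    return sum(
        1 + count_phone_nodes(child)
        for key, child in node.items()
        if key != "#words"
    )


# Toy dictionary (assumption: simplified pronunciations for illustration).
TOY_LEXICON = {
    "start": ["S", "T", "AA", "R", "T"],
    "star": ["S", "T", "AA", "R"],
    "stop": ["S", "T", "AA", "P"],
}

tree = build_lexicon_tree(TOY_LEXICON)
flat_nodes = sum(len(phones) for phones in TOY_LEXICON.values())
print("flat lexicon phone nodes:", flat_nodes)
print("tree lexicon phone nodes:", count_phone_nodes(tree))
```

In a flat lexicon every word carries its own chain of phone models, so the shared prefix is evaluated once per word; in the tree it is evaluated once in total.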

Are there any accuracy benchmarks available for Pocketsphinx in FSG-based
mode? If not, is there any available database or set of guidelines that
could be used to create one?

There are no real-life datasets freely available, so there are no benchmarks
either. You could benchmark on TIDIGITS, but I wouldn't recommend it: it
doesn't reflect real-life conditions. The Let's Go data may come closer, but
I'm not sure about its availability.
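
If you do put together your own test set, the usual figure to report is word error rate (WER). Below is a minimal sketch, assuming you already have reference transcripts and decoder hypotheses as plain strings; the function name `word_error_rate` and the sample sentences are made up for illustration and are not part of any Sphinx tool.

```python
def word_error_rate(reference, hypothesis):
    """Word error rate: word-level edit distance divided by reference length.

    Counts substitutions, deletions and insertions with equal cost 1.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # prev[j] = edit distance between the ref words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing in hypothesis
            insertion = cur[j - 1] + 1  # extra word in hypothesis
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)


# Made-up digit strings, in the spirit of a TIDIGITS-style test.
print(word_error_rate("one two three four", "one two four"))
```

For a whole test set, sum the edit distances and the reference lengths over all utterances before dividing, rather than averaging per-utterance rates.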

creative64 - 2012-03-03

It's a little bit simplified. Fan-out is used for multiphone words.

So fan-out is used for both left and right contexts? (No dynamic triphone mapping for the left phone, as in Sphinx II.)

The Let's Go database seems like a database for non-native English speakers.

Where can the original TIDIGITS database be downloaded from?

Thanks and regards,

Nickolay V. Shmyrev - 2012-03-03

So fan-out is used for both left and right contexts? (No dynamic triphone
mapping for the left phone, as in Sphinx II.)

Yes, both left and right contexts are accounted for.

The Let's Go database seems like a database for non-native English speakers.

No, it's not. Make sure you are talking about the same database:

http://www.speech.cs.cmu.edu/letsgo

Where can the original TIDIGITS database be downloaded from?

http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC93S10