
Some question about Pocketsphinx in FSG mode.

Help
creative64
2012-03-02
2012-09-22
  • creative64

    creative64 - 2012-03-02
    1. Mosur Ravishankar in his thesis describes (for Sphinx II)
      fanning out the right context (by having parallel HMMs),
      dynamic triphone mapping for the left context, and a combination
      of the two for single-phone words. Is triphone modelling
      implemented the same way in Pocketsphinx for FSG-based decoding?

    2. Does it use a flat lexicon or a tree-based lexicon in FSG mode?

    3. Are there any accuracy benchmarks available for Pocketsphinx
      in FSG-based mode? If not, is there an available database or
      set of guidelines that could be used to create one?

    Thanks and regards,

     
  • Nickolay V. Shmyrev

    1. Mosur Ravishankar in his thesis describes (for Sphinx II) fanning out the
      right context (by having parallel HMMs), dynamic triphone mapping for the left
      context, and a combination of the two for single-phone words. Is triphone
      modelling implemented the same way in Pocketsphinx for FSG-based decoding?

    It's a little bit simplified. Fan-out is used for multiphone words.
    Single-phone words are modelled by context-independent (CI) phones in
    pocketsphinx.

    2. Does it use a flat lexicon or a tree-based lexicon in FSG mode?

    Tree
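To illustrate what a tree lexicon buys over a flat one, here is a minimal sketch of prefix sharing over phone sequences. It is not pocketsphinx code: the three-word dictionary, the node layout and the function names are all hypothetical, and the decoder's real structure is more involved.

```python
def build_lextree(lexicon):
    """Build a prefix tree over phone sequences.

    lexicon maps word -> list of phones (hypothetical example data).
    Each node is a dict with child nodes keyed by phone and the list of
    words whose pronunciation ends at that node.
    """
    root = {"children": {}, "words": []}
    for word, phones in lexicon.items():
        node = root
        for phone in phones:
            node = node["children"].setdefault(
                phone, {"children": {}, "words": []}
            )
        node["words"].append(word)
    return root


def count_nodes(node):
    """Count phone nodes below the given node (the root itself is not a phone)."""
    return sum(1 + count_nodes(child) for child in node["children"].values())


# Hypothetical pronunciations, for illustration only.
lexicon = {
    "start": ["S", "T", "AA", "R", "T"],
    "stop": ["S", "T", "AA", "P"],
    "star": ["S", "T", "AA", "R"],
}

tree = build_lextree(lexicon)
flat_size = sum(len(phones) for phones in lexicon.values())
tree_size = count_nodes(tree)
print(flat_size, tree_size)
```

In a flat lexicon every word carries its own chain of phone models, so the shared prefix "S T AA" is evaluated once per word; in the tree it is evaluated once.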

    3. Are there any accuracy benchmarks available for Pocketsphinx in FSG-based
      mode? If not, is there an available database or set of guidelines that
      could be used to create one?

    There are no real-life datasets freely available, so there are no benchmarks
    either. You could benchmark on TIDIGITS, but I wouldn't recommend it: it
    doesn't reflect real-life conditions. The Let's Go data may come closer, but
    I'm not sure about its availability.
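Whichever dataset is used, an accuracy benchmark usually comes down to word error rate (WER) over reference/hypothesis transcript pairs. Below is a minimal sketch using plain word-level edit distance. The function names and sample transcripts are made up for illustration, and it assumes whitespace-tokenised transcripts with at least one reference word in total.

```python
def word_errors(ref, hyp):
    """Return (edit_distance, reference_length) for two transcripts.

    Word-level Levenshtein distance: substitutions, insertions and
    deletions all cost 1.
    """
    r, h = ref.split(), hyp.split()
    prev = list(range(len(h) + 1))
    for i, ref_word in enumerate(r, 1):
        cur = [i]
        for j, hyp_word in enumerate(h, 1):
            cur.append(min(
                prev[j] + 1,                          # deletion
                cur[j - 1] + 1,                       # insertion
                prev[j - 1] + (ref_word != hyp_word)  # match or substitution
            ))
        prev = cur
    return prev[-1], len(r)


def wer(pairs):
    """Corpus-level WER: total errors divided by total reference words."""
    errors = total = 0
    for ref, hyp in pairs:
        e, n = word_errors(ref, hyp)
        errors += e
        total += n
    return errors / total


# Made-up transcripts, for illustration only.
pairs = [
    ("one two three", "one three"),
    ("four five", "four nine"),
]
print(f"WER: {wer(pairs):.2f}")
```

Summing errors over the whole corpus before dividing keeps long utterances from being under-weighted compared with averaging per-utterance rates.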

     
  • creative64

    creative64 - 2012-03-03

    It's a little bit simplified. Fan-out is used for multiphone words.

    • So fan-out is used for both left and right contexts? (No dynamic triphone mapping for the left phone, as in Sphinx II.)

    • The Let's Go database seems to be a database of non-native English speakers.

    • Where can the original TIDIGITS database be downloaded from?

    Thanks and regards,

     
  • Nickolay V. Shmyrev

    • So fan-out is used for both left and right contexts? (No dynamic triphone
      mapping for the left phone, as in Sphinx II.)

    Yes, both left and right contexts are accounted for.
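A rough sketch of the simplified scheme described in this thread: single-phone words use a CI phone, while multiphone words fan out over every possible left context at the first phone and every possible right context at the last phone. This only illustrates the idea; it is not the pocketsphinx implementation, and the `(base, left, right)` tuple notation, the function name and the phone set are assumptions.

```python
def expand_word(phones, contexts):
    """Return, for each phone position, the list of candidate units.

    A unit is a (base, left, right) tuple; a CI phone is (base, None, None).
    contexts is the set of phones that may border the word (assumed known).
    """
    if len(phones) == 1:
        # Single-phone word: no fan-out, just the context-independent phone.
        return [[(phones[0], None, None)]]

    last = len(phones) - 1
    slots = []
    for i, base in enumerate(phones):
        if i == 0:
            # Word-initial phone: one parallel unit per possible left context.
            slots.append([(base, lc, phones[1]) for lc in contexts])
        elif i == last:
            # Word-final phone: one parallel unit per possible right context.
            slots.append([(base, phones[i - 1], rc) for rc in contexts])
        else:
            # Word-internal phone: both contexts are fixed by the word itself.
            slots.append([(base, phones[i - 1], phones[i + 1])])
    return slots


# Hypothetical phone set and pronunciations, for illustration only.
contexts = ["S", "SIL"]
print(expand_word(["AH"], contexts))
print(expand_word(["K", "AE", "T"], contexts))
```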

    • The Let's Go database seems to be a database of non-native English speakers.

    No, it's not. Make sure we are talking about the same database:

    http://www.speech.cs.cmu.edu/letsgo

    • Where can the original TIDIGITS database be downloaded from?

    http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC93S10

     
