CMU Sphinx / Forums / Help: wsj with sphinx3 (newbie)

Hi,

I followed http://sphinx.subwiki.com/sphinx/index.php/Hello_World_Decoder_QuickStart_Guide successfully and am now trying to use the wsj acoustic model rather than hub4. I am running "sphinx3_livedecode _CFG" with different options in _CFG. When _CFG contains:

-samprate 16000
-dict /home/bshanks/prog/speech/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp.sphinx.dic
-fdict /home/bshanks/prog/speech/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp.sphinx.filler
-lm ./dmp/lm_giga_5k_nvp_3gram.arpa.DMP
-hmm /home/bshanks/prog/speech/hub4_cd_continuous_8gau_1s_c_d_dd

then sphinx3_livedecode recognizes each word in "yellow or red" properly (everything works). But when _CFG contains:

-samprate 16000
-dict /home/bshanks/prog/speech/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp.sphinx.dic
-lm ./dmp/lm_giga_5k_nvp_3gram.arpa.DMP
-hmm /home/bshanks/prog/speech/wsj_all_cd30.mllt_cd_cont_4000
-lda /home/bshanks/prog/speech/wsj_all_cd30.mllt_cd_cont_4000/feature_transform
-fdict wsj_all_cd30.mllt_cd_cont_4000/noisedict

then sphinx3_livedecode does not recognize a single word (sometimes it incorrectly reports "the", but usually it reports nothing).
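
To make the comparison explicit, here is a minimal Python sketch that lists which options differ between the two argument files above. The helper names parse_args and diff_args and the constants HUB4_CFG and WSJ_CFG are made up for illustration; they are not part of Sphinx, and the parser simply assumes one "-flag value" pair per line.

```python
# Sketch only: compares two sphinx3_livedecode argument files given as text.
# Assumes one "-flag value" pair per line, as in the configs quoted above.

HUB4_CFG = """\
-samprate 16000
-dict /home/bshanks/prog/speech/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp.sphinx.dic
-fdict /home/bshanks/prog/speech/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp.sphinx.filler
-lm ./dmp/lm_giga_5k_nvp_3gram.arpa.DMP
-hmm /home/bshanks/prog/speech/hub4_cd_continuous_8gau_1s_c_d_dd
"""

WSJ_CFG = """\
-samprate 16000
-dict /home/bshanks/prog/speech/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp.sphinx.dic
-lm ./dmp/lm_giga_5k_nvp_3gram.arpa.DMP
-hmm /home/bshanks/prog/speech/wsj_all_cd30.mllt_cd_cont_4000
-lda /home/bshanks/prog/speech/wsj_all_cd30.mllt_cd_cont_4000/feature_transform
-fdict wsj_all_cd30.mllt_cd_cont_4000/noisedict
"""


def parse_args(text):
    """Parse '-flag value' lines into a dict; other lines are skipped."""
    args = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if not parts or not parts[0].startswith("-"):
            continue
        args[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return args


def diff_args(a, b):
    """Map each differing flag to (value_in_a, value_in_b); None means absent."""
    return {
        flag: (a.get(flag), b.get(flag))
        for flag in sorted(set(a) | set(b))
        if a.get(flag) != b.get(flag)
    }


if __name__ == "__main__":
    changes = diff_args(parse_args(HUB4_CFG), parse_args(WSJ_CFG))
    for flag, (hub4_value, wsj_value) in changes.items():
        print(flag, hub4_value, "->", wsj_value)
```

Only -hmm, -fdict, and -lda differ between the two files; -samprate, -dict, and -lm are identical.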

After reading http://sourceforge.net/forum/forum.php?thread_id=2000591&forum_id=5471 , I tried:

-lw 15
-feat 1s_c_d_dd
-wip 0.2
-beam 1e-120
-pbeam 1e-120
-wbeam 1e-100
-varnorm no
-cmn current

but still no words were recognized. Replacing wsj's mdef with http://www.cs.cmu.edu/~dhuggins/Projects/wsj_all_cd30.mllt.4000.mdef made no difference. Using "-fdict /home/bshanks/prog/speech/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp.sphinx.filler" made no difference. Leaving out "-lda /home/bshanks/prog/speech/wsj_all_cd30.mllt_cd_cont_4000/feature_transform" made no difference.

I would have liked to try an old version of wsj as was done in http://sourceforge.net/forum/forum.php?thread_id=2000591&forum_id=5471 , but I do not know where to get one -- http://www.speech.cs.cmu.edu/sphinx/models only links to the new version.

I am using trunk, with the following acoustic model and language model downloads:
http://www.speech.cs.cmu.edu/sphinx/models/wsj_jan2008/wsj_all_mllt_4000_20080104.tar.gz
http://www.speech.cs.cmu.edu/sphinx/models/hub4opensrc_jan2002/4000senones/1s_c_d_dd/hub4_cd_continuous_8gau_1s_c_d_dd.tar.gz
http://www.inference.phy.cam.ac.uk/kv227/lm_giga/lm_giga_5k_nvp_3gram.zip

Did I do something wrong, or is this a problem with the new wsj? Thanks,
bayle
