Segmentation fault in sphinx2

2002-07-01
2012-09-22
  • Jessica P. Hekman

    I have been trying to get sphinx2 to accept general dictation (not a limited vocabulary, but anything in English). I downloaded the LM and vocabulary at <http://www.speech.cs.cmu.edu/sphinx/models/hub4opensrc_jan2002/>. I am using language_model.arpaformat.gz, unzipped, as my LM; I wrote a script to edit language_model.vocabulary so that it is in the same format as cmudict.06d and am using that as my .dic file (see the result at <http://www.arborius.net/~jphekman/sphinx/full.dic>).

    I edited sphinx2-simple like so:

    #!/bin/sh
    S2CONTINUOUS=/usr/local/bin/sphinx2-continuous
    HMM=/usr/local/share/sphinx2/model/hmm/6k
    #TASK=/usr/local/share/sphinx2/model/lm/turtle
    #DICT=/usr/local/share/sphinx2/model/lm/turtle/turtle.dic

    TASK=/home/jphekman/src/sphinx/jph/model/lm/full
    DICT=/home/jphekman/src/sphinx/jph/model/lm/full/full.dic

    echo " "
    echo "sphinx2-simple:"
    echo "  Demo CMU Sphinx2 decoder called with command line arguments."
    echo " "

    echo "<executing $S2CONTINUOUS, please wait>"
    $S2CONTINUOUS -live TRUE -ctloffset 0 -ctlcount 100000000 -cepdir ${TASK}/ctl -datadir ${TASK}/ctl -agcemax TRUE -langwt 6.5 -fwdflatlw 8.5 -rescorelw 9.5 -ugwt 0.5 -fillpen 1e-10 -silpen 0.005 -inspen 0.65 -top 1 -topsenfrm 3 -topsenthresh -70000 -beam 2e-06 -npbeam 2e-06 -lpbeam 2e-05 -lponlybeam 0.0005 -nwbeam 0.0005 -fwdflat FALSE -fwdflatbeam 1e-08 -fwdflatnwbeam 0.0003 -bestpath TRUE -kbdumpdir ${TASK} -lmfn ${TASK}/full.lm -dictfn ${DICT} -noisedict ${HMM}/noisedict -phnfn ${HMM}/phone -mapfn ${HMM}/map -hmmdir ${HMM} -hmmdirlist ${HMM} -8bsen TRUE -sendumpfn ${HMM}/sendump -cbdir ${HMM} -verbose 9

    When I run it, I get a "ready" prompt and say a sentence, at which point it immediately segfaults. I have made the error output available at <http://www.arborius.net/~jphekman/sphinx/error.log>, but the final message is "line 16: 20933 Segmentation fault".

    The machine runs Red Hat Linux 7.2 with gcc version 2.96 20000731 (Red Hat Linux 7.1 2.96-98). Sphinx2 is the 0.4 release (12/13/01) from SourceForge.

    What am I doing wrong? (Note that my vocabulary *is* less than 65,000 words.) Is there a better way to get sphinx2 to recognize general speech?

    Thanks,
    Jessica
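
A quick way to double-check the vocabulary-size note in the post above is to count the distinct head words in the .dic file. The following is only a sketch: the function name count_dict_words is hypothetical, and it assumes the cmudict-style layout the post describes (one entry per line, word first, then the phones), with alternate pronunciations possibly marked as WORD(2), WORD(3), and so on. The 65,000 figure is taken from the post, not from the Sphinx2 sources.

```shell
#!/bin/sh
# Hypothetical helper: count distinct head words in a cmudict-style dictionary.
# Assumption: each line is "WORD PHONE PHONE ...", and alternate
# pronunciations carry a "(N)" suffix on the word, e.g. "READ(2)".
count_dict_words() {
    # $1 = path to the dictionary file
    awk 'NF > 0 { sub(/\([0-9]+\)$/, "", $1); print $1 }' "$1" \
        | sort -u \
        | wc -l \
        | tr -d ' '
}

# Example usage (path taken from the post):
#   count_dict_words /home/jphekman/src/sphinx/jph/model/lm/full/full.dic
```

If the printed number is below the limit, dictionary size is probably not the cause of the crash.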

     
      Ken Olum - 2002-07-05

      The problem is caused by having the word "SIL" in the dictionary. This confuses the system, which treats SIL as a meta-word for some kind of silence. It's a bug, but it is easy to work around: just remove this word from full.dic.

      After that, it worked for me, but I only got about 50% accuracy.
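
The workaround described above (dropping the SIL entry from full.dic) could be scripted roughly as follows. This is a minimal sketch, not a tested fix for Sphinx2 itself: the function name strip_sil and the output file name are hypothetical, and it assumes the dictionary has one entry per line with the word as the first whitespace-separated field.

```shell
#!/bin/sh
# Hypothetical helper: copy a dictionary, leaving out the entry whose
# head word is exactly SIL. Words that merely start with SIL (SILK,
# SILENT, ...) are kept, because the whole first field is compared.
strip_sil() {
    # $1 = input dictionary, $2 = output dictionary
    awk '$1 != "SIL"' "$1" > "$2"
}

# Example usage (file names are illustrative):
#   strip_sil full.dic full.nosil.dic
# then point DICT in sphinx2-simple at full.nosil.dic
```

Writing to a new file rather than editing in place keeps the original full.dic around for comparison.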

       
