PocketSphinx on YellowDog Linux

2011-03-01
2012-09-22
  • Dheryta Jaisinghani

    I have installed sphinxbase and pocketsphinx on Yellow Dog Linux.

    But when I run $ pocketsphinx_continuous on Yellow Dog Linux, the
    following error is displayed:

    INFO: acmod.c(238): Parsed model-specific feature parameters from
    /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k//feat.params
    INFO: feat.c(848): Initializing feature stream to type: '1s_c_d_dd',
    ceplen=13, CMN='current', VARNORM='no', AGC='none'
    INFO: cmn.c(142): mean= 12.00, mean= 0.0
    INFO: acmod.c(163): Using subvector specification 0-12/13-25/26-38
    INFO: mdef.c(520): Reading model definition:
    /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k//mdef
    INFO: mdef.c(531): Found byte-order mark BMDF, assuming this is a binary mdef
    file
    INFO: bin_mdef.c(330): Reading binary model definition:
    /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k//mdef
    INFO: bin_mdef.c(343): Must byte-swap
    /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k//mdef
    WARNING: "bin_mdef.c", line 399: -mmap specified, but mdef is other-endian.
    Will not memory-map.
    INFO: bin_mdef.c(508): 50 CI-phone, 143047 CD-phone, 3 emitstate/phone, 150
    CI-sen, 5150 Sen, 27135 Sen-Seq
    INFO: tmat.c(205): Reading HMM transition probability matrices:
    /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k//transition_matrices
    FATAL_ERROR: "tmat.c", line 254:
    /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k//transition_matrices:
    #float32s(600) doesn't match dimensions: 0 x 3 x 4

    Please guide what shall be done.

     
  • Nickolay V. Shmyrev

    Please guide what shall be done.

    Please provide more information about this issue. What exact steps did you
    follow during the installation? What version are you trying to run? What CPU
    are you trying to run on? By Yellow Dog, people usually mean systems running
    on the Cell processor. Is that the case? Which compiler did you use?
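
A short script run on the affected machine is one way to gather the CPU and byte-order details requested here. This is only a sketch: `describe_host` is a name made up for this illustration and is not part of PocketSphinx, and the example values in the comments are assumptions about what a Cell or x86 system might report.

```python
import platform
import struct
import sys


def describe_host():
    """Collect basic platform facts that are useful in a bug report."""
    return {
        "machine": platform.machine(),        # architecture string, e.g. "ppc64" or "x86_64"
        "system": platform.system(),          # operating system name, e.g. "Linux"
        "byte_order": sys.byteorder,          # "big" or "little"
        "pointer_bits": struct.calcsize("P") * 8,  # width of a native pointer
    }


if __name__ == "__main__":
    for key, value in describe_host().items():
        print("%s: %s" % (key, value))
```

The byte order is the most relevant line for this thread, since the model files are read differently on big-endian and little-endian hosts.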

     
  • Dheryta Jaisinghani

    hi

    Yes, you are right, Yellow Dog runs on CBE only.
    I followed all the steps exactly as described at:
    http://cmusphinx.sourceforge.net/wiki/tuturialpocketsphinx

    I have tried

    sphinxbase-0.6.1

    pocketsphinx-0.6.1

    GCC version : gcc (GCC) 4.1.2 20071124 (Red Hat 4.1.2-42)

     
  • Dheryta Jaisinghani

    When I try to execute on Ubuntu, it works fine.

     
  • Nickolay V. Shmyrev

    This is a bug which was fixed just recently:

    r10743 | nshmyrev | 2010-12-14 11:09:37 -0500 (Tue, 14 Dec 2010) | 3 lines
    Fixes the bug with reading transition matrix on big-endian platforms. thanks to Chen Tao.
    

    You can download a pocketsphinx snapshot or check out pocketsphinx from
    Subversion to get it working.
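
For readers unfamiliar with this class of bug, the sketch below shows in general terms why byte order matters when reading binary model files. It does not reproduce the actual defect fixed in r10743, whose details are not given in this thread. The numbers 50, 3 and 600 come from the log above (50 CI-phones, 3 emitting states per phone, 600 floats expected); treating the matrix count as equal to the CI-phone count is an assumption made for the illustration.

```python
import struct

# Figures from the log: 50 CI-phones, 3 emitting states, and a "0 x 3 x 4" shape.
# Assumption for this sketch: one transition matrix per CI-phone.
n_tmat, n_src, n_dst = 50, 3, 4
expected_floats = n_tmat * n_src * n_dst  # the "#float32s(600)" in the error

# The same four bytes decode to different integers depending on byte order.
raw = struct.pack("<i", n_tmat)            # 50 stored little-endian
as_little = struct.unpack("<i", raw)[0]    # decoded with the matching order
as_big = struct.unpack(">i", raw)[0]       # decoded with the opposite order

if __name__ == "__main__":
    print("expected floats:", expected_floats)
    print("little-endian read:", as_little)
    print("big-endian read:", as_big)
```

A reader that mishandles byte order for even one header field ends up with dimensions that no longer multiply out to the number of values in the file, which is the kind of mismatch the FATAL_ERROR reports.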

     
  • Dheryta Jaisinghani

    I downloaded the snapshot,
    but now, while installing pocketsphinx, it gives me the following error:

    ../../libtool: line 480: CDPATH: command not found
    libtool: Version mismatch error. This is libtool 2.4, but the
    libtool: definition of this LT_INIT comes from an older release.
    libtool: You should recreate aclocal.m4 with macros from libtool 2.4
    libtool: and run autoconf again.
    make: *** Error 1
    make: Leaving directory `/home/osproj2011/Desktop/9b/9bosproj2011/sphinx/pocketsphinx-0.6.1/src/libpocketsphinx'
    make: *** Error 1
    make: Leaving directory `/home/osproj2011/Desktop/9b/9bosproj2011/sphinx/pocketsphinx-0.6.1/src'
    make: *** Error 1

     
  • Dheryta Jaisinghani

    Ok.

    This was an error caused by libtoolize.

    It's working now :)

    Thanks for the support.
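
The thread does not record the exact commands used to resolve the libtool mismatch. A common remedy for this message, and the one the error text itself points to, is to regenerate the autotools files with the installed libtool and then rebuild. The sketch below lists those steps. `regenerate_build_files` and its `source_dir` argument are hypothetical names, and the function only runs anything when `dry_run` is switched off.

```python
import subprocess


def regenerate_build_files(source_dir, dry_run=True):
    """Return, and optionally run, the commands that refresh the libtool macros."""
    commands = [
        # Re-runs libtoolize, aclocal, autoconf and automake with the installed versions.
        ["autoreconf", "--force", "--install"],
        ["./configure"],
        ["make"],
    ]
    if not dry_run:
        for command in commands:
            # check=True stops at the first failing step.
            subprocess.run(command, cwd=source_dir, check=True)
    return commands


if __name__ == "__main__":
    for command in regenerate_build_files("."):
        print(" ".join(command))
```

Running it with the default `dry_run=True` only prints the steps, which makes it safe to try before touching a source tree.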

     
  • Dheryta Jaisinghani

    Hi

    It got installed, but it is not converting audio to text properly. I don't
    know what to do now. Please help.

     
  • Dheryta Jaisinghani

    I am trying to run pocketsphinx_batch on WAV files.

     
  • Nickolay V. Shmyrev

    Maybe your input audio has the wrong sampling rate, or maybe you are using
    some unusual options. Please describe your problem in more detail.
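
The sampling rate of a WAV file can be checked with Python's standard `wave` module. The sketch below is not a PocketSphinx tool: `wav_format` and `write_silence` are names invented for this example, and the silent demo file only stands in for a real recording.

```python
import os
import tempfile
import wave


def wav_format(path):
    """Return (sample_rate_hz, channels, bits_per_sample) for a RIFF/WAV file."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate(), wav.getnchannels(), wav.getsampwidth() * 8


def write_silence(path, rate_hz=16000, seconds=1):
    """Write a mono 16-bit WAV of silence so the check can be tried without real audio."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)          # 2 bytes per sample = 16 bits
        wav.setframerate(rate_hz)
        wav.writeframes(b"\x00\x00" * rate_hz * seconds)


if __name__ == "__main__":
    demo = os.path.join(tempfile.mkdtemp(), "demo.wav")
    write_silence(demo)
    print(wav_format(demo))
```

The reported rate can then be compared with the `-samprate` value shown in the decoder's configuration dump.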

     
  • Dheryta Jaisinghani

    I have installed it on Ubuntu 10.04 as well, and it works fine there for the
    same audio files at 16000 Hz. But on the Ubuntu machine the pocketsphinx
    snapshot version is not installed; rather, it's 0.6.1. I am not sure if that
    makes a difference.

    Also, I am trying to use the same language model on both machines, and it
    does not work on Yellow Dog.

    If there are any special steps to be followed when installing the snapshot
    version, kindly mention them.

     
  • Nickolay V. Shmyrev

    Pocketsphinx snapshot version is not installed, rather its 0.6.1, I am not
    sure if that makes a difference.

    It makes a difference. You need to install the snapshot, or wait a week
    until pocketsphinx 0.7 is released.

     
  • Dheryta Jaisinghani

    Will pocketsphinx 0.7 be released next week for sure? Thanks.

     
  • Nickolay V. Shmyrev

    Unfortunately, nothing is for sure in this world. There are two bugs which
    need to be fixed. One is building SphinxTrain with Visual Studio 2010, the
    other is building cmuclmtk. The release will be made as soon as they are
    fixed.

    I suggest you download the snapshot; there is no difference between it and
    the release.

     
  • Dheryta Jaisinghani

    Actually, we have to submit our college project next week, and we are using
    pocketsphinx in one of the modules to convert audio to text. It is working
    perfectly fine on the Ubuntu 10.04 machine, but we need it to work on Yellow
    Dog Linux (CBE processor). There, pocketsphinx 0.6.1 could not get
    installed. As suggested above, we tried the snapshot version, which did get
    installed on CBE with Yellow Dog, but the accuracy of conversion is nil.
    Kindly suggest what we should do.

     
  • Nickolay V. Shmyrev

    Ok, the pocketsphinx snapshot is installed. Good to know that.

    As for accuracy, maybe you are running it incorrectly. Please provide the
    details I asked for before: the audio you are trying to recognize, the log
    of the decoder, and any other relevant information.

     
  • Dheryta Jaisinghani

    Hi

    I cannot attach a WAV file here. Please find the pocketsphinx log below:

    Log Starts

    $ pocketsphinx_batch -hmm
    ../sphinxSnapshot/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/ -adcin yes
    -cepdir . -cepext .wav -ctl record.ctl -hyp exp.hyp
    INFO: cmd_ln.c(512): Parsing command line:
    pocketsphinx_batch \
    -hmm ../sphinxSnapshot/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/ \
    -adcin yes \
    -cepdir . \
    -cepext .wav \
    -ctl record.ctl \
    -hyp exp.hyp

    Current configuration:

    -adchdr 0 0
    -adcin no yes
    -agc none none
    -agcthresh 2.0 2.000000e+00
    -alpha 0.97 9.700000e-01
    -argfile
    -ascale 20.0 2.000000e+01
    -aw 1 1
    -backtrace no no
    -beam 1e-48 1.000000e-48
    -bestpath yes yes
    -bestpathlw 9.5 9.500000e+00
    -bghist no no
    -build_outdirs yes yes
    -cepdir .
    -cepext .mfc .wav
    -ceplen 13 13
    -cmn current current
    -cmninit 8.0 8.0
    -compallsen no no
    -ctl record.ctl
    -ctlcount -1 -1
    -ctlincr 1 1
    -ctloffset 0 0
    -ctm
    -debug 0
    -dict
    -dictcase no no
    -dither no no
    -doublebw no no
    -ds 1 1
    -fdict
    -feat 1s_c_d_dd 1s_c_d_dd
    -featparams
    -fillprob 1e-8 1.000000e-08
    -frate 100 100
    -fsg
    -fsgctl
    -fsgdir
    -fsgext
    -fsgusealtpron yes yes
    -fsgusefiller yes yes
    -fwdflat yes yes
    -fwdflatbeam 1e-64 1.000000e-64
    -fwdflatefwid 4 4
    -fwdflatlw 8.5 8.500000e+00
    -fwdflatsfwin 25 25
    -fwdflatwbeam 7e-29 7.000000e-29
    -fwdtree yes yes
    -hmm ../sphinxSnapshot/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/
    -hyp exp.hyp
    -hypseg
    -input_endian big big
    -jsgf
    -kdmaxbbi -1 -1
    -kdmaxdepth 0 0
    -kdtree
    -latsize 5000 5000
    -lda
    -ldadim 0 0
    -lextreedump 0 0
    -lifter 0 0
    -lm
    -lmctl
    -lmname default default
    -lmnamectl
    -logbase 1.0001 1.000100e+00
    -logfn
    -logspec no no
    -lowerf 133.33334 1.333333e+02
    -lpbeam 1e-40 1.000000e-40
    -lponlybeam 7e-29 7.000000e-29
    -lw 6.5 6.500000e+00
    -maxhmmpf -1 -1
    -maxnewoov 20 20
    -maxwpf -1 -1
    -mdef
    -mean
    -mfclogdir
    -min_endfr 0 0
    -mixw
    -mixwfloor 0.0000001 1.000000e-07
    -mllr
    -mllrctl
    -mllrdir
    -mllrext
    -mmap yes yes
    -nbest 0 0
    -nbestdir
    -nbestext .hyp .hyp
    -ncep 13 13
    -nfft 512 512
    -nfilt 40 40
    -nwpen 1.0 1.000000e+00
    -outlatbeam 1e-5 1.000000e-05
    -outlatdir
    -outlatext .lat .lat
    -outlatfmt s3 s3
    -pbeam 1e-48 1.000000e-48
    -pip 1.0 1.000000e+00
    -pl_beam 1e-10 1.000000e-10
    -pl_pbeam 1e-5 1.000000e-05
    -pl_window 0 0
    -rawlogdir
    -remove_dc no no
    -round_filters yes yes
    -samprate 16000 1.600000e+04
    -seed -1 -1
    -sendump
    -senin no no
    -senlogdir
    -senmgau
    -silprob 0.005 5.000000e-03
    -smoothspec no no
    -svspec
    -tmat
    -tmatfloor 0.0001 1.000000e-04
    -topn 4 4
    -topn_beam 0 0
    -toprule
    -transform legacy legacy
    -unit_area yes yes
    -upperf 6855.4976 6.855498e+03
    -usewdphones no no
    -uw 1.0 1.000000e+00
    -var
    -varfloor 0.0001 1.000000e-04
    -varnorm no no
    -verbose no no
    -warp_params
    -warp_type inverse_linear inverse_linear
    -wbeam 7e-29 7.000000e-29
    -wip 0.65 6.500000e-01
    -wlen 0.025625 2.562500e-02

    INFO: cmd_ln.c(512): Parsing command line:
    \
    -nfilt 20 \
    -lowerf 1 \
    -upperf 4000 \
    -wlen 0.025 \
    -transform dct \
    -round_filters no \
    -remove_dc yes \
    -svspec 0-12/13-25/26-38 \
    -feat 1s_c_d_dd \
    -agc none \
    -cmn current \
    -cmninit 56,-3,1 \
    -varnorm no

    Current configuration:

    -agc none none
    -agcthresh 2.0 2.000000e+00
    -alpha 0.97 9.700000e-01
    -ceplen 13 13
    -cmn current current
    -cmninit 8.0 56,-3,1
    -dither no no
    -doublebw no no
    -feat 1s_c_d_dd 1s_c_d_dd
    -frate 100 100
    -input_endian big big
    -lda
    -ldadim 0 0
    -lifter 0 0
    -logspec no no
    -lowerf 133.33334 1.000000e+00
    -ncep 13 13
    -nfft 512 512
    -nfilt 40 20
    -remove_dc no yes
    -round_filters yes no
    -samprate 16000 1.600000e+04
    -seed -1 -1
    -smoothspec no no
    -svspec 0-12/13-25/26-38
    -transform legacy dct
    -unit_area yes yes
    -upperf 6855.4976 4.000000e+03
    -varnorm no no
    -verbose no no
    -warp_params
    -warp_type inverse_linear inverse_linear
    -wlen 0.025625 2.500000e-02

    INFO: acmod.c(238): Parsed model-specific feature parameters from
    ../sphinxSnapshot/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k//feat.params
    INFO: feat.c(860): Initializing feature stream to type: '1s_c_d_dd',
    ceplen=13, CMN='current', VARNORM='no', AGC='none'
    INFO: cmn.c(142): mean= 12.00, mean= 0.0
    INFO: acmod.c(163): Using subvector specification 0-12/13-25/26-38
    INFO: mdef.c(520): Reading model definition:
    ../sphinxSnapshot/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k//mdef
    INFO: mdef.c(531): Found byte-order mark BMDF, assuming this is a binary mdef
    file
    INFO: bin_mdef.c(330): Reading binary model definition:
    ../sphinxSnapshot/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k//mdef
    INFO: bin_mdef.c(343): Must byte-swap
    ../sphinxSnapshot/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k//mdef
    WARNING: "bin_mdef.c", line 399: -mmap specified, but mdef is other-endian.
    Will not memory-map.
    INFO: bin_mdef.c(507): 50 CI-phone, 143047 CD-phone, 3 emitstate/phone, 150
    CI-sen, 5150 Sen, 27135 Sen-Seq
    INFO: tmat.c(205): Reading HMM transition probability matrices:
    ../sphinxSnapshot/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k//transition_matrices
    INFO: acmod.c(117): Attempting to use SCHMM computation module
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter:
    ../sphinxSnapshot/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k//means
    INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
    INFO: ms_gauden.c(294): 256x13
    INFO: ms_gauden.c(294): 256x13
    INFO: ms_gauden.c(294): 256x13
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter:
    ../sphinxSnapshot/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k//variances
    INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
    INFO: ms_gauden.c(294): 256x13
    INFO: ms_gauden.c(294): 256x13
    INFO: ms_gauden.c(294): 256x13
    INFO: ms_gauden.c(354): 0 variance values floored
    INFO: s2_semi_mgau.c(908): Loading senones from dump file
    ../sphinxSnapshot/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k//sendump
    INFO: s2_semi_mgau.c(932): BEGIN FILE FORMAT DESCRIPTION
    INFO: s2_semi_mgau.c(1027): Using memory-mapped I/O for senones
    INFO: s2_semi_mgau.c(1304): Maximum top-N: 4 Top-N beams: 0 0 0
    INFO: dict.c(306): Allocating 137542 * 20 bytes (2686 KiB) for word entries
    INFO: dict.c(321): Reading main dictionary:
    /usr/local/share/pocketsphinx/model/lm/en_US/cmu07a.dic
    INFO: dict.c(212): Allocated 1010 KiB for strings, 1664 KiB for phones
    INFO: dict.c(324): 133436 words read
    INFO: dict.c(330): Reading filler dictionary:
    ../sphinxSnapshot/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k//noisedict
    INFO: dict.c(212): Allocated 0 KiB for strings, 0 KiB for phones
    INFO: dict.c(333): 11 words read
    INFO: dict2pid.c(396): Building PID tables for dictionary
    INFO: dict2pid.c(404): Allocating 50^3 * 2 bytes (244 KiB) for word-initial
    triphones
    INFO: dict2pid.c(131): Allocated 30200 bytes (29 KiB) for word-final triphones
    INFO: dict2pid.c(195): Allocated 30200 bytes (29 KiB) for single-phone word
    triphones
    INFO: ngram_model_arpa.c(77): No \data\ mark in LM file
    INFO: ngram_model_dmp.c(137): Byteswapping required, will not use memory-
    mapped I/O for LM file
    INFO: ngram_model_dmp.c(196): ngrams 1=5001, 2=436879, 3=418286
    INFO: ngram_model_dmp.c(242): 5001 = LM.unigrams(+trailer) read
    INFO: ngram_model_dmp.c(291): 436879 = LM.bigrams(+trailer) read
    INFO: ngram_model_dmp.c(317): 418286 = LM.trigrams read
    INFO: ngram_model_dmp.c(342): 37293 = LM.prob2 entries read
    INFO: ngram_model_dmp.c(362): 14370 = LM.bo_wt2 entries read
    INFO: ngram_model_dmp.c(382): 36094 = LM.prob3 entries read
    INFO: ngram_model_dmp.c(410): 854 = LM.tseg_base entries read
    INFO: ngram_model_dmp.c(466): 5001 = ascii word strings read
    INFO: ngram_search_fwdtree.c(99): 788 unique initial diphones
    INFO: ngram_search_fwdtree.c(147): 0 root, 0 non-root channels, 60 single-
    phone words
    INFO: ngram_search_fwdtree.c(186): Creating search tree
    INFO: ngram_search_fwdtree.c(191): before: 0 root, 0 non-root channels, 60
    single-phone words
    INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 13428
    INFO: ngram_search_fwdtree.c(338): after: 457 root, 13300 non-root channels,
    26 single-phone words
    INFO: ngram_search_fwdflat.c(156): fwdflat: min_ef_width = 4, max_sf_win = 25
    INFO: cmn.c(175): CMN: 78.48 -7.24 -1.06 -0.98 -0.33 -0.39 -0.15 -0.21 -0.07
    -0.10 -0.09 -0.11 -0.05
    INFO: ngram_search.c(463): Resized backpointer table to 10000 entries
    INFO: ngram_search.c(471): Resized score stack to 200000 entries
    INFO: ngram_search.c(463): Resized backpointer table to 20000 entries
    INFO: ngram_search_fwdtree.c(1549): 17062 words recognized (21/fr)
    INFO: ngram_search_fwdtree.c(1551): 3224016 senones evaluated (3995/fr)
    INFO: ngram_search_fwdtree.c(1553): 5857427 channels searched (7258/fr),
    366971 1st, 558180 last
    INFO: ngram_search_fwdtree.c(1557): 40943 words for which last channels
    evaluated (50/fr)
    INFO: ngram_search_fwdtree.c(1560): 787112 candidate words for entering last
    phone (975/fr)
    INFO: ngram_search_fwdtree.c(1562): fwdtree 13.72 CPU 1.701 xRT
    INFO: ngram_search_fwdtree.c(1565): fwdtree 13.87 wall 1.718 xRT
    INFO: ngram_search_fwdflat.c(305): Utterance vocabulary contains 99 words
    INFO: ngram_search_fwdflat.c(940): 2516 words recognized (3/fr)
    INFO: ngram_search_fwdflat.c(942): 487829 senones evaluated (604/fr)
    INFO: ngram_search_fwdflat.c(944): 482071 channels searched (597/fr)
    INFO: ngram_search_fwdflat.c(946): 24624 words searched (30/fr)
    INFO: ngram_search_fwdflat.c(948): 20572 word transitions (25/fr)
    INFO: ngram_search_fwdflat.c(951): fwdflat 0.43 CPU 0.053 xRT
    INFO: ngram_search_fwdflat.c(954): fwdflat 0.43 wall 0.053 xRT
    INFO: ngram_search.c(1198): </s> not found in last frame, using look.805
    instead
    INFO: ngram_search.c(1250): lattice start node <s>.0 end node look.788
    INFO: ngram_search.c(1278): Eliminated 14 nodes before end node
    INFO: ngram_search.c(1383): Lattice has 259 nodes, 165 links
    INFO: ps_lattice.c(1352): Normalizer P(O) = alpha(look:788:805) = -4695955
    INFO: ps_lattice.c(1390): Joint P(O,S) = -4838308 P(S|O) = -142353
    INFO: ngram_search.c(872): bestpath 0.03 CPU 0.004 xRT
    INFO: ngram_search.c(875): bestpath 0.03 wall 0.004 xRT
    INFO: batch.c(760): 1: 8.06 seconds speech, 14.19 seconds CPU, 14.33 seconds
    wall
    INFO: batch.c(762): 1: 1.76 xRT (CPU), 1.78 xRT (elapsed)
    INFO: batch.c(774): TOTAL 8.06 seconds speech, 14.19 seconds CPU, 14.33
    seconds wall
    INFO: batch.c(776): AVERAGE 1.76 xRT (CPU), 1.78 xRT (elapsed)
    INFO: ngram_search_fwdtree.c(430): TOTAL fwdtree 13.72 CPU 1.703 xRT
    INFO: ngram_search_fwdtree.c(433): TOTAL fwdtree 13.87 wall 1.720 xRT
    INFO: ngram_search_fwdflat.c(174): TOTAL fwdflat 0.43 CPU 0.053 xRT
    INFO: ngram_search_fwdflat.c(177): TOTAL fwdflat 0.43 wall 0.053 xRT
    INFO: ngram_search.c(314): TOTAL bestpath 0.03 CPU 0.004 xRT
    INFO: ngram_search.c(317): TOTAL bestpath 0.03 wall 0.004 xRT

    Log Ends

    $ cat exp.hyp
    and and and and look (1 -91146)

    However, when we execute the same command on the same input in Ubuntu, we
    get the following output in exp.hyp:

    this is an example that the p. and t. natural place to speak and in good come
    out here in sending checks to speak and you know world (1 -104118691)
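
To compare results between the two machines more systematically, the hypothesis lines can be split into text, utterance ID and score. The sketch below assumes every line has the `text (id score)` shape seen in the two outputs above. `parse_hyp_line` is a made-up helper for this illustration, not part of PocketSphinx.

```python
import re

# Assumed layout, taken from the lines quoted above: "<words> (<utterance id> <score>)"
HYP_LINE = re.compile(r"^(?P<text>.*?)\s*\((?P<utt>\S+)\s+(?P<score>-?\d+)\)\s*$")


def parse_hyp_line(line):
    """Split one hypothesis line into (text, utterance_id, score)."""
    match = HYP_LINE.match(line)
    if match is None:
        raise ValueError("unrecognized hypothesis line: %r" % line)
    return match.group("text"), match.group("utt"), int(match.group("score"))


if __name__ == "__main__":
    text, utt, score = parse_hyp_line("and and and and look (1 -91146)")
    print("utterance %s: %d words, score %d" % (utt, len(text.split()), score))
```

Comparing the word counts and texts per utterance ID makes it easy to see how far the Yellow Dog output diverges from the Ubuntu one.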

     
  • Nickolay V. Shmyrev

    Hello

    You can upload the file to a public file sharing service and give a link
    here. Without the file it's hard to say anything, sorry.

    Or you can try to run some tests first. Run 'make check' in sphinxbase and in
    pocketsphinx to check that everything works. Provide the test log if
    something goes wrong.
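
One way to run the tests in both source trees and keep the output for posting is sketched below. The directories are whatever paths the sources were unpacked to, passed on the command line. `run_in_each` is a hypothetical helper name, and `make check` is the command suggested in the post above.

```python
import subprocess
import sys


def run_in_each(directories, command=("make", "check")):
    """Run `command` in every directory and return {directory: (exit_code, output)}."""
    results = {}
    for directory in directories:
        completed = subprocess.run(
            list(command),
            cwd=directory,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,   # keep errors in the same log as normal output
            universal_newlines=True,    # decode output to text
        )
        results[directory] = (completed.returncode, completed.stdout)
    return results


if __name__ == "__main__":
    # Usage: python run_checks.py /path/to/sphinxbase /path/to/pocketsphinx
    for directory, (code, output) in run_in_each(sys.argv[1:]).items():
        print("%s: exit code %d" % (directory, code))
        if code != 0:
            print(output)
```

A non-zero exit code marks the tree whose log is worth attaching to the thread.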

     
  • Anonymous

    Anonymous - 2011-04-06

    @dheryta: Rewinding back to this error:

    ../../libtool: line 480: CDPATH: command not found
    libtool: Version mismatch error. This is libtool 2.4, but the
    libtool: definition of this LT_INIT comes from an older release.
    libtool: You should recreate aclocal.m4 with macros from libtool 2.4
    libtool: and run autoconf again.
    make: *** Error 1
    make: Leaving directory `/home/osproj2011/Desktop/9b/9bosproj2011/sphinx/pocketsphinx-0.6.1/src/libpocketsphinx'
    make: *** Error 1
    make: Leaving directory `/home/osproj2011/Desktop/9b/9bosproj2011/sphinx/pocketsphinx-0.6.1/src'
    make: *** Error 1

    You mentioned that you had a problem with libtoolize that was causing the
    error. What was your problem?

     
  • Dheryta Jaisinghani

    Yes, the libtool error came up during installation. We resolved that; it was
    caused by a version mismatch. But still there is no conversion accuracy on
    YDL; however, it's 95% on the Ubuntu machine.

     
  • Anonymous

    Anonymous - 2011-04-06

    Okay. I got the same error on Red Hat 2.16.0 when installing pocketsphinx,
    but had no problems with sphinxbase. I'm using libtool 2.4 (the latest) and
    was trying to rebuild from source.

     
  • Nickolay V. Shmyrev

    But still there is no conversion accuracy on YDL

    Again, I suggest you run the pocketsphinx tests.

     
