Low performance with the official German model and a flac/wav audio sample.

Help
inktrap
2017-01-30
  • inktrap

    inktrap - 2017-01-30

    I am trying to use pocketsphinx_continuous with the official German model, but I get bad results. I created a GitHub repository with the exact models and command-line arguments I used, along with the results I got. See: https://github.com/inktrap/cmusphinx-de

     
    • Nickolay V. Shmyrev

      Update pocketsphinx and sphinxbase from GitHub; the results should then be:

      INFO: cmn_live.c(120): Update from < 40.00  3.00 -1.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00 >
      INFO: cmn_live.c(138): Update to   < 45.11  8.09 -10.98 13.57 -0.91 -7.97  3.55 -7.29  0.96  1.32 -2.45 -4.99 -2.84 >
      INFO: ngram_search_fwdtree.c(1550):     1102 words recognized (3/fr)
      INFO: ngram_search_fwdtree.c(1552):   514495 senones evaluated (1234/fr)
      INFO: ngram_search_fwdtree.c(1556):   632761 channels searched (1517/fr), 123146 1st, 30223 last
      INFO: ngram_search_fwdtree.c(1559):     2287 words for which last channels evaluated (5/fr)
      INFO: ngram_search_fwdtree.c(1561):    10760 candidate words for entering last phone (25/fr)
      INFO: ngram_search_fwdtree.c(1564): fwdtree 0.63 CPU 0.152 xRT
      INFO: ngram_search_fwdtree.c(1567): fwdtree 0.63 wall 0.152 xRT
      INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 47 words
      INFO: ngram_search_fwdflat.c(948):      700 words recognized (2/fr)
      INFO: ngram_search_fwdflat.c(950):    42290 senones evaluated (101/fr)
      INFO: ngram_search_fwdflat.c(952):    26430 channels searched (63/fr)
      INFO: ngram_search_fwdflat.c(954):     3013 words searched (7/fr)
      INFO: ngram_search_fwdflat.c(957):     2376 word transitions (5/fr)
      INFO: ngram_search_fwdflat.c(960): fwdflat 0.04 CPU 0.010 xRT
      INFO: ngram_search_fwdflat.c(963): fwdflat 0.04 wall 0.010 xRT
      INFO: ngram_search.c(1250): lattice start node <s>.0 end node </s>.412
      INFO: ngram_search.c(1276): Eliminated 0 nodes before end node
      INFO: ngram_search.c(1381): Lattice has 104 nodes, 97 links
      INFO: ps_lattice.c(1380): Bestpath score: -8458
      INFO: ps_lattice.c(1384): Normalizer P(O) = alpha(</s>:412:415) = -633741
      INFO: ps_lattice.c(1441): Joint P(O,S) = -646082 P(S|O) = -12341
      INFO: ngram_search.c(872): bestpath 0.00 CPU 0.000 xRT
      INFO: ngram_search.c(875): bestpath 0.00 wall 0.000 xRT
      eine gewisse temperatur ist notwendig
      

      Make sure you get the same cmn_live.c line.
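
One way to compare that line between two runs is to pull the initial CMN values out of a saved log. The following is only a sketch: it assumes the decoder's INFO output has been captured to a file (the name `run.log` in the usage note is hypothetical) and that the log lines look like the ones quoted in this thread.

```shell
#!/bin/sh
# Print the first three values of the first "cmn_live.c(...): Update from"
# line read from standard input, joined by commas.
first_cmn_init() {
    awk '/cmn_live\.c\([0-9]+\): Update from/ {
        sub(/.*< */, "")      # drop everything up to the opening "<"
        sub(/ *>.*/, "")      # drop the closing ">" and anything after it
        print $1 "," $2 "," $3
        exit                  # only the first matching line matters
    }'
}
```

Usage: `first_cmn_init < run.log`. For the reference log above this prints `40.00,3.00,-1.00`; a different result means the decoder started from a different initial CMN vector.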

       
  • inktrap

    inktrap - 2017-01-30

    Wow! That was fast! Thanks! (I used the Debian version because the documentation recommends stable, and they are the same.)

     
  • inktrap

    inktrap - 2017-01-30
    INFO: cmn_live.c(120): Update from < 22.11 -11.62 -4.30 -2.35 -4.95 -0.05 -2.76  1.20 -2.92  2.77 -2.43  0.89 -1.09 >
    INFO: cmn_live.c(138): Update to   < 46.72  8.88 -11.39 14.13 -0.57 -8.38  3.76 -7.83  0.98  1.18 -2.50 -5.38 -2.70 >
    INFO: ngram_search_fwdtree.c(1550):     4209 words recognized (13/fr)
    INFO: ngram_search_fwdtree.c(1552):   666189 senones evaluated (2044/fr)
    INFO: ngram_search_fwdtree.c(1556):  1174628 channels searched (3603/fr), 111830 1st, 106531 last
    INFO: ngram_search_fwdtree.c(1559):     6909 words for which last channels evaluated (21/fr)
    INFO: ngram_search_fwdtree.c(1561):    47796 candidate words for entering last phone (146/fr)
    INFO: ngram_search_fwdtree.c(1564): fwdtree 0.98 CPU 0.302 xRT
    INFO: ngram_search_fwdtree.c(1567): fwdtree 0.99 wall 0.303 xRT
    INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 204 words
    INFO: ngram_search_fwdflat.c(948):     3024 words recognized (9/fr)
    INFO: ngram_search_fwdflat.c(950):   232229 senones evaluated (712/fr)
    INFO: ngram_search_fwdflat.c(952):   236943 channels searched (726/fr)
    INFO: ngram_search_fwdflat.c(954):    17093 words searched (52/fr)
    INFO: ngram_search_fwdflat.c(957):    12251 word transitions (37/fr)
    INFO: ngram_search_fwdflat.c(960): fwdflat 0.30 CPU 0.091 xRT
    INFO: ngram_search_fwdflat.c(963): fwdflat 0.29 wall 0.090 xRT
    INFO: ngram_search.c(1197): </s> not found in last frame, using eine.324 instead
    INFO: ngram_search.c(1250): lattice start node <s>.0 end node eine.284
    INFO: ngram_search.c(1276): Eliminated 41 nodes before end node
    INFO: ngram_search.c(1381): Lattice has 270 nodes, 342 links
    INFO: ps_lattice.c(1380): Bestpath score: -17548
    INFO: ps_lattice.c(1384): Normalizer P(O) = alpha(eine:284:324) = -1197107
    INFO: ps_lattice.c(1441): Joint P(O,S) = -1212093 P(S|O) = -14986
    INFO: ngram_search.c(872): bestpath 0.00 CPU 0.000 xRT
    INFO: ngram_search.c(875): bestpath 0.00 wall 0.000 xRT
    nein wir sehr dankbar tue es notwendig eine
    INFO: cmn_live.c(120): Update from < 46.72  8.88 -11.39 14.13 -0.57 -8.38  3.76 -7.83  0.98  1.18 -2.50 -5.38 -2.70 >
    INFO: cmn_live.c(138): Update to   < 46.72  8.88 -11.39 14.13 -0.57 -8.38  3.76 -7.83  0.98  1.18 -2.50 -5.38 -2.70 >
    INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 0 words
    INFO: ngram_search_fwdtree.c(429): TOTAL fwdtree 1.18 CPU 0.305 xRT
    INFO: ngram_search_fwdtree.c(432): TOTAL fwdtree 1.18 wall 0.306 xRT
    INFO: ngram_search_fwdflat.c(176): TOTAL fwdflat 0.30 CPU 0.076 xRT
    INFO: ngram_search_fwdflat.c(179): TOTAL fwdflat 0.30 wall 0.076 xRT
    INFO: ngram_search.c(303): TOTAL bestpath 0.00 CPU 0.001 xRT
    INFO: ngram_search.c(306): TOTAL bestpath 0.00 wall 0.000 xRT
    

    I installed the latest versions (GitHub user: cmusphinx).

     
  • inktrap

    inktrap - 2017-01-30

    The cmn_live.c lines don't match.

    By the way, here is the script:

    #!/bin/bash
    
    # Live microphone input (disabled):
    #pocketsphinx_continuous -inmic yes -lm cmusphinx-voxforge-de.lm -dict cmusphinx-voxforge-de.dic -hmm cmusphinx-de-voxforge-5.2/
    # Decode the test recording with the official German model:
    pocketsphinx_continuous -infile ../test.wav -lm cmusphinx-voxforge-de.lm.bin -dict cmusphinx-voxforge-de.dic -hmm cmusphinx-de-voxforge-5.2/
    
     

    Last edit: inktrap 2017-01-30
  • inktrap

    inktrap - 2017-01-30

    If I use cmusphinx-de-ptm-voxforge-5.2, I get:

    INFO: cmn_live.c(120): Update from < 22.11 -11.62 -4.30 -2.35 -4.95 -0.05 -2.76  1.20 -2.92  2.77 -2.43  0.89 -1.09 >
    INFO: cmn_live.c(138): Update to   < 46.72  8.88 -11.39 14.13 -0.57 -8.38  3.76 -7.83  0.98  1.18 -2.50 -5.38 -2.70 >
    INFO: ngram_search.c(467): Resized score stack to 200000 entries
    INFO: ngram_search_fwdtree.c(1550):     4414 words recognized (14/fr)
    INFO: ngram_search_fwdtree.c(1552):   478065 senones evaluated (1466/fr)
    INFO: ngram_search_fwdtree.c(1556):   961092 channels searched (2948/fr), 88996 1st, 145539 last
    INFO: ngram_search_fwdtree.c(1559):     7619 words for which last channels evaluated (23/fr)
    INFO: ngram_search_fwdtree.c(1561):    37318 candidate words for entering last phone (114/fr)
    INFO: ngram_search_fwdtree.c(1564): fwdtree 0.50 CPU 0.152 xRT
    INFO: ngram_search_fwdtree.c(1567): fwdtree 0.50 wall 0.154 xRT
    INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 282 words
    INFO: ngram_search_fwdflat.c(948):     2464 words recognized (8/fr)
    INFO: ngram_search_fwdflat.c(950):   197682 senones evaluated (606/fr)
    INFO: ngram_search_fwdflat.c(952):   287464 channels searched (881/fr)
    INFO: ngram_search_fwdflat.c(954):    19121 words searched (58/fr)
    INFO: ngram_search_fwdflat.c(957):    16076 word transitions (49/fr)
    INFO: ngram_search_fwdflat.c(960): fwdflat 0.15 CPU 0.047 xRT
    INFO: ngram_search_fwdflat.c(963): fwdflat 0.16 wall 0.048 xRT
    INFO: ngram_search.c(1197): </s> not found in last frame, using notwendigen.324 instead
    INFO: ngram_search.c(1250): lattice start node <s>.0 end node notwendigen.212
    INFO: ngram_search.c(1276): Eliminated 114 nodes before end node
    INFO: ngram_search.c(1381): Lattice has 350 nodes, 67 links
    INFO: ps_lattice.c(1380): Bestpath score: -10204
    INFO: ps_lattice.c(1384): Normalizer P(O) = alpha(notwendigen:212:324) = -935518
    INFO: ps_lattice.c(1441): Joint P(O,S) = -943222 P(S|O) = -7704
    INFO: ngram_search.c(872): bestpath 0.00 CPU 0.000 xRT
    INFO: ngram_search.c(875): bestpath 0.00 wall 0.000 xRT
    eine gewisse temperatur des notwendigen
    INFO: cmn_live.c(120): Update from < 46.72  8.88 -11.39 14.13 -0.57 -8.38  3.76 -7.83  0.98  1.18 -2.50 -5.38 -2.70 >
    INFO: cmn_live.c(138): Update to   < 46.72  8.88 -11.39 14.13 -0.57 -8.38  3.76 -7.83  0.98  1.18 -2.50 -5.38 -2.70 >
    INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 0 words
    INFO: ngram_search_fwdtree.c(429): TOTAL fwdtree 0.60 CPU 0.155 xRT
    INFO: ngram_search_fwdtree.c(432): TOTAL fwdtree 0.62 wall 0.160 xRT
    INFO: ngram_search_fwdflat.c(176): TOTAL fwdflat 0.16 CPU 0.040 xRT
    INFO: ngram_search_fwdflat.c(179): TOTAL fwdflat 0.16 wall 0.041 xRT
    INFO: ngram_search.c(303): TOTAL bestpath 0.00 CPU 0.000 xRT
    INFO: ngram_search.c(306): TOTAL bestpath 0.00 wall 0.000 xRT
    
     
    • Nickolay V. Shmyrev

      You probably edited the model's feat.params file somehow. The first cmn_live.c line should start with 40,3,-1.
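
A quick check of the model file itself could look like the sketch below. It rests on an assumption not stated in this thread: that feat.params holds one whitespace-separated `-flag value` pair per line and that the initial CMN vector is set with `-cmninit`. The function names are illustrative only.

```shell
#!/bin/sh
# Print the value given to -cmninit in a feat.params file, if any.
# Assumes one "-flag value" pair per line, separated by whitespace.
cmninit_of() {
    awk '$1 == "-cmninit" { print $2; exit }' "$1"
}

# Report whether a model directory's feat.params starts CMN at 40,3,-1.
check_cmninit() {
    value=$(cmninit_of "$1/feat.params")
    case "$value" in
        40,3,-1*) echo "ok: -cmninit $value" ;;
        "")       echo "no -cmninit line found in $1/feat.params" ;;
        *)        echo "unexpected: -cmninit $value" ;;
    esac
}
```

Usage: `check_cmninit cmusphinx-de-ptm-voxforge-5.2`. A missing line is reported separately from a wrong value, since in that case the decoder falls back to whatever default it was built with.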

       
  • inktrap

    inktrap - 2017-02-09

    I tried it again with the current models from https://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/German/ and get the same output:

    eine gewisse temperatur des notwendigen
    

    running:

    #!/bin/bash
    
    pocketsphinx_continuous -infile ../test.wav -lm cmusphinx-voxforge-de.lm.bin -dict cmusphinx-voxforge-de.dic -hmm cmusphinx-de-ptm-voxforge-5.2
    

    And I did not change feat.params:

    vh@box ~/src/sphinx/cmusphinx-de/German [master●●][i] % md5sum cmusphinx-de-ptm-voxforge-5.2/feat.params cmusphinx-de-ptm-voxforge-5.2.bak/feat.params 
    275be8550f70128299c0320cba9db178  cmusphinx-de-ptm-voxforge-5.2/feat.params
    275be8550f70128299c0320cba9db178  cmusphinx-de-ptm-voxforge-5.2.bak/feat.params
    
     

    Last edit: inktrap 2017-02-09
  • inktrap

    inktrap - 2017-02-09

    Could you share your working configuration?
    Please also have a look at my repository here:
    https://github.com/inktrap/cmusphinx-de

     
    • Nickolay V. Shmyrev

      Add -vad_threshold 3.5 and it should work as expected.
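
Applied to the script posted earlier in this thread, the suggestion might look like the following sketch. The model, dictionary, and language-model paths are the ones from that script; the `DECODER` variable is a hypothetical convenience for pointing at a different binary and is not part of pocketsphinx.

```shell
#!/bin/sh
# Run the decoder on one WAV file with an explicit VAD threshold.
# DECODER is a hypothetical override for the binary name or path.
run_decoder() {
    wav=$1
    threshold=${2:-3.5}   # 3.5 is the value suggested in this thread
    "${DECODER:-pocketsphinx_continuous}" \
        -infile "$wav" \
        -lm cmusphinx-voxforge-de.lm.bin \
        -dict cmusphinx-voxforge-de.dic \
        -hmm cmusphinx-de-ptm-voxforge-5.2 \
        -vad_threshold "$threshold"
}
```

Usage: `run_decoder ../test.wav`, or `run_decoder ../test.wav 3.0` to try another threshold. Keeping the threshold as a parameter makes it easy to compare results across a few values on the same recording.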

       
