
Problem with tutorial in Sphinxtrain (newb)

Help
m_ahlenius
2008-03-30
2012-09-22
  • m_ahlenius

    m_ahlenius - 2008-03-30

    Hi,
    Per the instructions for "Setting up the trainer" (http://www.speech.cs.cmu.edu/sphinx/tutorial.html#traincomponent)

    I have downloaded AN4 (big-endian version) and the Sphinxtrain db. Configure and make ran with no problems. (BTW, I am running on a CentOS 5 Linux box with an Intel i686 chipset, if it matters.)
    I ran the first perl command:

        perl scripts_pl/setup_tutorial.pl an4
    

    again with no errors. At the end, the script displays the next set of commands I should run:

    cd ../an4
    And then, in Unix/Linux:
    perl scripts_pl/make_feats.pl -ctl etc/an4_train.fileids (if needed)
    perl scripts_pl/RunAll.pl

    When I run the first perl command (make_feats), it runs and ends with an error:

    perl scripts_pl/make_feats.pl -ctl etc/an4_train.fileids

    "...
    INFO: fe_sigproc.c(771): Will not use double bandwidth in mel filter
    INFO: wave2feat.c(139): wav/an4_clstk/fash/an251-fash-b.sph
    ERROR: "wave2feat.c", line 655: Cannot read wav/an4_clstk/fash/an251-fash-b.sph
    FATAL_ERROR: "wave2feat.c", line 90: error converting files...exiting"

    Any idea what the problem is? The file it cannot read does not exist, at least not with that suffix (.sph).

    All I have is:

    > ls -la wav/an4_clstk/fash/*
    -rw-r--r-- 1 ahlenius ahlenius 32000 Feb 20 2003 wav/an4_clstk/fash/an251-fash-b.raw
    -rw-r--r-- 1 ahlenius ahlenius 22400 Feb 20 2003 wav/an4_clstk/fash/an253-fash-b.raw
    -rw-r--r-- 1 ahlenius ahlenius 28800 Feb 20 2003 wav/an4_clstk/fash/an254-fash-b.raw
    -rw-r--r-- 1 ahlenius ahlenius 83200 Feb 20 2003 wav/an4_clstk/fash/an255-fash-b.raw
    -rw-r--r-- 1 ahlenius ahlenius 112000 Feb 20 2003 wav/an4_clstk/fash/cen1-fash-b.raw
    -rw-r--r-- 1 ahlenius ahlenius 41600 Feb 20 2003 wav/an4_clstk/fash/cen2-fash-b.raw
    -rw-r--r-- 1 ahlenius ahlenius 115200 Feb 20 2003 wav/an4_clstk/fash/cen4-fash-b.raw
    -rw-r--r-- 1 ahlenius ahlenius 144000 Feb 20 2003 wav/an4_clstk/fash/cen5-fash-b.raw
    -rw-r--r-- 1 ahlenius ahlenius 80000 Feb 20 2003 wav/an4_clstk/fash/cen7-fash-b.raw

    Any pointers would be appreciated.

    Thanks

    m

     
    • m_ahlenius

      m_ahlenius - 2008-03-31

      Hi,

      ok, yes I am a newb on this. Going from the tutorial, I was following the instructions (from: http://www.speech.cs.cmu.edu/sphinx/tutorial.html):

      "You will be given instructions on how to download, compile, and run the components needed to build a complete speech recognition system. Namely, you will be given instructions on how to use SphinxTrain and you will have to choose one of PocketSphinx, SPHINX-2, SPHINX-3, SPHINX-3 Flat, or SPHINX-4. Please check a short description for capabilities of each of these, or the CMUSphinx project page for more details. This tutorial does not instruct you on how to build a language model, but you can check the CMU SLM Toolkit page for an excellent manual."

      So that's why I am doing this.

      What I am trying to do is build a version of pocketsphinx for the Gumstix and see how well it runs there. What would you advise: should I keep going down the path I am on, or do something else? I don't have an app built yet for this platform.

      Also - since I am using a desktop linux box as my OpenEmbedded build platform for the Gumstix (embedded linux board), can the training be done on the desktop machine, and then just send the created models to the Gumstix board? Or must they be built elsewhere.

      thanks

      m

       
      • Nickolay V. Shmyrev

        > Also - since I am using a desktop linux box as my OpenEmbedded build platform for the Gumstix (embedded linux board), can the training be done on the desktop machine, and then just send the created models to the Gumstix board? Or must they be built elsewhere.

        Models are already trained and available for download. You can just download and use them. See

        http://www.speech.cs.cmu.edu/sphinx/models/

         
    • Nickolay V. Shmyrev

      Open your sphinx_train.cfg and change the following:

      # Audio waveform and feature file information
      $CFG_WAVFILES_DIR = "$CFG_BASE_DIR/wav";
      $CFG_WAVFILE_EXTENSION = 'raw';
      $CFG_WAVFILE_TYPE = 'raw'; # one of nist, mswav, raw
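
Before editing the config, it can help to confirm which extension the audio files on disk actually have. The following is only a sketch: the function name `count_extensions` is made up for this example, and it assumes each line of the control file is a path relative to the wav directory with no extension, which is what the error message earlier in the thread suggests.

```python
import glob
from collections import Counter
from pathlib import Path


def count_extensions(wav_dir, fileids_path):
    """Count the extensions found on disk for each entry in a control file.

    Assumption: every non-empty line of the control file is a path relative
    to wav_dir, without an extension (e.g. an4_clstk/fash/an251-fash-b).
    """
    wav_dir = Path(wav_dir)
    counts = Counter()
    for line in Path(fileids_path).read_text().splitlines():
        fileid = line.strip()
        if not fileid:
            continue  # skip blank lines
        base = wav_dir / fileid
        matches = []
        if base.parent.is_dir():
            # Find files named <basename>.<anything> in the same directory.
            pattern = glob.escape(base.name) + ".*"
            matches = sorted(base.parent.glob(pattern))
        if not matches:
            counts["(missing)"] += 1
        for match in matches:
            counts[match.suffix.lstrip(".")] += 1
    return counts
```

Calling `count_extensions("wav", "etc/an4_train.fileids")` from the an4 directory returns a counter keyed by extension. If nearly every entry falls under a single extension such as `raw`, that is probably the value to use for `$CFG_WAVFILE_EXTENSION`.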

       
    • m_ahlenius

      m_ahlenius - 2008-03-31

      Hi,

      Thanks, that got me past that issue. I didn't know where to look for that config.

      During training I am getting a number of errors, and I am not sure whether they are "normal" and acceptable or not. (I am planning on using this for pocketsphinx.)

      When I ran the command: perl scripts_pl/RunAll.pl

      this is the tail of the standard output:

      This step had 14 ERROR messages and 0 WARNING messages. Please check the log file for details.
      Normalization for iteration: 1
      This step had 68982 ERROR messages and 0 WARNING messages. Please check the log file for details.
      Current Overall Likelihood Per Frame = 17.777007443318
      Baum welch starting for 8 Gaussian(s), iteration: 2 (1 of 1)
      0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
      This step had 14 ERROR messages and 0 WARNING messages. Please check the log file for details.
      Normalization for iteration: 2
      Current Overall Likelihood Per Frame = 18.4619800740744
      Split Gaussians, increase by 0
      Training for 8 Gaussian(s) completed after 2 iterations
      MODULE: 90 deleted interpolation
      Skipped for continuous models
      MODULE: 99 Convert to Sphinx2 format models
      Can not create models used by Sphinx-II.


      Are these real concerns, or can I just ignore them?

      thank you!

      m

       
      • Nickolay V. Shmyrev

        It's fine. The number of senones is too large for such a small database, which is why the errors appear. As for using it in pocketsphinx, I don't quite understand what you mean. an4 is just an example database, mainly for the tutorial. It makes no sense to use it in a real application.
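
When a step reports tens of thousands of ERROR messages, it can be useful to check whether they are many distinct problems or one message repeated. Below is a rough sketch, assuming log lines follow the `ERROR: "wave2feat.c", line 655: ...` shape quoted earlier in this thread. The function name `tally_errors` is hypothetical, and the actual training logs may be formatted differently.

```python
import re
from collections import Counter

# Matches lines such as: ERROR: "wave2feat.c", line 655: Cannot read ...
# Anchored at the start so that FATAL_ERROR lines are not counted.
ERROR_RE = re.compile(r'^\s*ERROR:\s*"([^"]+)",\s*line\s+(\d+):')


def tally_errors(lines):
    """Count ERROR lines grouped by (source file, line number)."""
    counts = Counter()
    for line in lines:
        m = ERROR_RE.match(line)
        if m:
            counts[(m.group(1), int(m.group(2)))] += 1
    return counts
```

Passing an open log file to `tally_errors` and looking at `.most_common(5)` on the result shows which source locations dominate. A single location accounting for almost all of the errors would be consistent with one recurring condition rather than many separate failures.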

         
