
Problem performing preliminary decode

Help
Jia Shen
2010-07-01
2012-09-22
  • Jia Shen

    Jia Shen - 2010-07-01

    Hello,

    I'm following the Sphinx tutorial and got stuck at the preliminary decode
    step. I'm running Windows XP and I compiled the nightly builds of sphinxtrain,
    sphinxbase and sphinx3 with Visual C++ 2008 Express. I did the setup_tutorial
    for the rm1 database with cygwin and ran the preliminary training. Then when I
    ran 'perl scripts_pl/decode/slave.pl', this error came out:

    Could not find executable for C:\cygwin\home\BJiaShen\rm1\bin\sphinx3_decode
    at C:.cygwin/home/BJiaShen\rm1\scripts_pl\lib/SphinxTrain/Util.pm line 299.
    Aligning results to find error rate
    Can't open C:/cygwin/home/BJiaShen/rm1/results/rm1-1-1.match
    word_align.pl failed with error code 65280 at scripts_pl/decode/slave.pl line
    173

    Could anyone help me with this problem? At what step of the tutorial is that
    sphinx3 binary supposed to get there? I also didn't get a word alignment
    program, but I don't suppose that's causing the problem. Sorry, I'm quite a
    newbie, so please bear with me.

     
  • Nickolay V. Shmyrev

    You probably need to manually copy sphinx3_decode.exe and sphinxbase.dll
    from the sphinx3 folder to the bin folder of your training setup.
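    For example, from a Cygwin shell the copy might look like this (the paths
    are just a sketch based on the layout in this thread; adjust them to your
    own build and task directories):

```shell
# Hypothetical paths: point SPHINX3_BUILD at wherever Visual Studio put the
# Release binaries, and TASK_DIR at your rm1 training directory.
SPHINX3_BUILD=~/sphinx3/bin/Release
TASK_DIR=~/rm1

# Copy the decoder and the sphinxbase DLL into the task's bin folder:
cp "$SPHINX3_BUILD/sphinx3_decode.exe" "$TASK_DIR/bin/"
cp "$SPHINX3_BUILD/sphinxbase.dll"     "$TASK_DIR/bin/"
```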

     
  • Anonymous

    Anonymous - 2010-07-02

    Hey, it works! Thank you so much! Actually I tried something similar
    yesterday: I copied all the bin/Release files from sphinx3, plus
    sphinxbase.dll, to the bin folder for rm1 (with just sphinx3_decode.exe and
    sphinxbase.dll, the second word-align error still comes out), but the script
    was stuck at 0% for such a long time that I thought it had hung or something.
    Turns out that it does take a long time, about 20 min in my case. Again,
    thanks!

     
  • Anonymous

    Anonymous - 2010-07-02

    Hey, now that I got the tutorial going I thought I'd try to train with my own
    training data. The problem is that when I run setup_SphinxTrain.pl, there's an
    error at this portion:

    Generating SphinxTrain configuration file in etc/sphinx_train.cfg
    Backing up existing configuration file to etc/sphinx_train.cfg.orig
    Can't open etc/sphinx_train.template or ./etc/sphinx_train.cfg

    My command was "perl scripts_pl/setup_SphinxTrain.pl -force -task limited
    -sphinxtraindir ."

    When I checked out the SphinxTrain/etc folder, the sphinx_train.cfg had been
    renamed to sphinx_train.cfg.orig, so it seems that the script renames, then
    tries to open the file it had just renamed (sphinx_train.cfg), which is rather
    strange. Could anyone help me with this? If I rename the .orig back to .cfg
    and run the command, it just renames it back again and the same error comes
    out.

     
  • Nickolay V. Shmyrev

    Hello

    The setup script creates the task layout in a folder. You need to run it
    from the folder where you will train, not in the SphinxTrain directory.

    Try running this script with the -help option to get an outline:

        To setup a new SphinxTrain task
            Create the new directory (e.g., mkdir RM1)
    
            Go to the new directory (e.g., cd RM1)
    
            Run this script (e.g., perl
            $SPHINXTRAIN/scripts_pl/setup_SphinxTrain.pl -task RM1)
    
     
  • Anonymous

    Anonymous - 2010-07-04

    Hello, sorry for all the questions, I've got some problems running RunAll.pl.

    Right now I'm using Ubuntu 10.04 and the stable releases of SphinxTrain,
    Sphinx3 and sphinxbase. (The previous setup was in my office, and now I'm
    trying it out at home. The office doesn't allow installation of Ubuntu.)

    When I ran RunAll.pl, I got this:

    Phase 2: Flat initialize
    FATAL_ERROR: "corpus.c", line 262: input string too long. Truncated.

    Then I went through the forums and learnt that you're supposed to have a
    newline at the end of the .fileids and .transcription files. So I went ahead
    and did that, and then I got this:

    Phase 2: Flat initialize
    FATAL_ERROR: "corpus.c", line 1647: Failed to get the files after 100 retries
    of getting MFCC(about 300 seconds)
    This step had 101 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Something failed:
    (/home/u0700322/Downloads/limited/scripts_pl/20.ci_hmm/slave_convg.pl)

    Then when I went back to remove the newlines, I got this:

    Phase 7: TRANSCRIPT - Checking that all the phones in the transcript are in
    the phonelist, and all phones in the phonelist appear at least once
    Something failed:
    (/home/u0700322/Downloads/limited/scripts_pl/00.verify/verify_all.pl)

    Which left me rather confused. Is there any way around this? This is my
    etc folder WITHOUT the newlines at the end of those files:
    http://www.megaupload.com/?d=7UPSZLRD

     
  • Anonymous

    Anonymous - 2010-07-04

    Hey, I'm not sure if this information is important, but I'm using my own
    training data under the task 'limited', and I set up SphinxTrain and sphinx3
    using setup_SphinxTrain.pl and setup_sphinx3.pl.

     
  • Nickolay V. Shmyrev

    You didn't set up the database properly. Your transcription file has an
    empty line, which is not allowed. Your database is also too small. I don't
    see the point in training this model anyway; it's much better to use a stock
    model, which is far more accurate than anything you can train yourself.

    I'm sure you can do it, just try to be more careful.
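    A quick way to check a corpus file for both problems (a blank line in the
    middle and a missing final newline) from the shell; the file names below
    are examples from a task called 'limited', so substitute your own:

```shell
# Report blank lines and a missing final newline in a corpus file.
check_corpus_file() {
    f="$1"
    # Blank (or whitespace-only) lines are not allowed anywhere:
    if grep -n '^[[:space:]]*$' "$f"; then
        echo "$f: the blank line(s) above are not allowed"
    fi
    # The last line must be terminated by a newline character:
    if [ -n "$(tail -c 1 "$f")" ]; then
        echo "$f: missing final newline"
    fi
}

# Example invocations (hypothetical paths for a task called 'limited'):
check_corpus_file etc/limited.transcription
check_corpus_file etc/limited.fileids
```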

     
  • Anonymous

    Anonymous - 2010-07-05

    Hmm, could I ask for your advice? I'm planning an experiment on the
    accuracy of speech recognition under noise, using noise-cancelling Bluetooth
    headsets. The independent variables are noise (40, 65, 90 dBA) and the
    headsets (one non-noise-cancelling headset and two Bluetooth headsets with
    noise cancelling).

    The task is command and control, you can see in the /etc that it's only about
    20 commands with about as many words. I was planning to use a word model
    instead of the phone model, that's why I used my own training data. Another
    reason is that I'm from Singapore, and we have our own accent. Is it still
    better to use an4/rm1?

     
  • Nickolay V. Shmyrev

    The task is command and control, you can see in the /etc that it's only
    about 20 commands with about as many words. I was planning to use a word model
    instead of the phone model, that's why I used my own training data.

    I don't see how your experiment is related to training this model, but just
    so you know: to train a good model with 20 commands you need several hours of
    recordings from several hundred speakers. You can't train anything good with
    just 20 recordings. Word models aren't recommended for CMUSphinx either.

    Another reason is that I'm from Singapore, and we have our own accent

    Again, I'm not sure how accent is related to the testing task. It applies
    equally to each noise condition, doesn't it?

    Is it still better to use an4/rm1?

    I don't see how they fit into your task.

     
  • Anonymous

    Anonymous - 2010-07-08

    The task is command and control, you can see in the /etc that it's only
    about 20 commands with about as many words. I was planning to use a word model
    instead of the phone model, that's why I used my own training data.

    I don't see how your experiment is related to training this model, but just
    so you know: to train a good model with 20 commands you need several hours of
    recordings from several hundred speakers. You can't train anything good with
    just 20 recordings. Word models aren't recommended for CMUSphinx either.

    With regards to the word model thing, I was following this project here:
    http://hk.myblog.yahoo.com/jw!afd6dGGRHBRkp2laqwk198fg/article?mid=629

    It suggested using a word model, and I'm not sure if I understood it
    correctly, but it seems to suggest recording your own training data as well.
    Oh, I used my own recordings to get a word model because we need to provide
    the transcriptions for each recording, right? So I was thinking, since the
    vocabulary for an4 and rm1 is quite limited (they don't have 'engage' and
    'release', for instance, which are used in my commands), I'll need to provide
    recordings of these words if I want a word model. Correct me if I'm wrong.

    Another reason is that I'm from Singapore, and we have our own accent

    Again, I'm not sure how accent is related to the testing task. It applies
    equally to each noise condition, doesn't it?

    Hmm, if the accent is different, the acoustic models for the phonemes should
    be different too, right? Won't it make the recognizer less accurate if your
    test data has a Singaporean accent while your training data has an American
    accent? I need the accuracy at low noise levels to be high, so that when it
    drops at higher noise levels the drop is still significant, e.g. 90% to 50%
    instead of 40% to 30%.

    Is it still better to use an4/rm1?

    I don't see how they fit into your task.

    Sorry when you said 'use stock model' (post 8), did you mean the an4/rm1
    databases? Because that's what I interpreted it as.

     
  • Nickolay V. Shmyrev

    With regards to the word model thing, I was following this project here:
    http://hk.myblog.yahoo.com/jw!afd6dGGRHBRkp2laqwk198fg/article?mid=629
    It suggested using a word model, and I'm not sure if I understood it
    correctly, but it seems to suggest recording your own training data as
    well.

    This newbie blog is not really a good source of advice. It's nice that he is
    trying new technology, but the way he is doing it is not correct.

    Hmm, if the accent is different, the acoustic models for the phonemes
    should be different too, right? Won't it make the recognizer less accurate
    if your test data has a Singaporean accent while your training data has an
    American accent? I need the accuracy at low noise levels to be high, so that
    when it drops at higher noise levels the drop is still significant, e.g.
    90% to 50% instead of 40% to 30%.

    High accuracy is gained through adaptation of the acoustic model, not by
    training a new one.

    Sorry when you said 'use stock model' (post 8), did you mean the an4/rm1
    databases? Because that's what I interpreted it as.

    When I talk about stock models I mean the model that ships with the
    pocketsphinx distribution (hub4wsj_sc_8k). It is one of the best models
    available to you.

     
  • Anonymous

    Anonymous - 2010-07-13

    Hmm, if I want to recognize speech from a bone-conduction microphone, I need
    to train with recordings made only with that microphone, right? Will it
    still work? When I look at the spectrogram of those recordings there are
    pretty much no high frequencies, which I think is typical of bone-conduction
    recordings.

     
  • Nickolay V. Shmyrev

    Hmm, if I want to recognize speech from a bone-conduction microphone, I
    need to train with recordings made only with that microphone, right?

    Not necessarily

    Will it still work?

    Yes

    When I look at the spectrogram of those recordings there are pretty much no
    high frequencies, which I think is typical of bone-conduction recordings.

    Frequency response of the channel is largely normalized during feature
    extraction with cepstral mean normalization. With proper frequency range
    selected, it shouldn't cause any issues.
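    The effect is easy to see with a toy sketch (just plain awk; the numbers
    stand in for one cepstral coefficient's values over a few frames, not real
    Sphinx feature data):

```shell
# Toy cepstral mean normalization: a constant channel offset shows up as a
# constant added to every frame's coefficient, and subtracting the
# per-track mean removes it entirely.
printf '1\n2\n3\n' |
awk '{ v[NR] = $1; sum += $1 }
     END { mean = sum / NR
           for (i = 1; i <= NR; i++) print v[i] - mean }'
# prints -1, 0, 1 -- exactly the same output as for the input 11, 12, 13,
# because the constant offset (the "channel") is subtracted out.
```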

     
  • Anonymous

    Anonymous - 2010-07-14

    Hello,

    I tried out pocketsphinx instead of sphinx3 as you suggested. I got
    pocketsphinx_continuous to work in Ubuntu 10.04, but I couldn't do the same
    in Windows XP. Can you help me check where I went wrong? These are my steps
    in Ubuntu:

    1) Download, unzip and rename stable releases of pocketsphinx and sphinxbase.
    2) Create language model using lmtools, download and unzip.

    Now the directory looks like:
    /pocketsphinx
    /sphinxbase
    /limited/etc/9345 (.lm and .dic are inside /9345)

    3) commands:
    cd sphinxbase
    ./configure
    make
    cd ../pocketsphinx
    ./configure
    make clean all
    make test
    make install
    cd ../limited
    perl ../pocketsphinx/scripts/setup_sphinx.pl -task limited
    bin/pocketsphinx_continuous -hmm ../pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k
    -lm etc/9345/9345.lm -dict etc/9345/9345.dic

    As you said, this model works quite well for my application, at least with a
    regular air-conduction microphone. Thanks a lot for that suggestion!

    Now the problem comes when I try to do the same thing with cygwin under
    Windows XP. I followed the same steps, but after I run set_sphinx.pl, the
    /bin is empty. Do you know why that's the case?

     
  • Anonymous

    Anonymous - 2010-07-14

    Sorry, typo: last line is "setup_sphinx.pl", not "set_sphinx.pl"

     
  • Nickolay V. Shmyrev

    If you are using perl from cygwin, it might have problems. We usually
    recommend using ActivePerl.

    Anyway, I don't see the point in running this perl script. That
    setup_sphinx.pl is used for training. You can copy the files yourself, can't
    you?

     
  • Anonymous

    Anonymous - 2010-07-15

    Hello, yeah actually I have several questions now.

    1) Yeah, I did install ActivePerl, just like the tutorial
    (http://www.speech.cs.cmu.edu/sphinx/tutorial.html) suggested, but I don't
    know if I did it correctly because I've never worked with Perl before. OK,
    this is not a question, haha.

    2) Anyway, today I built the stable releases of pocketsphinx and sphinxbase
    with Visual C++ 2008 Express, and there wasn't a pocketsphinx_livecontinuous
    in /pocketsphinx/src/programs/ or anywhere else in /pocketsphinx. The same
    happened when I compiled with Cygwin. When I did it in Ubuntu (as I wrote in
    my last post), livecontinuous was in that folder. So did I do something
    wrong somewhere? Or is this expected?

    3) I don't know if this problem is specific to Sphinx, but do I need to do
    something special to get pocketsphinx_livecontinuous to work with my
    Bluetooth headset? This is on Ubuntu. I got my headset paired with the
    computer and I can record audio with the Ubuntu sound recorder using the
    headset microphone, but I think Sphinx does not detect the microphone when
    the script runs, because it doesn't register anything (no hypothesis) when I
    speak after 'READY'. I did a forum search on this but the results were
    related to Sphinx4. Could you help me here?

    Jia Shen

     
  • Nickolay V. Shmyrev

    Or is this expected?

    The stable release has a build bug. You need to try a subversion snapshot
    instead.

    but do I need to do something special to get pocketsphinx_livecontinuous to
    work with my Bluetooth headset?

    If the Bluetooth device is not the default ALSA device, you need to specify
    its name with the -adcdev option. Make sure ALSA can capture from the
    Bluetooth device and that you compiled pocketsphinx with ALSA support. Also
    check the mixer settings.
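    A sketch of how that might look (the device string "plughw:1,0" is only a
    placeholder; list your capture devices first and use the name of your
    headset, and substitute your own model and language-model paths):

```shell
# List the ALSA capture devices to find the Bluetooth headset's name:
arecord -L

# Then point the decoder at that device; "plughw:1,0" and the model paths
# below are placeholders taken from this thread's earlier invocation:
pocketsphinx_continuous -adcdev plughw:1,0 \
    -hmm ../pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k \
    -lm etc/9345/9345.lm -dict etc/9345/9345.dic
```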

     
