
Training questions (need help getting started)

Help
SimonHGR
2014-10-27
2014-10-30
  • SimonHGR

    SimonHGR - 2014-10-27

    I managed to build sphinxbase and pocketsphinx, and pocketsphinx recognizes my voice with perhaps 30-50% accuracy. I need to improve this, so I started to look at sphinxtrain. I seem to have built it successfully, but I'm not able to run it.

    I should note that when I run pocketsphinx, I manually set LD_LIBRARY_PATH so that it finds the shared libraries it depends on.
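
    For reference, a minimal sketch of that kind of setup. It assumes the libraries went into the default autotools prefix, /usr/local (an assumption, since no --prefix is mentioned here); PREFIX is just an illustrative variable name.

```shell
# Assumption: "sudo make install" used the default autotools prefix
# /usr/local. Override PREFIX if ./configure was given a different --prefix.
PREFIX="${PREFIX:-/usr/local}"

# Prepend $PREFIX/lib to the search path. The ${VAR:+...} form adds the
# colon and the old value only when LD_LIBRARY_PATH was already non-empty.
LD_LIBRARY_PATH="$PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
export LD_LIBRARY_PATH

# Show the resulting value for a quick sanity check.
echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
```

    Anything started from the same shell afterwards inherits the exported value.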

    Also, I'm making guesses on how to run sphinxtrain, as I have not found any documentation. (If you can point me at that, that would be a good thing, I'm sure!)

    So far, I have attempted to run <build>/scripts/sphinxtrain, both on its own and with the argument "run".

    When I run it solo, it prints out a usage message (which is how I came to the argument "run"). When I run it with "run" it complains:

    Failed to find sphinxtrain binaries. Check your installation

    But I built it with:

    ./autogen.sh && make && sudo make install

    (which is also how I built all the sphinx-related binaries).

    If I run it under strace, I see this error:

    Can't open perl script "SPHINXTRAINDIR/scripts/000.comp_feat/make_feats.pl": No such file or directory
    Can't open perl script "SPHINXTRAINDIR/scripts/000.comp_feat/make_feats.pl": No such file or directory

    This suggests there is some variable (SPHINXTRAINDIR, perhaps?) that I need to set somehow. The scripts are present in the same subdirectory as the sphinxtrain script.
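
    One way to check this guess is to look for the literal text SPHINXTRAINDIR in the script being run; if it is there, it may be a placeholder that was never replaced with a real path. A minimal sketch: find_placeholder is a hypothetical helper, and the demo file below is made up for illustration, not taken from sphinxtrain.

```shell
# Hypothetical helper: print every line of a file that still contains the
# literal text SPHINXTRAINDIR (the string seen in the error message).
find_placeholder() {
    # -n prefixes each match with its line number;
    # -F treats the pattern as a fixed string, not a regular expression.
    grep -nF 'SPHINXTRAINDIR' "$1"
}

# Self-contained demo on a throwaway file (invented content).
demo=$(mktemp)
printf '%s\n' 'my $dir = "SPHINXTRAINDIR/scripts";' 'print "ok\n";' > "$demo"
find_placeholder "$demo"
rm -f "$demo"
```

    Against the real script this would be called as find_placeholder <build>/scripts/sphinxtrain. No output (and a non-zero exit status) means the text does not appear in that file.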

    Anyway, assuming I get past this point, I would also like to understand:

    1) How much training am I likely to have to give to get this up to 80-plus-percent accuracy? Is that likely to be possible by this route? If it's not going to get there, then I should probably not invest the time trying.

    2) How do I train it? Will it be obvious when I get this command running? Since I don't see documentation, I'm a bit concerned that there might be some complex training process that I won't know about if it's not GUI led.

    3) Will I be able to do the training based on recorded files and text templates? I'd prefer that if possible, since I have a lot of recorded material already, and some of it has already been manually transcribed (which is what I want this to do for me in the long run)...

    Thanks for any assistance!
    Cheers,
    Simon

     
    • Nickolay V. Shmyrev

      Hello Simon

      First of all, you need to provide the audio you are trying to recognize. Further steps toward improving accuracy depend on that.

      I don't think you need to proceed with training before you get some good accuracy with your recordings.

       
