Beginner help with pocketsphinx on Windows

  • Harry Garrison

    Harry Garrison - 2016-09-26

    Hello Everyone!

    So I am new to this whole cmusphinx experience and I really want to create a python application using pocketsphinx. More specifically, I would like to train an acoustic model for a new language and then use that in a python application via the speech_recognition module that makes use of pocketsphinx language models.

    The problem is that I am a complete beginner and there aren't many tutorials online. I thoroughly read the tutorial on the CMUSphinx Wiki but I still can't figure out how to train an acoustic model on Windows. I have reached a point where I have downloaded the an4_sphere example along with all the other necessary sphinx modules and now I am trying to train a model based on that. The command I should probably be using at this point is, and I quote from the CMUSphinx Wiki, "python ../sphinxtrain/scripts/sphinxtrain -t an4 setup". From my understanding, this is meant to be typed at a command prompt, but when I run it in Windows cmd it gives me the following message: "Can't open file ../sphinxtrain/scripts/sphinxtrain no such file or directory". Should I add something to the variable path? Should I use cygwin? Or should I abandon windows and move to a Linux environment that will supposedly make things more straight-forward and easier?

    I need to know how to resolve this preliminary issue before proceeding to the creation of the database that I intend to use for the training. Any help please?

     
    • Nickolay V. Shmyrev

      Should I add something to the variable path?

      You need to learn a bit about the command line. When you type a command, any relative path you specify is looked up starting from your current folder. So it matters what your current folder is and what the file layout is. You need to run this command from the acoustic model folder, and the path should then properly reference the file in the filesystem. If the file is missing, it means you are currently in the wrong folder or the file layout is different from the one described in the tutorial. On Windows, the cd command prints the current directory and also lets you change it.

      Should I use cygwin?

      No

      Or should I abandon windows and move to a Linux environment that will supposedly make things more straight-forward and easier?

      Linux is recommended, however you will meet the same problems with the command line there.
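To make the point about relative paths concrete, here is a minimal Python sketch (not part of sphinxtrain) showing where `../sphinxtrain/scripts/sphinxtrain` ends up pointing for two different current folders. It uses the standard `ntpath` module so that Windows path rules apply on any OS. The helper name `resolve_from` is made up for illustration, and the folder names are only examples borrowed from later in this thread.

```python
import ntpath  # Windows flavour of os.path, importable on any platform


def resolve_from(current_dir, relative_path):
    """Return the path a relative path refers to when a command is
    started from current_dir (hypothetical helper, for illustration)."""
    return ntpath.normpath(ntpath.join(current_dir, relative_path))


script = "../sphinxtrain/scripts/sphinxtrain"

# Started from the acoustic model folder: ".." climbs to C:\sphinx,
# so the path lands on the sphinxtrain folder next to it.
print(resolve_from(r"C:\sphinx\an4", script))

# Started from a deeper folder: ".." only climbs one level, so the
# path points at a location where no sphinxtrain folder exists.
print(resolve_from(r"C:\sphinx\an4_sphere\an4\etc", script))
```

The same relative path therefore names two different files depending on where the command is run, which is one plausible reason for a "no such file or directory" message.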

       
  • Harry Garrison

    Harry Garrison - 2016-09-26

    Thank you for your reply, Nickolay V. Shmyrev.
    I created a folder named "sphinx" in the C:\ directory. There I put everything Cmusphinx related.

    Having said that, the an4 example and its text components are located in the following directory:
    C:\sphinx\an4_sphere\an4\etc

    and supposedly I need to utilize a file named "sphinxtrain", which is located in C:\sphinx\sphinxtrain\sphinxtrain\scripts\

    So if I get this correctly:
    1) I open the command line from within the C:\sphinx\an4_sphere\an4\etc directory
    2) on command line I input: python C:/sphinx/sphinxtrain/sphinxtrain/scripts/sphinxtrain -t an4 setup
    3) ???
    4) profit

    Unfortunately, it doesn't work. It gives me: File "C:/sphinx/sphinxtrain/sphinxtrain/scripts/sphinxtrain", line 97 exit<ret>
    TabError inconsistent use of tabs and spaces in indentation

    What am I missing?

     
    • Nickolay V. Shmyrev

      1) I open the command line from within the C:\sphinx\an4_sphere\an4\etc directory

      Acoustic model folder should be C:\sphinx\an4, not C:\sphinx\an4_sphere\an4.

      2) on command line I input: python C:/sphinx/sphinxtrain/sphinxtrain/scripts/sphinxtrain -t an4 setup

      Sphinxtrain folder should be C:\sphinx\sphinxtrain, not C:\sphinx\sphinxtrain\sphinxtrain

      TabError inconsistent use of tabs and spaces in indentation

      To use python3 you can open the sphinxtrain script in a text editor and replace tabs with spaces on the lines around line 112. Alternatively, you can use python2.7.
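Finding the offending lines by eye can be tedious, so here is a small hedged sketch that reports which lines of a file use a tab in their indentation. It is a generic helper written for this thread, not something shipped with sphinxtrain, and the function name `lines_with_tab_indent` is an invention for the example.

```python
def lines_with_tab_indent(source_text):
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    hits = []
    for number, line in enumerate(source_text.splitlines(), start=1):
        # Leading whitespace = everything stripped off the left side.
        indent = line[:len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            hits.append(number)
    return hits


# Tiny made-up sample: line 2 is indented with spaces, line 3 with a tab.
sample = "def main():\n    ret = 0\n\texit(ret)\n"
print(lines_with_tab_indent(sample))  # -> [3]
```

To check a real script, read it first, for example `lines_with_tab_indent(open(path).read())`, and then fix the reported lines by hand in a text editor.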

       
  • Harry Garrison

    Harry Garrison - 2016-09-26

    It was C:\sphinx\sphinxtrain\sphinxtrain and not C:\sphinx\sphinxtrain\ because of how I extracted the packages. Now I fixed that by shortening the directory.

    I tried to play smart and I used an automatic "tabs to spaces" utility. Apparently, it changed the whole thing from tabs to spaces. Now the error message I get is "IndentationError: expected an indented block". Funny thing is, that the error appears to be in line 97 again. How do I know which parts are supposed to be in tabs and which parts should be in spaces? This python 2 sure is confusing.

     
    • Nickolay V. Shmyrev

      In python3 tabs and spaces must not be mixed in indentation, so the simplest fix is to use spaces everywhere. It is recommended to edit by hand rather than with a utility, so that you keep the indentation, which is important for python.
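One possible reason an automatic converter breaks the script is the tab width it assumes. Python 2 treated a tab in indentation as advancing to the next multiple of 8 columns, so a tool that swaps each tab for 4 spaces can silently change which block a line belongs to. The sketch below expands tabs in the indentation only, using 8-column tab stops. The helper name is hypothetical, and whether this matches what went wrong in the case above is an assumption.

```python
def expand_leading_tabs(line, tabsize=8):
    """Expand tabs in the indentation only, leaving the rest of the line alone."""
    body = line.lstrip(" \t")
    indent = line[:len(line) - len(body)]
    # str.expandtabs advances to the next multiple of tabsize columns.
    return indent.expandtabs(tabsize) + body


# Made-up sample: one line indented with 8 spaces, the next with one tab.
mixed = "if ok:\n        first()\n\tsecond()\n"
fixed = "".join(expand_leading_tabs(l) for l in mixed.splitlines(keepends=True))
print(fixed)
```

After the conversion both indented lines start at the same column, so they stay in the same block. Editing by hand, as suggested above, remains the safer route for a short script.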

       
  • Harry Garrison

    Harry Garrison - 2016-09-26

    I made progress. I replaced a couple tabs with spaces. Now I get another error on line 133: "except getopt.GetoptError, err". The comma before the word "err" is indicated as the problem. Is this error happening due to inconsistencies between py2 and py3?

     
    • Nickolay V. Shmyrev

      You can probably remove python 3.4 and install python 2.7

       
      • Nickolay V. Shmyrev

        You can also write

         except getopt.GetoptError as err:
        
         

        Last edit: Nickolay V. Shmyrev 2016-09-26
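For context, `except X, err` is Python 2 only, while `except X as err` works in Python 2.6+ and in Python 3. The sketch below shows the `as` form around a `getopt.getopt` call. The option string `"t:"` and the function name `parse_cli` are assumptions made for this example and are not copied from the sphinxtrain script.

```python
import getopt


def parse_cli(argv):
    """Parse a command line such as ['-t', 'an4', 'setup'] (illustrative)."""
    try:
        # "t:" means option -t takes a value. Assumed for this example only.
        options, positional = getopt.getopt(argv, "t:")
    except getopt.GetoptError as err:  # valid in Python 2.6+ and Python 3
        return "error: " + str(err)
    return dict(options), positional


print(parse_cli(["-t", "an4", "setup"]))
print(parse_cli(["-x"]))  # unknown option, handled by the except clause
```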
  • Harry Garrison

    Harry Garrison - 2016-09-26

    Thank you once again. Your tweaks seem to have fixed the sphinxtrain file issues. Now the error I get is: "Failed to find sphinxtrain binaries. Check your installation". Could it be that I missed installing something or that something was not correctly installed?

     
    • Nickolay V. Shmyrev

      Did you download the precompiled sphinxtrain or what? The binaries must be in C:\sphinx\sphinxtrain\bin\Release\Win32
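A quick way to see whether that folder exists and actually holds executables is sketched below. This is a generic check written for this thread: the helper name `list_executables` is made up, and the sketch does not know which specific binaries sphinxtrain expects, so it only lists whatever `.exe` files are present.

```python
import os


def list_executables(bin_dir):
    """Return sorted .exe file names in bin_dir, or [] if the folder is missing."""
    if not os.path.isdir(bin_dir):
        return []
    return sorted(name for name in os.listdir(bin_dir)
                  if name.lower().endswith(".exe"))


# Path quoted in the reply above; adjust it to your own layout.
found = list_executables(r"C:\sphinx\sphinxtrain\bin\Release\Win32")
print(found if found else "no binaries found in that folder")
```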

       
  • Harry Garrison

    Harry Garrison - 2016-09-26

    The closest I have to that directory is C:\sphinx\sphinxtrain\win32. There are no bin and Release folders anywhere in there.

    In C:\sphinx\sphinxtrain\win32 there are two folders named libs and programs. I downloaded everything from the official Download page: http://cmusphinx.sourceforge.net/wiki/download

     
  • Harry Garrison

    Harry Garrison - 2016-09-26

    Quick update: I downloaded the precompiled version, changed the sphinxtrain file and ran the command from the cmd prompt. It gave me:
    "Sphinxtrain path: C:/..."
    "Sphinxtrain binaries path: C:/..."
    "Setting up the database an4"

    Meanwhile, it created an "etc" folder in the C:\sphinx\an4\etc directory and in there there are two files named feat.params and sphinx_train.cfg. Now I know for a fact that the latter is going to be used for setting the various parameters. But I also know that there should be more data folders in C:\sphinx\an4\etc, namely
    etc
    feat
    logdir
    model_parameters
    model_architecture
    result
    wav

    Where them folders at?

     
    • Nickolay V. Shmyrev

      Meanwhile, it created an "etc" folder in the C:\sphinx\an4\etc directory and in there there are two files named feat.params and sphinx_train.cfg

      If you unpacked an4_sphere.tar.gz properly there should be other files. Also wav subfolder which is inside an4 archive.

      But I also know that there should be more data folders in C:\sphinx\an4\etc, namely

      It is ok, most other folders are created during training.

       
  • Harry Garrison

    Harry Garrison - 2016-09-26

    So the creation of the described folders happens during the training process... good to know. It is still baffling, though, because the tutorial/wiki states that the folders should be created right after the "python ../sphinxtrain/scripts/sphinxtrain -t an4 setup" command.

    Btw, could you elaborate on the "If you unpacked an4_sphere.tar.gz properly there should be other files"? Is there a wrong way to unpack it? I used WinRAR and the following files were in the an4_sphere folder:
    etc
    wav
    LICENSE
    README

     
    • Nickolay V. Shmyrev

      I wrote above:

      Acoustic model folder should be C:\sphinx\an4, not C:\sphinx\an4_sphere\an4.

      When you unpack an4_sphere.tar.gz, you should have C:\sphinx\an4\wav and C:\sphinx\an4\etc with files like an4_train.fileids, an4_test.fileids and so on. sphinxtrain setup should add C:\sphinx\an4\etc\sphinx_train.cfg. After that you can continue with the next steps from the tutorial.
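The layout described above can be checked with a few lines of Python before moving on. This is only a sketch: it tests for the entries named in this reply and nothing else, and the names `EXPECTED` and `missing_entries` are invented for the example.

```python
import os

# Entries named in the reply above, relative to the acoustic model folder.
EXPECTED = [
    "wav",
    os.path.join("etc", "an4_train.fileids"),
    os.path.join("etc", "an4_test.fileids"),
    os.path.join("etc", "sphinx_train.cfg"),  # added by the setup step
]


def missing_entries(model_dir, expected=EXPECTED):
    """Return the expected entries that do not exist under model_dir."""
    return [rel for rel in expected
            if not os.path.exists(os.path.join(model_dir, rel))]


# An empty list means every entry listed above was found.
print(missing_entries(r"C:\sphinx\an4"))
```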

       
  • Harry Garrison

    Harry Garrison - 2016-09-26

    Thanks a lot Nickolay! You have been of great help. Now I will take it from here and see what I can do. If I stumble on any other problems I shall post in this forum again. Thanks for your time and effort!

     
