Quick and dirty brainless Linux step-by-step tutorial for wav to text English conversion

Help
c1672140
2014-12-17
2014-12-21

    c1672140 - 2014-12-17

    I am a student of sociology and my group needs to convert, as part of a group project, a wav file to text. We are not Linux gurus (Kubuntu 14.04) and probably never will be. We are not speech recognition gurus and probably never will be. We just want to convert a file from wav to text quick and dirty.

    We can't find a step-by-step cut-and-paste simple brainless tutorial on how to just install the darn thing, on Linux.

    Here is our best attempt at making that tutorial but it fails, we think, because we can't figure out WHERE to put the English library files, and how then to make the call for the conversion.

    Would you kindly explain our error?

    HOW TO CONVERT A WAV TO TEXT ON LINUX (KUBUNTU 14.04):
    REF: http://sourceforge.net/projects/cmusphinx/

    SUMMARY:
    You apparently need four things:
    1. sphinxbase
    2. pocketsphinx
    3. a language database (English is all we care about)
    4. wav file (e.g., /usr/share/sounds/alsa/Front_Center.wav)

    1. Obtain sphinxbase-0.8 from http://cmusphinx.sourceforge.net/wiki/download/
      Namely: http://sourceforge.net/projects/cmusphinx/files/sphinxbase/0.8
      NOTE: It would be preferable to have a "wget" command inserted here.
      $ tar -xvzf sphinxbase-0.8.tar.gz
      $ mv /tmp/sphinxbase-0.8 /tmp/sphinxbase
      $ cd /tmp/sphinxbase
      $ sudo apt-get install libtool bison   # dependencies that we needed
      $ view README
      $ view INSTALL
      $ ./autogen.sh
      $ ./configure
      $ make clean all       # only if this is not the 1st time
      $ make
      $ make check           # optional
      $ sudo make install
      $ make installcheck    # optional, only works after "make install"

    2. Obtain pocketsphinx from http://sourceforge.net/projects/cmusphinx/files/pocketsphinx/0.8/
      NOTE: It would be preferable to have a "wget" command inserted here.
      $ tar -xvzf pocketsphinx-0.8.tar.gz
      $ mv /tmp/pocketsphinx-0.8 /tmp/pocketsphinx
      $ cd /tmp/pocketsphinx
      $ view README
      $ view INSTALL
      $ ./autogen.sh
      $ ./configure
      $ make clean all       # only if this is not the 1st time
      $ make
      $ make check           # optional
      $ make test
      $ sudo make install
      $ make installcheck    # optional, only works after "make install"
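
Both NOTE lines above ask for a "wget" command. Here is a minimal sketch, assuming SourceForge serves the release tarballs under .../files/NAME/VERSION/NAME-VERSION.tar.gz/download. That URL layout is an assumption built from the links above and is not confirmed here. The names sf_url and fetch are made-up helpers. By default the script only prints the wget commands; run it with DRY_RUN=0 to actually download into /tmp.

```shell
#!/bin/sh
# Sketch: build download URLs for the two 0.8 tarballs and print (or run)
# the matching wget commands.
# ASSUMPTION: the SourceForge URL layout used in sf_url is a guess.
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = really download

# sf_url NAME VERSION -> prints the assumed URL of NAME-VERSION.tar.gz
sf_url() {
    printf 'http://sourceforge.net/projects/cmusphinx/files/%s/%s/%s-%s.tar.gz/download\n' \
        "$1" "$2" "$1" "$2"
}

# fetch NAME VERSION -> saves the tarball in /tmp, or prints the command in dry-run mode
fetch() {
    url=$(sf_url "$1" "$2")
    out="/tmp/$1-$2.tar.gz"
    if [ "$DRY_RUN" = "1" ]; then
        echo "wget -O $out $url"
    else
        wget -O "$out" "$url"
    fi
}

fetch sphinxbase 0.8
fetch pocketsphinx 0.8
```

If the printed URLs open in a browser, the same script with DRY_RUN=0 should leave both tarballs in /tmp, ready for the tar commands in steps 1 and 2.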

    3. Obtain US English generic acoustic models:
      http://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/US%20English%20Generic%20Acoustic%20Model/

      NOTE: One or more of these English databases seems to be needed:
      $ tar -xvzf /tmp/en-us.tar.gz
      $ tar -xvzf /tmp/en-us-8khz.tar.gz
      $ tar -xvzf /tmp/en-us-semi.tar.gz
      $ tar -xvzf /tmp/en-us-semi-full.tar.gz
      NOTE: We just want a brainless pick of what will work on a "hello world" style file.

    4. Convert a WAV file to text on Linux:
      Our install put its files under /usr/local/share/pocketsphinx, but the compiled binary seems to look under /usr/share/pocketsphinx, so we added a symlink (which is odd, but we're ok with it since it's a simple step):
      $ sudo ln -s /usr/local/share/pocketsphinx /usr/share/pocketsphinx

    Here are some universal "hello world" style test files:
    $ cp /usr/share/sounds/alsa/Front_Center.wav file1.wav
    $ cp /usr/share/sounds/alsa/Front_Right.wav file2.wav
    $ cp /usr/share/sounds/alsa/Rear_Right.wav file3.wav
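
One thing that may be worth checking on these test files: pocketsphinx_continuous assumes 16000 Hz audio unless told otherwise (its -samprate option), and the ALSA sample files may be recorded at a different rate. The sketch below reads the sample rate straight from the WAV header. It assumes a canonical PCM header (sample rate stored in bytes 24-27) and a little-endian machine such as x86. The name wav_rate is a made-up helper, and the printed sox hint assumes sox is installed (sudo apt-get install sox).

```shell
#!/bin/sh
# wav_rate FILE -> prints the sample rate stored in a canonical WAV header.
# ASSUMPTION: plain PCM header, rate in bytes 24-27, little-endian host (x86).
wav_rate() {
    od -An -t u4 -j 24 -N 4 "$1" | tr -d ' \n'
    echo
}

# Report each file named on the command line; print a sox resample hint
# (not executed) when the rate is not 16000 Hz.
for f in "$@"; do
    rate=$(wav_rate "$f")
    if [ "$rate" = "16000" ]; then
        echo "$f: $rate Hz (ok)"
    else
        echo "$f: $rate Hz, try: sox $f -r 16000 -c 1 -b 16 ${f%.wav}-16k.wav"
    fi
done
```

Saved as, say, wavrate.sh, it would be run as: sh wavrate.sh file1.wav file2.wav file3.wav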

    This is the best we can come up with to date for a basic dumb first-time-ever running of the program on the simplest of all test files:
    $ pocketsphinx_continuous -infile file1.wav -hmm en-us -lm en-us.lm.dmp 2> pocketsphinx.log

    Unfortunately, that command fails every single time. We think the main reason is that we have no guidance on where to put the thing called "en-us" (currently in /tmp/) or where to find the thing called "en-us.lm.dmp" (which doesn't seem to exist yet).

    We can't find a decent tutorial that just says what to do (nothing else is desired or needed). No choices. Just do this. Do that. And it will work, is what we want. (We are not gurus and don't even want to be gurus. We just want it to work.)

    From the man pages:
    Note: The -hmm directory and -dict file arguments are always required.
    Note: Either -lm or -fsg is required, depending on whether you are using a statistical language model or a finite-state grammar.
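
Going by the two man page notes above, the call in this post has no -dict argument, and its -hmm and -lm values are bare names instead of paths. Below is a sketch of what a complete call might look like. The three model names are assumptions: they are what a default pocketsphinx 0.8 install is believed to place under /usr/local/share/pocketsphinx/model, so confirm them with ls -R before relying on them. The name ps_cmd is a made-up helper, and it only prints the command line; it does not run the recognizer.

```shell
#!/bin/sh
# Sketch: assemble a pocketsphinx_continuous call with -hmm, -lm and -dict.
# ASSUMPTION: the model file names below are believed to ship with
# pocketsphinx 0.8; verify with: ls -R "$MODELDIR"
MODELDIR="${MODELDIR:-/usr/local/share/pocketsphinx/model}"

# ps_cmd WAVFILE -> prints the full command line (does not execute it)
ps_cmd() {
    printf 'pocketsphinx_continuous -infile %s -hmm %s -lm %s -dict %s\n' \
        "$1" \
        "$MODELDIR/hmm/en_US/hub4wsj_sc_8k" \
        "$MODELDIR/lm/en_US/hub4.5000.DMP" \
        "$MODELDIR/lm/en_US/cmu07a.dic"
}

ps_cmd file1.wav
```

Printing instead of running keeps the sketch harmless if the paths are wrong. Once ls shows that all three paths exist, the printed line can be pasted into the terminal, with 2> pocketsphinx.log appended as in the original attempt.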

    What step are we doing wrong for a quick and dirty brainless Linux Kubuntu 14.04 installation and "hello world" style simplest-possible test run?

     

    Last edit: c1672140 2014-12-17
