PocketSphinx TIDigits

2009-11-20
2012-09-22
  • Jan Markowski

    Jan Markowski - 2009-11-20

    Hello :-)!

    I'd like to create application able to recognize about fifteen words and run it on mobile phone. I create acoustic model with SphinxTrain. But I've got some difficulties with PocketSphinx. First of all I'd like to know which files I need to edit in order to create my application analogically to already existing ones. I see much of redundant data and this is the reason why I ask above question. Which of the following directories contain files which I need to edit: doc, bin, src/programs, /usr/local/share or /usr/local/bin? In order to run application on mobile phone, I need .jar and .jad files. But I don't see any way to create those .jar files with any of parts of PocketSphinx. Didn't I notice something what is capable of creating .jar file?

    I think it is much simpler to create application for Sphinx4. I simply copied
    S:\tutorial\sphinx4-1.0beta3-src\src\apps\edu\cmu\sphinx\demo\helloworld
    directory to ...\my_app directory and I edit files inside it. For Sphinx4
    there is such a tutorial: http://www.speech.cs.cmu.edu/sphinx/tutorial.html . Is there similar one for
    PocketSphinx?
    I show below my attempt to solve my problem and some questions
    connected with it. I guess my approach isn't good so you can simply give me
    some general suggestions, rather than answer those questions, how to create /
    build / run applications for PocketSphinx.


    Those are my conclusions and questions after analysing pocketsphinx-0.5.1
    directory. I think I need doc/pocketsphinx_mdef_convert.1 in order to convert
    definitions of acoustic model from SphinxTrain which I created to change them
    into PocketSphinx version, am I right? I don't know why those three projects
    (batch, continuous, mdef_convert) are in both bin and doc directories and two
    projects (tidigits, wsj) are only in doc. I didn't see any html or txt
    documentation in doc directory. To which of those four (batch, continuous,
    tidigits, wsj) should I use analogy to create my application? Why some files
    are in /usr/local/bin rather than all in tutorial/pocketsphinx-0.5.1? Why is
    directory model in two places (/usr/local/share/pocketsphinx/model and
    tutorial/pocketsphinx-0.5.1) with the same content? Am I right that running
    application is simply entering doc directory in terminal and running
    pocketsphinx_tidigits? What are all of those things which it does? Running
    doc/pocketsphinx_batch doesn't do anything because I didn't give any
    parameters. But running doc/pocketsphinx_tidigits does some unclear to me
    things and later it enters the state which I can finish by ctrl+c.

    I guess, during the process of creating my own application, I need to be
    focused on model directory and its content. Do I need to replace cmudict.0.6d
    with my own dictionary or rather tidigits.dic? Do I need to copy to
    tidigits.lm that file which is generated from my dictionary with
    http://www.speech.cs.cmu.edu/tools/lmtool.html ? Why does tidigits.dic has
    weird entries like "TWO T_two OO_two" rather
    than "TWO T OO"? Do I need to edit pocketsphinx.cfg and give here my own
    paths? Why does it use turtle as acoustic model? What is this turtle? Do I
    need to edit pocketsphinx_tidigits file in scripts directory? (I see in this
    file there are also paths to dictionary and language model). Do I need to run
    setup_tutorial.pl?

    I also examined src/programs directory. In the file continuous.c I see
    information that it uses fbs8 audio library. Why isn't it rather tidigits or
    wsj? Do I need to replace fbs8 with my wav files from which I created acoustic
    model in SphinxTrain? Do I need to edit manually Makefile which is in
    src/programs? What for do I need pocketsphinx_batch file? I guess I can ignore
    test/data/tidigits directory because it contains only files used to examine
    performance of application. Why does tidigits contain only mfc files and
    test/data/wsj not only mfc files, but also wav files? Why do files
    test-tidigits-fsg.match and tidigits.lsn have got the same content?

    Thanks for your help in advance!

    Greetings!

     
  • Anonymous

    Anonymous - 2009-11-20

    Hello,

    So first of all PocketSphinx is a C application, not Java. .jar are "compiled"
    java executables, there is of course no way of running a PS based app on your
    mobile. Look at Sphinx4 which is written in Java.

    Afterwards, if you want to train your own model, simply follow this tutorial,
    you will be able to generate a model for your application :
    http://www.speech.cs.cmu.edu/sphinx/tutorial.html#introduction

    You will basically need 3 things to program your app:

    1) Your Acoustic Model generated with SphinxTrain or using existing ones like
    the WSJ one.

    2) A language model, which you can generate with the lmtool at
    http://www.speech.cs.cmu.edu/tools/lmtool.html or write manually as a
    Finite State Grammar (FSG).

    3) The dictionary of your app, also generated by the online LM tool.
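
    For a vocabulary of about fifteen isolated words, the hand-written FSG
    option in point 2 can be generated rather than typed. The sketch below is
    only an illustration: the FSG_BEGIN / TRANSITION layout is written from
    memory of the Sphinx text FSG format and should be checked against the
    Sphinx documentation, and the function name, grammar name and word list
    are made up for the example.

```python
def build_command_fsg(words, name="commands"):
    """Return FSG text that accepts exactly one word from `words`.

    Assumption: two states (0 = start, 1 = final) and one transition per
    word, each with the same probability.
    """
    if not words:
        raise ValueError("need at least one word")
    prob = 1.0 / len(words)
    lines = [
        f"FSG_BEGIN {name}",
        "NUM_STATES 2",
        "START_STATE 0",
        "FINAL_STATE 1",
    ]
    for word in words:
        # TRANSITION <from-state> <to-state> <probability> <word>
        lines.append(f"TRANSITION 0 1 {prob:.6f} {word}")
    lines.append("FSG_END")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical vocabulary; every word must also appear in the dictionary.
    print(build_command_fsg(["JEDEN", "DWA", "TRZY"]), end="")
```

    Every word used in the grammar still needs an entry in the dictionary
    from point 3.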

    Boris.

     
  • Jan Markowski

    Jan Markowski - 2009-11-20

    Thank you for your answer!

    More or less I know what to do with Sphinx4 and SphinxTrain. I followed that
    tutorial http://www.speech.cs.cmu.edu/sphinx/tutorial.html#introduction some time ago.
    What I don't know is how to move my project from Sphinx4 to PocketSphinx so
    that I can run it on mobile phone.

    So first of all PocketSphinx is a C application, not Java. .jar are
    "compiled" java executables

    OK, I forgot about such a simple thing that PocketSphinx is written in C, not
    in Java :-).

    there is of course no way of running a PS based app on your mobile. Look at
    Sphinx4 which is written in Java.

    Are you really sure that I cannot run PocketSphinx on mobile phone? Some time
    ago I asked here
    https://sourceforge.net/projects/cmusphinx/forums/forum/382337/topic/3376756
    and Nshmyrev answered me "Pocketsphinx is used successfully
    on Symbian with minimal effort".

    There is also other way: running Sphinx4 on a server. However this option is
    much worse than running PocketSphinx on mobile
    phone. If I'd like to connect from mobile phone to server, I need to use Skype
    and then redirect speech from Skype to Sphinx4 on server, what is difficult
    thing. (My topic here: http://forum.skype.com/index.php?showtopic=464711). But, let me say it
    again, I think PocketSphinx on mobile phone is much better option than speech
    recognition on server.

    Summing up, how to edit / build / run applications from PocketSphinx (look at
    two first paragraphs in my first post)?

    Greetings!

     
  • Nickolay V. Shmyrev

    Damn, it's gonna be hard. Again you are asking 40 questions in one post. Also
    you are asking them on multiple forums at once. Could you try to generate less
    noise?

    One day one post one question. Is it ok? Anyhow, I'll try to answer to let us
    live in silence for the future.

    I'd like to create application able to recognize about fifteen words and run
    it on mobile phone.

    We know that already

    First of all I'd like to know which files I need to edit in order to create
    my application analogically to already existing ones.

    You need to edit the source code of your application. It seems to me you don't
    quite understand what programming is. It's when you write programs. You don't
    edit existing files. You sit down and write the code that does the thing.

    Which of the following directories contain files which I need to edit: doc,
    bin, src/programs, /usr/local/share or /usr/local/bin?

    None, you should not edit any pocketsphinx files. You are using pocketsphinx
    library as a library.

    In order to run application on mobile phone, I need .jar and .jad files.

    Not exactly. Many phones support either Java applications or Native
    applications. For example for Symbian platform you can create java-based
    application in jar file or native application in sis file with Nokia's
    development toolkit.

    http://eclipseme.org/

    http://www.forum.nokia.com/Tools_Docs_and_Code/Tools/

    Both require a lot of learning.

    For Sphinx4 there is such a tutorial:
    http://www.speech.cs.cmu.edu/sphinx/tutorial.html.

    It's not a tutorial for sphinx4. It's a tutorial for sphinxtrain.

    Is there similar one for PocketSphinx?

    No

    I show below my attempt to solve my problem and some questions connected
    with it. I guess my approach isn't good so you can simply give me some general
    suggestions, rather than answer those questions, how to create / build / run
    applications for PocketSphinx.

    The general suggestion is to learn how to create applications at all.
    The one who gave you this task certainly overestimated your abilities. Tell
    him my opinion that it's not so good to do this.
    A book like this one (also available in Polish) could help you:

    http://ladweb.net/

    I think I need doc/pocketsphinx_mdef_convert.1 in order to convert definitions
    of acoustic model from SphinxTrain which I created to change them into
    PocketSphinx version, am I right?

    No, this binary just compresses the mdef file. You can use the mdef file
    without compression.

    I don't know why those three projects (batch, continuous, mdef_convert) are
    in both bin and doc directories and two projects (tidigits, wsj) are only in
    doc.

    First of all, they aren't projects. Applications like pocketsphinx_batch
    are in the bin folder. The manuals, in man format
    (http://www.manpagez.com/man/n/format/), are in the doc folder. You can
    read man documentation with the man command.

    I didn't see any html or txt documentation in doc directory.

    Not every documentation is in html. Also I suppose you were able to build
    pocketsphinx, so you see javadoc files in doc/html folder.

    To which of those four (batch, continuous, tidigits, wsj) should I use
    analogy to create my application?

    pocketsphinx_continuous.

    Why some files are in /usr/local/bin rather than all in
    tutorial/pocketsphinx-0.5.1?

    Every Linux program has a source folder where it is built and an installation
    folder where it is installed and used. /usr/local/bin is the installation
    folder of the pocketsphinx binaries. Many programs are located in
    /usr/local/bin and in /usr/bin. It's a standard filesystem layout for Linux.

    Why is directory model in two places (/usr/local/share/pocketsphinx/model
    and tutorial/pocketsphinx-0.5.1) with the same content?

    The same as above

    Am I right that running application is simply entering doc directory in
    terminal and running pocketsphinx_tidigits?

    Yes

    What are all of those things which it does?

    It recognizes digits you speak in the microphone

    But running doc/pocketsphinx_tidigits does some unclear to me things and
    later it enters the state which I can finish by ctrl+c.

    It tells you you need to speak English digits. Did you try that?

    Do I need to replace cmudict.0.6d with my own dictionary or rather
    tidigits.dic?

    You need your own dictionary

    Do I need to copy to tidigits.lm that file which is generated from my
    dictionary with http://www.speech.cs.cmu.edu/tools/lmtool.html

    You need generated language model

    Why does tidigits.dic has weird entries like "TWO T_two OO_two" rather than
    "TWO T OO"?

    This is the dictionary for small vocabulary recognition. Each word has unique
    phones. For large vocabulary recognition phones are shared between words.
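
    To make the difference concrete, here is a small sketch that turns a
    shared-phone entry such as "TWO T OO" into the word-specific form
    "TWO T_two OO_two" quoted from tidigits.dic. The function names are made
    up for illustration, and the lowercase word suffix is simply copied from
    that one quoted entry.

```python
def make_word_dependent(entries):
    """Give every word its own phones.

    `entries` maps a word to its list of shared phones,
    for example {"TWO": ["T", "OO"]}.
    """
    return {
        word: [f"{phone}_{word.lower()}" for phone in phones]
        for word, phones in entries.items()
    }


def format_dictionary(entries):
    """Render entries as dictionary lines: WORD PHONE PHONE ..."""
    rows = (f"{word} {' '.join(phones)}" for word, phones in sorted(entries.items()))
    return "\n".join(rows) + "\n"


if __name__ == "__main__":
    shared = {"TWO": ["T", "OO"]}
    print(format_dictionary(make_word_dependent(shared)), end="")
```

    Because no phone is shared between words after this step, the acoustic
    model has to be trained with the same word-specific phone names.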

    Do I need to edit pocketsphinx.cfg and give here my own paths?

    Yes
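
    For a quick test on a PC, the same three paths (acoustic model,
    language model, dictionary) can also be given to pocketsphinx_continuous
    on the command line with its -hmm, -lm and -dict options. The sketch below
    only assembles such a command; the helper name and all paths are
    placeholders, not files shipped with pocketsphinx.

```python
import shlex


def continuous_command(hmm_dir, lm_path, dict_path, binary="pocketsphinx_continuous"):
    """Build the argument list for a pocketsphinx_continuous run."""
    return [binary, "-hmm", hmm_dir, "-lm", lm_path, "-dict", dict_path]


if __name__ == "__main__":
    # Placeholder paths: point these at your own trained model files.
    cmd = continuous_command("model/hmm/my_model", "model/lm/my_app.lm", "model/lm/my_app.dic")
    # Print the command for copy-and-paste; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```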

    Why does it use turtle as acoustic model?

    turtle is a language model. It uses turtle language model just for
    demonstration

    What is this turtle?

    It's a language model that recognizes turtle commands and allows you to
    manage a little turtle. The sample sentences for management are in the
    file turtle.sent.

    Do I need to edit pocketsphinx_tidigits file in scripts directory?

    No, because mobile phone has no shell.

    Do I need to run setup_tutorial.pl?

    Only for testing an acoustic model.

    I also examined src/programs directory. In the file continuous.c I see
    information that it uses fbs8 audio library.

    No it doesn't use any fbs8 library. The comment you saw there is obsolete.

    Why isn't it rather tidigits or wsj?

    Because tidigits and wsj are names of the acoustic model and not the names of
    the library

    Do I need to replace fbs8 with my wav files from which I created acoustic
    model in SphinxTrain?

    You can't replace something that doesn't exist

    Do I need to edit manually Makefile which is in src/programs?

    No

    What for do I need pocketsphinx_batch file?

    You don't need it.

    I guess I can ignore test/data/tidigits directory because it contains only
    files used to examine performance of application.

    Yes

    Why does tidigits contain only mfc files and test/data/wsj not only mfc
    files, but also wav files?

    Because of the author preference.

    Why do files test-tidigits-fsg.match and tidigits.lsn have got the same
    content?

    Accidentally

    Thanks for your help in advance!

    You owe me beer at least.

     
  • Jan Markowski

    Jan Markowski - 2009-11-21

    Thanks!

    It looks like best choice is to have mobile phone with Symbian. I already
    tried creating application for mobile phone in Java ME with Wireless Toolkit.
    The program uses httpconnection POST method to send data from mobile phone to
    server, it also requires e.g. Tomcat to run on server in order to receive
    data. So if I can create application with PocketSphinx on mobile, I also can
    use the same httpconnection to send result of speech recognition to server. I
    guess I shouldn't use Wireless Toolkit because its applications are based on
    CLDC/MIDP so those are Java applications. And for PocketSphinx I need C. So I
    can use link which you gave me
    http://www.forum.nokia.com/Tools_Docs_and_Code/Tools/ and check
    "Runtimes" to learn how to create C++ applications for Symbian. I see there
    are both C++ (Symbian C++ and Open C/C++ Plug-in) and C (Maemo 5 SDK). It
    looks like I need some more guidelines about using PocketSphinx with Symbian.

    About using the application on computer it is simply running
    pocketsphinx_continuous. I still don't see the way of running it on mobile.
    The directory of pocketsphinx is about 30MB and not every file of this
    directory would be needed on mobile. And the installation process is different
    than for PCs. So my first aim is to find how to install and run this
    pocketsphinx_continuous on mobile. You said that "Pocketsphinx is used
    successfully on Symbian with minimal effort" so I guess some people have
    already created applications very similar to what I need. If I just could
    contact them.

    I need my own dictionary and language model. It is common thing for both
    Sphinx4 and PocketSphinx so there shouldn't be problems with creating .dic and
    .lm files. At first I thought I can simply create dictionary like "WORD WORD"
    and treat each of words as phoneme (even if it is not). Nsh suggested me that
    it is not good approach because of different length of words. So I created
    dictionary like this "TRZY T SZ Y". Now it looks like even this approach is
    not good and I should have "TRZY T_trzy SZ_trzy Y_trzy". From this dictionary I
    can create .phone file with entries like "T_trzy SZ_trzy Y_trzy" (of course each
    in new line) and many other, all phonemes which are present in my dictionary,
    with some redundant phonemes (e.g. J_jeden and J_piec - the same phoneme for
    different word).
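
    The step from dictionary to .phone file described above can be sketched
    as follows. This is only an illustration: the function name is made up,
    and adding a SIL (silence) entry is an assumption about what SphinxTrain
    expects in the phone list, so check it against the SphinxTrain tutorial.

```python
def phone_list(dictionary_text, extra=("SIL",)):
    """Collect the unique phones used in a dictionary, sorted, one per entry.

    Each dictionary line is expected to look like: WORD PHONE PHONE ...
    """
    phones = set(extra)
    for line in dictionary_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        phones.update(fields[1:])
    return sorted(phones)


if __name__ == "__main__":
    sample = "TRZY T_trzy SZ_trzy Y_trzy\n"
    # One phone per line, as in the .phone file.
    print("\n".join(phone_list(sample)))
```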

    OK, I posted my answer here and I didn't do it in the other topic on
    sourceforge and voxforge. Here
    http://www.voxforge.org/home/forums/message-boards/acoustic-model-discussions/creating-new-model-with-the-use-of-germany-voxforge-model/17
    I asked about transcription files. But it is less important
    issue than installing and running pocketsphinx_continuous on mobile.

    You say that programming is about writing programs rather than about editing
    already existing ones :-). That's right but how would I know, e.g. during
    writing application for Sphinx4 that I need to (like here
    S:\tutorial\sphinx4-1.0beta3-src\src\apps\edu\cmu\sphinx\demo\helloworld\HelloWorld.java)
    start
    manager, recognizer and microphone and do it in the way how they did it?
    Perhaps I could come to the conclusion about proper procedure by analysing all
    the tutorials, javadocs and so on but it would take much more time and would
    be more difficult. (About Javadoc I guess I lack experience in using
    documentation of source code). However it looks much simpler to simply check
    all the source files of already existing demos and come to the same
    conclusions much faster just by analysing the code and using analogy. Simply
    1. knowing all the files from which the solution is created, 2. the way of
    creating binaries from those files, 3. the way of running those on desired
    device (PC or mobile).

    I also thought about other approach, i.e. running Sphinx4 on server and
    establishing voice connection between mobile phone and server. I thought it
    can be done in at least two ways - with the use of Digium card on server or
    using Skype. The first approach is bad because those cards are not so cheap.
    And that second causes one difficulty with redirecting of speech from Skype on
    server to Sphinx4 on server. The latter can be solved by using Office
    Communication Server 2007 but again it is not freeware application and I
    think it can be done somehow in easier way. Summing up, if I'd like to use
    this approach about communication with server, I need to find the way of
    redirecting speech between Skype and Sphinx4. And, of course, PocketSphinx on
    mobile is even better than Sphinx4 on server because it doesn't require paying
    for access to internet from mobile phone for Skype.

    You say that guy who gave me this task overestimated my abilities. It looks
    like I also overestimated my abilities by agreeing to finish the task. However
    I think it still should be possible for me to finish this project. I just
    wonder what kind of abilities I should improve at first in order to be able to
    say in the future that I am programmer for mobile phones and other embedded
    devices because certainly I cannot say it now. Perhaps I need to read some
    kind of book about 1. programming for Symbian, 2. using Eclipse with EclipseME
    and using Wireless Toolkit, 3. general programming for mobile phones, 4.
    development of applications for Linux. I guess betterworldbooks.com is good
    choice to buy those.

    Greetings!

     
  • Jan Markowski

    Jan Markowski - 2009-11-23

    Let me ask two questions.

    First. You said that "Pocketsphinx is used successfully on Symbian with
    minimal effort". Does it mean that I need to make "porting" to
    Symbian/Android?

    Second. Let's say I've got working application in Sphinx4. How to move it to
    PocketSphinx?

    Greetings!

     
  • eliasmajic

    eliasmajic - 2009-11-23

    First. You said that "Pocketsphinx is used successfully on Symbian with minimal effort". Does it mean that I need to make "porting" to Symbian/Android?

    I am not totally sure what you're asking, but pocketsphinx can be used on
    Android; no clue about Symbian. You need to look into the Symbian/Android
    frameworks.

    Second. Let's say I've got working application in Sphinx4. How to move it to
    PocketSphinx?

    They're different engines. They use the same models but they are different.
    Even different languages. A Sphinx4 app won't work in PocketSphinx. You will
    need to rewrite it.

    Greetings!

     
  • eliasmajic

    eliasmajic - 2009-11-23

    Oops, sorry, I did not highlight your second question and did not cut the
    Greetings! part out. Since sourceforge is a POS I can't edit it.

     
  • Nickolay V. Shmyrev

    About ease of use on Symbian: the Symbian part was committed to
    pocketsphinx yesterday. As you can see, it's just a few required modules
    for the Nokia SDK:

    http://cmusphinx.svn.sourceforge.net/viewvc/cmusphinx?view=rev&revision=9470

    Thanks to Silvio Moioli

     
