pocketsphinx questions

Help
Agi Lim
2013-10-07
2014-02-26
  • Agi Lim

    Agi Lim - 2013-10-07

    Hi,

    First of all, big thanks to the CMU Sphinx team for making this great open source software.

    Background:
    I am trying to write an Android app that runs continuous speech recognition in the background (as long as the app is running). The app is voice command/control oriented (not dictation), so the vocabulary in the dictionary will be limited. So far, I have downloaded pocketsphinxandroiddemo and got it running. I also used lmtool to upload my words, and with the resulting LM, accuracy and speed are very good. I also tried pocketsphinx_continuous.exe with my LM and it works great.

    Next, I am trying to tackle the continuous part in Android. Here are some questions I have:

    I am thinking of using SWIG to port the functions ad_open_dev, cont_ad_init, ad_start_rec, cont_ad_calib, cont_ad_read, etc. to Java and use them from Android. There are two questions here: Will this approach even work, i.e. will the ported versions of ad_open_dev and ad_start_rec work in Android, or do we have to use the AudioRecord API to record in Android? Will this approach give us better battery consumption than using an Android API like AudioRecord or SpeechRecognizer?

    To improve accuracy, I want to change the LM on the fly during runtime, depending on where the user is in the app. Is it possible? If so, can we reuse the same Config object and just change the -lm and -dict? Or do we need to create a new Config and Decoder object every time we switch context?

    Thanks!

     
  • Nickolay V. Shmyrev

    First of all, big thanks to the CMU Sphinx team for making this great open source software.

    You are welcome

    Will this approach even work, i.e. will the ported version of ad_open_dev and ad_start_rec work in Android or we have to use AudioRecord API to record in Android?

    Yes, sure. We are actually working on voice activity detection ourselves, but this project is not ready yet. It will be available in 2 weeks.

    Will this approach give us better battery consumption than using Android API like AudioRecord or SpeechRecognizer?

    Yes.

    To improve accuracy, I want to change the LM on the fly during runtime, depending on where the user is in the app. Is it possible?

    Yes, see ps_update_lmset function in API reference.

    If so, can we reuse the same Config object and just change the -lm and -dict? Or do we need to create new Config and Decoder object every time we context switch.

    You can use the same decoder.

     
    • Agi Lim

      Agi Lim - 2013-10-07

      Thank you Nickolay for the quick reply. Your responses have been very encouraging.

      That is very exciting news. I am interested to learn about the voice activity detection project and its approach. Is it for Android, and is it going to be open source as well? If so, it will be one of the most anticipated features that many people have been looking for. Please keep us posted on it!

      Thanks!

       
  • Agi Lim

    Agi Lim - 2013-10-08

    Hello Nickolay,

    For experimentation, I added ad_open_dev to pocketsphinx.i as follows (I tried to follow the same pattern as what is there for Decoder, Config, etc.).
    ....
    typedef struct ad_rec_t AudioRecorder;
    ....
    typedef struct ad_rec_t {
    } AudioRecorder;
    ....
    %extend AudioRecorder {
        AudioRecorder(char const *audio_device_name) {
            AudioRecorder *ad = ad_open_dev(audio_device_name, (int32) 8000.0);
            return ad;
        }
    };

    I ran swig.exe and it generated the JNI wrapper and the Java class, which is good.

    Then I changed the RecognizerTask.java constructor to add a simple call that creates a new AudioRecorder object.

    When I ran a project clean, it gave the following warning during compilation of pocketsphinx_wrap.c.

    D:/CMUSphinx/PocketSphinxAndroidDemo//jni/pocketsphinx_wrap.c: In function 'new_AudioRecorder':
    D:/CMUSphinx/PocketSphinxAndroidDemo//jni/pocketsphinx_wrap.c:907:23: warning: initialization from incompatible pointer type [enabled by default]

    Do you see any issue with this warning?
    I am not sure why it did not complain about this for the Decoder, which came from the base code, because it looks the same.

    Next, ndk-build failed with this error.

    D:/android-ndk-r9/toolchains/arm-linux-androideabi-4.6/prebuilt/windows-x86_64/bin/../lib/gcc/arm-linux-androideabi/4.6/../../../../arm-linux-androideabi/bin/ld.exe: D:/CMUSphinx/PocketSphinxAndroidDemo//obj/local/armeabi/objs/pocketsphinx_jni/pocketsphinx_wrap.o: in function Java_edu_cmu_pocketsphinx_pocketsphinxJNI_new_1AudioRecorder:D:/CMUSphinx/PocketSphinxAndroidDemo//jni/pocketsphinx_wrap.c:907: error: undefined reference to 'ad_open_dev'

    Can you please help point out what the issue is with this link error and how to solve it?

    Thanks!

     
  • Nickolay V. Shmyrev

    warning: initialization from incompatible pointer type [enabled by default]

    This warning means you didn't include the header files which declare the types used (ad.h and cont_ad.h). This is done in the header section of the SWIG file.

    Undefined reference to 'ad_open_dev'

    This error means you didn't link to the object file containing the corresponding function (ad.o created from ad.c). This can be done in the Makefile.

     
  • Agi Lim

    Agi Lim - 2013-10-08

    Thanks Nickolay for the hints. I need more clarification, please.

    When you say swig file header, do you mean pocketsphinx.i? I have added these after the #include <sphinxbase\err.h>, but still got the warning.
    #include <sphinxbase/ad.h>
    #include <sphinxbase/cont_ad.h>

    I am also not able to find either ad.c or ad.o anywhere in the code. There is only ad.h under sphinxbase\include\sphinxbase. When I do a global search for ad_open_dev, I find its definition in rec_win32.c, but the comment in the code says it is a dummy function. Is this what we should use?

    /* FIXME: Dummy function, doesn't actually use dev. */
    ad_rec_t *
    ad_open_dev(const char *dev, int32 sps)
    {
        return (ad_open_sps_bufsize(sps, WI_BUFSIZE * DEFAULT_N_WI_BUF * 1000 / sps));
    }

    Thank you.

     
  • Agi Lim

    Agi Lim - 2013-10-09

    Hello, I have some updates.

    I made a few changes in Android.mk to compile ad_base.c and cont_ad_base.c and was finally able to get past the previous link error. But when I tried to call the JNI functions from Android, the calls to audio recorder related functions such as ad_start_rec and cont_ad_calib returned -1.

    This makes me think that maybe the JNI call for ad_open_dev is not working in Android. So, just to see whether the continuous recognize_from_microphone function works in Android, I used include $(BUILD_EXECUTABLE) in Android.mk to build the continuous executable, pushed it to my rooted HTC One phone and used adb shell to run it.

    This is the error I got:
    ....
    ....
    INFO: ngram_search_fwdtree.c(186): Creating search tree
    INFO: ngram_search_fwdtree.c(191): before: 0 root, 0 non-root channels, 17 single-phone words
    INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 248
    INFO: ngram_search_fwdtree.c(338): after: 57 root, 120 non-root channels, 16 single-phone words
    INFO: ngram_search_fwdflat.c(156): fwdflat: min_ef_width = 4, max_sf_win = 25
    INFO: continuous.c(380): ./continuous COMPILED on: Oct 8 2013, AT: 12:29:58

    Recognizing from mic...
    A/D library not implemented
    FATAL_ERROR: "continuous.c", line 246: Failed to open audio device

    Line 246 in my continuous.c refers to ad_open_dev, so I believe this does not work in Android.

    Question to the Sphinx team: Is there a plan to provide a native A/D library for Android?
    If not, what would be your recommendation for a low-power audio recorder API, which we need in order to develop a continuous speech recognition app?

    Thanks!

     
  • Nickolay V. Shmyrev

    This makes me think that maybe the JNI call for ad_open_dev is not working in Android.

    Right, because you didn't implement AD itself. There is actually no easy way to do it. What we are doing is using Java to input sound and then just adding a voice activity detector on top of it.

    Is there a plan to provide native A/D library for Android?

    I wrote to you about that in my first answer: the work on voice activity detection is ongoing.
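The approach described above (capture audio outside the A/D library, then run a voice activity detector over the buffers) can be illustrated with a minimal, self-contained sketch in C. This is not the CMU Sphinx detector, which was still unreleased at the time of this thread; it is a plain energy-threshold check, and the names `simple_vad_t`, `frame_level`, `simple_vad_calibrate` and `simple_vad_is_speech` are hypothetical helpers invented for this example. It assumes 16-bit PCM frames arrive from some external source, for instance buffers recorded in Java and handed to native code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper type for this sketch; not part of sphinxbase. */
typedef struct {
    double noise_floor; /* running estimate of the background level */
    double ratio;       /* speech when level > ratio * noise_floor  */
} simple_vad_t;

/* Mean absolute amplitude of one frame of 16-bit PCM. */
double frame_level(const int16_t *pcm, size_t n)
{
    assert(pcm != NULL && n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += abs((int)pcm[i]);
    return sum / (double)n;
}

/* Estimate the noise floor from a frame assumed to hold only background noise. */
void simple_vad_calibrate(simple_vad_t *vad, const int16_t *pcm, size_t n,
                          double ratio)
{
    vad->noise_floor = frame_level(pcm, n);
    if (vad->noise_floor < 1.0)
        vad->noise_floor = 1.0; /* avoid a zero threshold on digital silence */
    vad->ratio = ratio;
}

/* Returns 1 for a speech frame, 0 otherwise.
 * Non-speech frames slowly update the noise floor estimate. */
int simple_vad_is_speech(simple_vad_t *vad, const int16_t *pcm, size_t n)
{
    double level = frame_level(pcm, n);
    if (level > vad->ratio * vad->noise_floor)
        return 1;
    vad->noise_floor = 0.95 * vad->noise_floor + 0.05 * level;
    if (vad->noise_floor < 1.0)
        vad->noise_floor = 1.0;
    return 0;
}
```

The caller would calibrate once on a frame of ambient noise, then pass each incoming frame through `simple_vad_is_speech` and forward only the frames that return 1. The ratio and the smoothing constants are arbitrary illustrative values and would need tuning on real recordings.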

     
  • mondhs

    mondhs - 2014-02-24

    Nickolay,

    I am not sure I understand your statement correctly: "I wrote you about that in a first answer, the work on voice activity detection is ongoing".

    This is what I see in the current svn branch, http://svn.code.sf.net/p/cmusphinx/code/trunk/pocketsphinx/src/programs/continuous.c

    1. cont_ad_init(ad, ad_read) - VAD initialization requires the audio device callback function "ad_read".
    2. cont_ad_calib(cont) - takes samples from the microphone and adjusts thresholds continuously.
    3. cont_ad_read(cont, adbuf, 4096) - filters non-voice samples out of the audio stream (byte array).
    4. ps_process_raw(ps, adbuf, k, FALSE, FALSE) - receives the filtered, voice-only audio stream.

    I would like to use the cont_ad library the same way as ps: cont_ad_read(...) takes a stream and somehow passes it to ps_process_raw(...); or

    Are there any plans to refactor cont_ad.h to decouple it from ad.h [cont_ad_init(ad, ad_read)]?
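The decoupling asked about here can be sketched independently of sphinxbase: the filter pulls audio through a caller-supplied read callback instead of a concrete audio device, and pushes voiced frames to a caller-supplied sink. Everything named below (`pcm_read_fn`, `pcm_sink_fn`, `pump_voiced_frames`, `mem_source_t`, `mem_source_read`, `count_sink`) is hypothetical and exists only for this illustration; the fixed amplitude threshold stands in for a real detector. In an actual app the sink would presumably wrap the ps_process_raw(...) call listed above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical source callback: fill buf with up to max samples,
 * return the number of samples read, or 0 at end of stream. */
typedef size_t (*pcm_read_fn)(void *user, int16_t *buf, size_t max);

/* Hypothetical sink callback: receives each voiced frame. */
typedef void (*pcm_sink_fn)(void *user, const int16_t *buf, size_t n);

/* Read frames from any source, forward only those whose mean absolute
 * amplitude exceeds threshold. Returns the number of frames forwarded. */
size_t pump_voiced_frames(pcm_read_fn read_cb, void *read_user,
                          pcm_sink_fn sink_cb, void *sink_user,
                          int16_t *frame, size_t frame_len, long threshold)
{
    assert(read_cb && sink_cb && frame && frame_len > 0);
    size_t forwarded = 0, n;
    while ((n = read_cb(read_user, frame, frame_len)) > 0) {
        int64_t sum = 0;
        for (size_t i = 0; i < n; i++)
            sum += abs((int)frame[i]);
        if (sum / (int64_t)n > threshold) {
            sink_cb(sink_user, frame, n);
            forwarded++;
        }
    }
    return forwarded;
}

/* Example source: an in-memory buffer standing in for a microphone. */
typedef struct {
    const int16_t *data;
    size_t len;
    size_t pos;
} mem_source_t;

size_t mem_source_read(void *user, int16_t *buf, size_t max)
{
    mem_source_t *src = user;
    size_t left = src->len - src->pos;
    size_t n = left < max ? left : max;
    memcpy(buf, src->data + src->pos, n * sizeof *buf);
    src->pos += n;
    return n;
}

/* Example sink: counts forwarded samples instead of decoding them. */
void count_sink(void *user, const int16_t *buf, size_t n)
{
    (void)buf;
    *(size_t *)user += n;
}
```

Because the source is just a function pointer plus an opaque `user` pointer, the same loop could be fed from a file, a socket, or buffers passed in over JNI, without the filter depending on any audio device type.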

     
  • Nickolay V. Shmyrev

    Ok, let me know how it goes

     
