Hi again,
Thank you for your help on Adapting Acoustic Model. I have some more questions, but as they are not on the same topic, I thought it was better to create a new topic. Feel free to merge the two topics if you prefer.
I'm trying to use the PocketSphinx library, written in C, so that I can call it from Lua code. Basically, I would like to expose only a few methods, some of which will wrap several PocketSphinx functions. However, I have some basic questions from reading the code:
1) I don't really understand the difference between:

    char const *ps_get_hyp(ps_decoder_t *ps, int32 *out_best_score);
    char const *ps_get_hyp_final(ps_decoder_t *ps, int32 *out_is_final);
Which one should I use to get the result of the decoder at the end of an utterance?
2) I don't really understand how to create a new decoder with an Acoustic Model (given its path) and a Dictionary (given its path). I think I have to use this function:
and then do:
But I have no idea how to call it. For example, I would like to give it the path of the Acoustic Model and of the Dictionary.
For now, the only solution I found was to use

    cmd_ln_set_str_r(config, "-hmm", hmmpath);

and

    cmd_ln_set_str_r(config, "-dict", dictpath);
Maybe that is the right way to do it, but I don't understand the point of cmd_ln_parse_r.
I also saw that we could do:
So I just don't know what the right way to do it is.
Thanks in advance for your help,
Paul
Last edit: Paul Rolin 2015-07-27
Those two functions are described in the API docs: the first returns the hypothesis and its score, the second the hypothesis and a final flag. You can use either of them according to your needs.
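For example (a minimal sketch, assuming an already initialized decoder ps):

    int32 score, is_final;
    char const *hyp;

    /* hypothesis plus the path score of the match */
    hyp = ps_get_hyp(ps, &score);
    /* hypothesis plus a flag telling whether it is final */
    hyp = ps_get_hyp_final(ps, &is_final);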
This is an internal function and it is not present in the public headers. You cannot use it.
You can use cmd_ln_init instead of cmd_ln_parse to quickly create a config from a set of string parameters:
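    /* paths below are placeholders */
    cmd_ln_t *config = cmd_ln_init(NULL, ps_args(), TRUE,
            "-hmm", "/path/to/acoustic/model",
            "-lm", "/path/to/language/model.lm",   /* or -jsgf for a grammar */
            "-dict", "/path/to/dictionary.dict",
            NULL);
    ps_decoder_t *ps = ps_init(config);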
All those functions are covered and explained in the PocketSphinx tutorial:
http://cmusphinx.sourceforge.net/wiki/tutorialpocketsphinx
Please review it.
OK, thank you for your answer. I had seen the tutorial, but as I was also reading the source files, I didn't know exactly which functions to use.
I still don't really understand what you mean by the score and the final flag. (I read the tutorial and the API docs, but I still don't understand, sorry...)
The score is the log probability of the match between the acoustic model and the audio. It is not really useful; it is mostly present for historical reasons.
A grammar recognition result is final when it fully matches the grammar. For example, if the grammar is:
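    #JSGF V1.0;
    grammar greeting;              // illustrative grammar
    public <greet> = hello world;  // accepts the full phrase "hello world"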
then result "hello" is partial. Result "hello world" is final because it fully matches the grammar. The final flag tells you if result is final.
OK, that's clear :) Thank you very much!
Hi again,
I have a new question concerning how to capture the audio stream on Android and iOS.
Indeed, I've seen that in pocketsphinx there are the "ad" functions:
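    /* the audio-device API declared in sphinxbase's ad.h */
    ad_rec_t *ad_open_dev(const char *dev, int32 samples_per_sec);
    int32 ad_start_rec(ad_rec_t *r);
    int32 ad_read(ad_rec_t *r, int16 *buf, int32 max);
    int32 ad_stop_rec(ad_rec_t *r);
    int32 ad_close(ad_rec_t *r);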
However, I've also seen that in the android-demo you directly use the function from Android:

    public void startRecording()
So I would like to know if I could use the Android/iOS functions to do all the recording and reading, and then pass the buffers to the PocketSphinx functions.
For example, I would call the two Android functions startRecording() and read() to capture the audio, and then give the data to the PocketSphinx function ps_process_raw().
Can I do that? Or should I use the "ad" functions, which would work the same way on both Android and iOS? By the way, what would the device name be (on Android, for example)?
Thanks a lot.
Paul
The ad functions are supported neither on iOS nor on Android. You have to record data with the Android tools and pass it to the decoder with ps_process_raw; the same goes for iOS. This approach is demonstrated in the demo.
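A minimal sketch of the native side (read_platform_audio is a hypothetical helper standing in for whatever hands you the recorded samples; note that older versions of ps_start_utt also take an uttid argument):

    int16 buf[2048];
    int32 nsamp;

    ps_start_utt(ps);
    /* read_platform_audio (hypothetical) fills buf with 16 kHz, 16-bit
       mono samples recorded by AudioRecord / OpenSL ES / AVAudioSession
       and returns the number of samples, or 0 when recording stops */
    while ((nsamp = read_platform_audio(buf, 2048)) > 0)
        ps_process_raw(ps, buf, nsamp, FALSE, FALSE);
    ps_end_utt(ps);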
You can also implement ad in Android with OpenSL ES if you want to record audio without Java.
OK, thank you.
I have a new question:
In the file continuous.c, you call ps_end_utt then ps_get_hyp.
In the Android demo, it seems that you first call get_hyp (Hypothesis var4 = SpeechRecognizer.this.decoder.hyp();) and then you call endUtt.
Is there a reason for that (maybe the difference between onPartialResult and onResult?)? Or am I wrong?
And so another question: do we have to call end_utt before or after getting a hypothesis?
Thanks
Paul
Last edit: Paul Rolin 2015-07-30
You can retrieve the current hypothesis at any time, before or after the utterance is over.
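For example (a sketch, assuming buf already holds nsamp recorded samples):

    ps_start_utt(ps);
    ps_process_raw(ps, buf, nsamp, FALSE, FALSE);
    /* partial hypothesis while the utterance is still open */
    char const *partial = ps_get_hyp(ps, NULL);
    ps_end_utt(ps);
    /* hypothesis for the complete utterance, after ps_end_utt */
    char const *final_hyp = ps_get_hyp(ps, NULL);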
Hello again,
I'm close to the end, but I still have a problem with setting up the decoder.
My code is in C (in fact the main code is in Lua, but I call C code from it), and it seems that the ps_init function somehow blocks (i.e., once it is called, nothing happens afterwards).
Here is my code:
And then I do:
When I call ps_init, the code is "stopped": anything after it is not executed (for example, BLOP is never printed). Do you have an explanation?
Thanks
Last edit: Paul Rolin 2015-08-19
You can find more details in the log output printed to the console when you run this code.
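If the console output is hard to capture on the device, you can also redirect the log to a file with the standard -logfn parameter (the path below is just an example):

    cmd_ln_set_str_r(config, "-logfn", "/sdcard/pocketsphinx.log");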
OK, so here is the code I execute (in Lua):
initDecoder is:
And in my logcat, I have:
Okay, if I do:
The log output is:
But the program continues. (I know the error comes from the program not knowing where the files are; in that case it continues, whereas if I specify the path, it doesn't.)
However, if I do:
The log output is:
And the program hangs there.
In fact, my only solution is to use an AsyncTask in Android (which calls a C function init that only does ps_init(config)) in order to have something non-blocking, but I think it can be done differently (for example, in the continuous.c file there is a plain ps_init(config) call).
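For reference, here is a minimal sketch of the kind of non-blocking init I have in mind, using a plain pthread instead of an AsyncTask (error handling omitted):

    #include <pthread.h>
    #include <pocketsphinx.h>

    /* ps_init loads the acoustic model, so it can take a while;
       run it off the calling thread */
    static void *init_thread(void *arg)
    {
        return ps_init((cmd_ln_t *)arg);
    }

    ps_decoder_t *init_decoder_async(cmd_ln_t *config)
    {
        pthread_t tid;
        void *ps;
        pthread_create(&tid, NULL, init_thread, config);
        /* the caller could do other work here instead of joining right away */
        pthread_join(tid, &ps);
        return (ps_decoder_t *)ps;
    }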
Last edit: Paul Rolin 2015-08-19