
android pocketsphinx wrong recognized results

Rawan Laz
2017-04-10
  • Rawan Laz

    Rawan Laz - 2017-04-10

    The grammar I used in training was sentence-based (something like sentence1 | sentence2, and so on), and it worked well when I was training the acoustic model, but the recognizer returns only single words. Why is that? And is there a way to make the recognizer wait longer before it stops taking input from the user?
    Is there clearer documentation for using PocketSphinx on Android than the current one?

    package edu.cmu.pocketsphinx.demo;
    
    import android.Manifest;
    import android.app.Activity;
    import android.content.pm.PackageManager;
    import android.os.AsyncTask;
    import android.os.Bundle;
    import android.support.v4.app.ActivityCompat;
    import android.support.v4.content.ContextCompat;
    import android.widget.TextView;
    import android.widget.Toast;
    
    import java.io.File;
    import java.io.IOException;
    import java.util.HashMap;
    
    import edu.cmu.pocketsphinx.Assets;
    import edu.cmu.pocketsphinx.Hypothesis;
    import edu.cmu.pocketsphinx.RecognitionListener;
    import edu.cmu.pocketsphinx.SpeechRecognizer;
    import edu.cmu.pocketsphinx.SpeechRecognizerSetup;
    
    import static android.widget.Toast.makeText;
    
    public class PocketSphinxActivity extends Activity implements
            RecognitionListener {
    
        private static final String TEST_SEARCH = "test";
    
        /* Used to handle permission request */
        private static final int PERMISSIONS_REQUEST_RECORD_AUDIO = 1;
    
        private SpeechRecognizer recognizer;
        private HashMap<String, Integer> captions;
    
        @Override
        public void onCreate(Bundle state) {
            super.onCreate(state);
            captions = new HashMap<String, Integer>();
        captions.put(TEST_SEARCH, R.string.test_caption);
            // Prepare the data for UI
            setContentView(R.layout.main);
            ((TextView) findViewById(R.id.caption_text))
                    .setText("Preparing the recognizer");
    
            runRecognizerSetup();
        }
    
        private void runRecognizerSetup() {
            // Recognizer initialization is time-consuming and involves IO,
            // so we execute it in an AsyncTask
            new AsyncTask<Void, Void, Exception>() {
                @Override
                protected Exception doInBackground(Void... params) {
                    try {
                        Assets assets = new Assets(PocketSphinxActivity.this);
                        File assetDir = assets.syncAssets();
                        setupRecognizer(assetDir);
                    } catch (IOException e) {
                        return e;
                    }
                    return null;
                }
    
                @Override
                protected void onPostExecute(Exception result) {
                    if (result != null) {
                        ((TextView) findViewById(R.id.caption_text))
                                .setText("Failed to init recognizer " + result);
                    } else {
    //                    switchSearch("Start Recognize");
                        ((TextView) findViewById(R.id.caption_text))
                                .setText(R.string.test_caption);
                        recognizer.startListening(TEST_SEARCH);
                    }
                }
            }.execute();
        }
    
        @Override
        public void onDestroy() {
            super.onDestroy();
    
            if (recognizer != null) {
                recognizer.cancel();
                recognizer.shutdown();
            }
        }
    
        /**
         * In partial result we get quick updates about current hypothesis. In
         * keyword spotting mode we can react here, in other modes we need to wait
         * for final result in onResult.
         */
        @Override
        public void onPartialResult(Hypothesis hypothesis) {
            // Toast.makeText(getApplicationContext(),"partial",Toast.LENGTH_SHORT).show();
            if (hypothesis == null)
                return;
    
            String text = hypothesis.getHypstr();
            ((TextView) findViewById(R.id.result_text)).setText(text);
            //recognizer.cancel();
           // recognizer.startListening(TEST_SEARCH);
        }
    
        /**
         * This callback is called when we stop the recognizer.
         */
        @Override
        public void onResult(Hypothesis hypothesis) {
            recognizer.stop();
            Toast.makeText(getApplicationContext(),"res:",Toast.LENGTH_SHORT).show();
            ((TextView) findViewById(R.id.result_text)).setText("");
            if (hypothesis != null) {
                String text = hypothesis.getHypstr();
                makeText(getApplicationContext(), text, Toast.LENGTH_LONG).show();
            }
    
        }
    
        @Override
        public void onBeginningOfSpeech() {
            Toast.makeText(getApplicationContext(),"begin",Toast.LENGTH_SHORT).show();
        }
    
        /**
         * We stop recognizer here to get a final result
         */
        @Override
        public void onEndOfSpeech() {
            Toast.makeText(getApplicationContext(),"end",Toast.LENGTH_SHORT).show();
    //        if (!recognizer.getSearchName().equals(KWS_SEARCH))
    //            switchSearch(KWS_SEARCH);
        }
    
        private void setupRecognizer(File assetsDir) throws IOException {
            // The recognizer can be configured to perform multiple searches
            // of different kind and switch between them
    
            recognizer = SpeechRecognizerSetup.defaultSetup()
                    .setAcousticModel(new File(assetsDir, "files"))
                    .setDictionary(new File(assetsDir, "model.dic"))
                //.setRawLogDir(assetsDir) // Raw audio logging is disabled here; it takes a lot of space on the device
                    .setBoolean("-allphone_ci", true)
                    .getRecognizer();
            recognizer.addListener(this);
    
            //.setKeywordThreshold(1e-45f) // Threshold to tune for keyphrase to balance between false alarms and misses
            //.setBoolean("-allphone_ci", true)  // Use context-independent phonetic search, context-dependent is too slow for mobile
    
            /*File ngramModel = new File(assetsDir, "model.lm.DMP");
            recognizer.addNgramSearch(TEST_SEARCH, ngramModel);*/
        recognizer.addGrammarSearch(TEST_SEARCH, new File(assetsDir, "Grammar.jsgf"));
    
        }
    
        @Override
        public void onError(Exception error) {
            ((TextView) findViewById(R.id.caption_text)).setText(error.getMessage());
        }
    
        @Override
        public void onTimeout() {
            Toast.makeText(getApplicationContext(),"timeout",Toast.LENGTH_SHORT).show();
        }
    }
    
     
    • Nickolay V. Shmyrev

      The grammar I used in the training was based on sentences -it's like sentence1|sentence2

      There is no such thing as a grammar based on sentences. Grammars always contain words.
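
      For illustration, a JSGF grammar is always defined over words; what reads as a "sentence-based" grammar is simply a rule whose alternatives are sequences of words. A minimal sketch (the grammar name and phrases here are made up):

      #JSGF V1.0;

      grammar example;

      // Each alternative is a sequence of words, not an opaque sentence unit.
      public <command> = open the door | close the window | turn on the light;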

      and is there a way to make the recognizer wait longer before it stops taking input from the user?

      The '-vad_postspeech' parameter controls that. But overall it is better to allow the user to pause and continue: analyze the partial hypotheses and respond to them properly.
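
      As a sketch of how that parameter could be set through the Android API (the value 100 is an illustrative guess to tune, not a recommendation; the file names follow the setupRecognizer() code above):

      recognizer = SpeechRecognizerSetup.defaultSetup()
              .setAcousticModel(new File(assetsDir, "files"))
              .setDictionary(new File(assetsDir, "model.dic"))
              // Number of trailing silence frames (roughly 10 ms each) required
              // before the recognizer declares end of utterance; larger values
              // make it wait longer after the user stops speaking.
              .setInteger("-vad_postspeech", 100)
              .getRecognizer();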

       
      • Rawan Laz

        Rawan Laz - 2017-04-11

        There is no such thing as a grammar based on sentences. Grammars always contain words.
        Why is that? I tried this in training and it worked well (this is what my application requires).

         
        • Nickolay V. Shmyrev

          Not sure what you tried exactly, but "based on sentences" is not the right description of the grammar. It is better to paste it instead.

          Next, your problems are unlikely to be related to the grammar, since you already tested it in training. But you do not explain them well either.

           
          • Rawan Laz

            Rawan Laz - 2017-04-11

            this is my grammar

            #JSGF V1.0;
            
            grammar kawther;
            
            public <AlKawther> = ( بِسْمِ اللَّهِ الرَّحْمَنِ الرَّحِيمِ | إِنَّا أَعْطَيْنَاكَ الكَوْثَرَ | فَصَلِّ لِرَبِّكَ وَانْحَرْ | إِنَّ شَانِئَكَ هُوَ الأَبْتَرُ | بِسْمِ اللَّهِ الرَّحْمَنِ الرَّحِيمِ | قُلْ أَعُوذُ بِرَبِّ النَّاسِ | مَلِكِ النَّاسِ | إِلَهِ النَّاسِ | مِنْ شَرِّ الْوَسْوَاسِ الْخَنَّاسِ | الَّذِي يُوَسْوِسُ فِي صُدُورِ النَّاسِ | مِنَ الْجِنَّةِ وَالنَّاسِ );
            

            When I tried it on Android it only recognizes one or two words of the sentence, like "إِنَّا أَعْطَيْنَاكَ" instead of "إِنَّا أَعْطَيْنَاكَ الكَوْثَرَ", and it doesn't recognize the first word correctly, even though accuracy was very good during training.

             
            • Nickolay V. Shmyrev

              You can first test continuous recognition with pocketsphinx_continuous, without Android.

              To get help on accuracy of recognition you need to provide the test audio data and the models. Make sure you have strictly followed the tutorial recommendations too, in particular the recommendation about the size of the training data.
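
              As a sketch, assuming the acoustic model, dictionary, and grammar files from the Android code above (paths are placeholders), a desktop test could look like:

              pocketsphinx_continuous -inmic yes \
                  -hmm /path/to/files \
                  -dict /path/to/model.dic \
                  -jsgf /path/to/Grammar.jsgf

              If the same truncation shows up here, the problem is in the models or grammar rather than in the Android code.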

               
  • Rawan Laz

    Rawan Laz - 2017-04-11

    Is there anything that I did wrong? Note that this is my first attempt to use my trained acoustic model on Android.

     
