CMU Sphinx / Forums / Help: Recognition quality too low

premier - 2017-10-21

Hi,
this is my first time using this software, and I'm trying to use sphinx4 with the Italian acoustic model.
I've downloaded the code from SourceForge and set up a Java project; this is my source code.

The recognition quality for both Italian and English is very low. To reproduce, download my code into a folder and then run "mvn clean install && mvn exec:java". My code includes a WAV file in which I say "Ciao questo è un messaggio di prova"; sphinx4 does not recognize anything.

Can you please help me? Which settings can I change to improve the quality?

Thank you

- Nickolay V. Shmyrev - 2017-10-21
  
  Our Italian model is very basic and was trained on a tiny dataset. If you want good recognition, you need to collect training data (1000+ hours). You can collect audiobooks or podcasts, depending on the type of application you want to build.
  

I've also tried the English model, with bad results. Maybe our English pronunciation is not very good, but other STT services work.
Maybe the problem is the mic quality or the audio format. I used the code available on your website, shown below:

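import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;

import edu.cmu.sphinx.api.Configuration;
import edu.cmu.sphinx.api.SpeechResult;
import edu.cmu.sphinx.api.StreamSpeechRecognizer;

public class TranscriberDemo {

    public static void main(String[] args) throws Exception {

        // Point the recognizer at the bundled en-us acoustic model, dictionary and language model.
        Configuration configuration = new Configuration();
        configuration.setAcousticModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us");
        configuration.setDictionaryPath("resource:/edu/cmu/sphinx/models/en-us/cmudict-en-us.dict");
        configuration.setLanguageModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us.lm.bin");

        StreamSpeechRecognizer recognizer = new StreamSpeechRecognizer(configuration);

        // Decode the file and print one hypothesis per recognized utterance.
        try (InputStream stream = new FileInputStream(new File("test.wav"))) {
            recognizer.startRecognition(stream);
            SpeechResult result;
            while ((result = recognizer.getResult()) != null) {
                System.out.format("Hypothesis: %s\n", result.getHypothesis());
            }
            recognizer.stopRecognition();
        }
    }
}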

Can you suggest, please, some settings to improve the recognition quality?

Regards

Nickolay V. Shmyrev - 2017-10-22

There are no magic settings. To get help on the accuracy you need to provide the test data.
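
Since the audio format was raised as a possible cause earlier in the thread, it can be ruled out before sharing test data. The sketch below uses only the standard javax.sound.sampled API to print the format of a WAV file and compare it against 16 kHz, 16-bit, mono, little-endian PCM. That target format is an assumption based on what the CMUSphinx tutorial describes for the stock en-us model; the class name WavFormatCheck and the helper isSphinxCompatible are made up for illustration and are not part of sphinx4.

```java
import java.io.File;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;

public class WavFormatCheck {

    // Assumed target format: 16 kHz, 16-bit, mono, signed little-endian PCM.
    static boolean isSphinxCompatible(AudioFormat f) {
        return AudioFormat.Encoding.PCM_SIGNED.equals(f.getEncoding())
                && f.getSampleRate() == 16000f
                && f.getSampleSizeInBits() == 16
                && f.getChannels() == 1
                && !f.isBigEndian();
    }

    public static void main(String[] args) throws Exception {
        String path = args.length > 0 ? args[0] : "test.wav";
        File wav = new File(path);
        if (!wav.isFile()) {
            System.out.println("File not found: " + path);
            return;
        }
        // Read only the header information; the audio data itself is not decoded here.
        try (AudioInputStream in = AudioSystem.getAudioInputStream(wav)) {
            AudioFormat format = in.getFormat();
            System.out.println("Format: " + format);
            System.out.println(isSphinxCompatible(format)
                    ? "Matches the assumed 16 kHz / 16-bit / mono format."
                    : "Does not match; consider resampling the file before decoding.");
        }
    }
}
```

Run it as "java WavFormatCheck test.wav". A mismatch (for example a 44.1 kHz stereo recording) would be one plausible explanation for empty or poor hypotheses, though a matching format does not by itself guarantee good accuracy.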
