Terrible recognition results with JSGF

2014-07-01
2014-07-05
  • Dmytro Prylipko

    Dmytro Prylipko - 2014-07-01

    Hi,

    I was trying to create a simple tutorial on using Sphinx4 and discovered a strange thing...
    I have a simple setup based on the an4 database. A configuration that uses a trivial unigram model gives me 78% accuracy. For comparison, I created another setup that uses a trivial JSGF grammar (essentially a word loop). With that, I get 56%, accompanied by numerous "Falling back to non-recursive partition" messages. That is really strange, given a vocabulary of only ~100 words.

    I am wondering what the cause might be and how to deal with it. The two configurations are almost identical, and I would expect the unigram model and the JSGF grammar to give comparable performance. Maybe the problem is with FlatLinguist?
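For readers unfamiliar with the term, a "word loop" grammar accepts any sequence of one or more vocabulary words. The sketch below generates such a JSGF grammar from a word list. The class name, the grammar name, the rule name `<loop>`, and the three-word vocabulary are all hypothetical; this is not the grammar attached to the thread.

```java
import java.util.Arrays;
import java.util.List;

class WordLoopGrammar {
    // Builds a JSGF grammar whose single public rule accepts one or more
    // words from the vocabulary, in any order (a "word loop").
    static String build(String grammarName, List<String> words) {
        StringBuilder sb = new StringBuilder();
        sb.append("#JSGF V1.0;\n");
        sb.append("grammar ").append(grammarName).append(";\n");
        sb.append("public <loop> = (")
          .append(String.join(" | ", words))
          .append(")+;\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical vocabulary; the an4 task has roughly 100 words.
        System.out.print(build("wordloop", Arrays.asList("one", "two", "three")));
    }
}
```

Because every word can follow every other word with no weighting, such a grammar constrains the search about as little as a uniform unigram model would.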

     
  • Dmytro Prylipko

    Dmytro Prylipko - 2014-07-01

    I attach here the files you might want to look into.

     
    • Nickolay V. Shmyrev

      We do not recommend custom sphinx4 configs anymore; in particular, you shouldn't use them in any tutorials.

       
      • Jeff Acquaviva

        Jeff Acquaviva - 2014-07-02

        We do not recommend custom sphinx4 configs anymore

        Out of curiosity, what is the reason behind this?

         
        • Nickolay V. Shmyrev

          XML config was an attempt to put some code logic into XML. Our users are not qualified enough to edit decoder internals through configuration files and usually don't understand or account for important dependencies between configuration file objects.

          Our current work on the high-level API lets users access s4 features through an API that is not perfect but is focused on specific use cases. I hope this will make s4 more straightforward to use in the future.

           
      • Dmytro Prylipko

        Dmytro Prylipko - 2014-07-02

        What do you mean by 'custom'? Each particular ASR task requires a config to be written...

        Does it mean that there exists a standard config for my task?

         
        • Nickolay V. Shmyrev

          Each particular ASR task requires a config to be written...

          No, there is a default.config.xml now, but it will probably be replaced by the code in the future too.

          Does it mean that there exists a standard config for my task?

          I'm not sure what your task is, but we'd like to see support for most practical tasks in the API, not in the configuration.

           
  • Dmytro Prylipko

    Dmytro Prylipko - 2014-07-03

    I tried to switch to the new API instead of using configs, but I cannot figure out how to recognize multiple input files in a row (using StreamSpeechRecognizer) with the same acoustic and language models.

    As I understand it, a new input file can be set only in the startRecognition method, which allocates the recognizer. Does that mean I have to allocate and deallocate the recognizer for every single file?

    OK, regardless of the particular usage: what does the message "Falling back to non-recursive partition" mean? What might cause it? Could it be caused by a slow laptop running the recognizer?

     
    • Nickolay V. Shmyrev

      As I understand it, a new input file can be set only in the startRecognition method, which allocates the recognizer. Does that mean I have to allocate and deallocate the recognizer for every single file?

      That has to be changed.

      OK, regardless of the particular usage: what does the message "Falling back to non-recursive partition" mean? What might cause it? Could it be caused by a slow laptop running the recognizer?

      There should be no such message in recent sources; it was a debugging message and was removed quite some time ago. The message appears because active list pruning is implemented with a recursive quicksort-like algorithm, and if there are too many similar scores in the beam window, the algorithm runs out of stack. In that case it switches to linear sorting, which is what this debug message reports.

      Overall, seeing this message is not a good sign. It means something is wrong with either your model or your input features.
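The failure mode described above can be reproduced with a small stand-alone sketch. This is not the Sphinx4 source: the class `BoundedPartitioner`, the method `keepTop`, and the `MAX_DEPTH` limit are hypothetical, and the fallback used here is a plain `Arrays.sort` on the remaining range. It keeps the `n` best scores at the front of an array with a recursive quickselect-style partition and reports whether it had to fall back.

```java
import java.util.Arrays;

class BoundedPartitioner {
    // Assumed recursion limit for illustration; not the value Sphinx4 used.
    static final int MAX_DEPTH = 50;

    // Rearranges scores so that the n highest values occupy scores[0..n-1].
    // Returns true if the recursion limit was hit and the fallback was used.
    static boolean keepTop(float[] scores, int n) {
        if (n <= 0 || n >= scores.length) return false;
        return select(scores, 0, scores.length - 1, n - 1, 0);
    }

    private static boolean select(float[] a, int lo, int hi, int k, int depth) {
        if (lo >= hi) return false;
        if (depth > MAX_DEPTH) {
            // Non-recursive fallback: sort the remaining range outright.
            sortDescending(a, lo, hi);
            return true;
        }
        int p = partition(a, lo, hi);
        if (p == k) return false;
        return p > k ? select(a, lo, p - 1, k, depth + 1)
                     : select(a, p + 1, hi, k, depth + 1);
    }

    // Lomuto partition in descending order: values strictly greater than
    // the pivot move to the left; returns the pivot's final index.
    private static int partition(float[] a, int lo, int hi) {
        float pivot = a[hi];
        int i = lo;
        for (int j = lo; j < hi; j++) {
            if (a[j] > pivot) {
                swap(a, i, j);
                i++;
            }
        }
        swap(a, i, hi);
        return i;
    }

    private static void sortDescending(float[] a, int lo, int hi) {
        Arrays.sort(a, lo, hi + 1);
        for (int i = lo, j = hi; i < j; i++, j--) swap(a, i, j);
    }

    private static void swap(float[] a, int i, int j) {
        float t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    public static void main(String[] args) {
        float[] distinct = {3f, 9f, 1f, 7f, 5f};
        System.out.println("fallback on distinct scores: " + keepTop(distinct, 2));

        float[] tied = new float[1000];
        Arrays.fill(tied, -42f); // many identical scores, as in the bad case
        System.out.println("fallback on tied scores: " + keepTop(tied, 500));
    }
}
```

With identical scores every partition step removes only the pivot, so the recursion depth grows linearly with the number of tokens instead of logarithmically. That is why a flood of near-equal scores, rather than a slow machine, triggers the fallback.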

       
