
Random recognition

2009-03-29
2012-09-22
  • Magdalena Broniecka

    Hi!
    I made a mix of recognition from the microphone and from WAV files with the WSJ model for a Skype-like program that uses about 20 words. I'm using some code from the demos to understand the idea and to avoid mixing things up.

    http://www.magdusia.user.icpnet.pl/studia/magdalena-voicerec.rar is where you can download the NetBeans version (12 MB). Run MainFormatka to start it.

    What the app does:
    - Rozpoznawanie.java should listen and compare words from the microphone one at a time. I've made a special dictionary, slownik.dict, and 3 grammars: starcommands, number and name. At the beginning it should recognize the command, but when I say 'finish' it hears 'abort'. How can I improve it?
    - Nagrywanie.java records one word (to be precise, an English name), plays it back and compares it using the dictionary slownik2 and the name2 grammar, which contain only a few words. The recognition is random: I say Daniel and it hears Mary. Those two words aren't similar. I made my dictionary using cmudict.0.6d as an example.

    I'm not a native speaker, but I think my English is not that bad, and I'm not saying Daniel the same way as Mary, and so on. Besides, I try to use words that are not very similar, especially for recognition straight from the microphone. So where is the problem? I tried setting the frequently tuned properties, but it doesn't seem to change anything.
    When I used the settings from "How can I detect and ignore out-of-grammar utterances?" in your FAQ, I got only <unk>. I'm not that fluent in Java, so my code may be a mess.

    The main question is: how can I get an accuracy of about 80% or more? And how can I make my recognition less random, so that the program confuses similar words like too, two, toe but not Daniel and Mary?
    Thank you very much for any suggestions
    Magdalena

     
    • Magdalena Broniecka

      Thank you for your help. Unfortunately I only have time on weekends to work on this project. I'm facing another problem: I think recognizer.deallocate doesn't work properly. After allocating the recognizer, I call deallocate after one recognition, and when I then call allocate again I get:

      Exception in thread "AWT-EventQueue-0" java.lang.OutOfMemoryError: Java heap space
      at edu.cmu.sphinx.model.acoustic.WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.ModelLoader.readFloatArray(ModelLoader.java:1044)
      at edu.cmu.sphinx.model.acoustic.WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.ModelLoader.loadDensityFileBinary(ModelLoader.java:798)
      at edu.cmu.sphinx.model.acoustic.WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.ModelLoader.loadModelFiles(ModelLoader.java:540)
      at edu.cmu.sphinx.model.acoustic.WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.ModelLoader.load(ModelLoader.java:476)
      at edu.cmu.sphinx.model.acoustic.WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.Model.allocate(Model.java:177)
      at edu.cmu.sphinx.linguist.flat.FlatLinguist.allocateAcousticModel(FlatLinguist.java:336)
      at edu.cmu.sphinx.linguist.flat.FlatLinguist.allocate(FlatLinguist.java:318)
      at edu.cmu.sphinx.decoder.search.SimpleBreadthFirstSearchManager.allocate(SimpleBreadthFirstSearchManager.java:602)
      at edu.cmu.sphinx.decoder.Decoder.allocate(Decoder.java:109)
      at edu.cmu.sphinx.recognizer.Recognizer.allocate(Recognizer.java:182)
      at rozpoznawanie.Rozpoznawanie.Run(Rozpoznawanie.java:46)
      at rozpoznawanie.MainFormatka.RozpoznajButtonActionPerformed(MainFormatka.java:118)
      at rozpoznawanie.MainFormatka.access$000(MainFormatka.java:31)
      at rozpoznawanie.MainFormatka$1.actionPerformed(MainFormatka.java:63)
      at javax.swing.AbstractButton.fireActionPerformed(AbstractButton.java:1995)
      at javax.swing.AbstractButton$Handler.actionPerformed(AbstractButton.java:2318)
      at javax.swing.DefaultButtonModel.fireActionPerformed(DefaultButtonModel.java:387)
      at javax.swing.DefaultButtonModel.setPressed(DefaultButtonModel.java:242)
      at javax.swing.plaf.basic.BasicButtonListener.mouseReleased(BasicButtonListener.java:236)
      at java.awt.Component.processMouseEvent(Component.java:6216)
      at javax.swing.JComponent.processMouseEvent(JComponent.java:3265)
      at java.awt.Component.processEvent(Component.java:5981)
      at java.awt.Container.processEvent(Container.java:2041)
      at java.awt.Component.dispatchEventImpl(Component.java:4583)
      at java.awt.Container.dispatchEventImpl(Container.java:2099)
      at java.awt.Component.dispatchEvent(Component.java:4413)
      at java.awt.LightweightDispatcher.retargetMouseEvent(Container.java:4556)
      at java.awt.LightweightDispatcher.processMouseEvent(Container.java:4220)
      at java.awt.LightweightDispatcher.dispatchEvent(Container.java:4150)
      at java.awt.Container.dispatchEventImpl(Container.java:2085)

      It looks like deallocate doesn't release every resource involved in recognition, so they stay in memory. Any hints on how to start and stop recognition without closing the program?

       
      • Nickolay V. Shmyrev

        You just need to start the JVM with more heap space, for example with -Xmx512m.

        By the way, it's not recommended to deallocate the recognizer frequently.
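
One way to follow that advice is to allocate the recognizer on first use and keep it for the lifetime of the program, releasing it only on shutdown. The sketch below is a minimal illustration of that pattern, not Sphinx-4 code: `SpeechEngine`, `ReusableEngine` and `CountingEngine` are hypothetical names, and `SpeechEngine` merely stands in for a recognizer with allocate/recognize/deallocate methods.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for a recognizer; this is NOT the Sphinx-4 API. */
interface SpeechEngine {
    void allocate();      // expensive step, e.g. loading the acoustic model
    String recognize();   // recognizes one utterance
    void deallocate();    // releases the model
}

/** Allocates the engine on first use and reuses it until close() is called. */
final class ReusableEngine implements AutoCloseable {
    private final Supplier<SpeechEngine> factory;
    private SpeechEngine engine;   // null until the first recognition

    ReusableEngine(Supplier<SpeechEngine> factory) {
        this.factory = factory;
    }

    synchronized String recognizeOnce() {
        if (engine == null) {      // pay the allocation cost only once
            engine = factory.get();
            engine.allocate();
        }
        return engine.recognize();
    }

    @Override
    public synchronized void close() {
        if (engine != null) {      // release resources at shutdown only
            engine.deallocate();
            engine = null;
        }
    }
}

/** Fake engine for demonstration; it only counts allocate/deallocate calls. */
final class CountingEngine implements SpeechEngine {
    static int allocations = 0;
    static int deallocations = 0;

    public void allocate() { allocations++; }
    public String recognize() { return "finish"; }
    public void deallocate() { deallocations++; }
}
```

With this shape, a button handler would call `recognizeOnce()` each time and the application would call `close()` once when it exits, so the model is loaded a single time however many recognitions run.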

         
    • Nickolay V. Shmyrev

      To fix your recognition you need to set

      <property name="bigEndianData" value="true"/>
      

      since the Sun audio files you are recording are big endian. I also suggest setting a bigger wordInsertionProbability (around 0.2) and using an endpointer (SpeechMarker/NonSpeechDataFilter) instead of timed recording.

      It should work pretty well.
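
To illustrate why the byte order matters, here is a minimal sketch in plain Java (standard library only, no Sphinx-4 classes) that decodes the same 16-bit PCM bytes both ways. `SampleDecoder` is a hypothetical helper name. Read with the wrong byte order, a quiet sample turns into a very different value, which would be consistent with recognition that looks random.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class SampleDecoder {
    /** Decodes 16-bit signed PCM samples from raw bytes in the given byte order. */
    static short[] decode(byte[] raw, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.wrap(raw).order(order);
        short[] samples = new short[raw.length / 2];   // a trailing odd byte is ignored
        for (int i = 0; i < samples.length; i++) {
            samples[i] = buf.getShort();
        }
        return samples;
    }
}
```

For example, the two bytes `0x00 0x64` decode to 100 as big endian but to 25600 as little endian.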

      As for a generic approach, you probably need to set up a testing database, like the sphinx4 regression tests, and get an exact accuracy estimate. It's certainly possible to improve accuracy after that.
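
For a small isolated-word vocabulary, an accuracy estimate can be as simple as comparing what was said with what was recognized over a fixed set of recordings. The sketch below is a simplified stand-in for a proper regression test, not the sphinx4 test harness itself. `AccuracyReport` is a hypothetical helper, and the reference and hypothesis lists are assumed to have been collected separately, one entry per recording.

```java
import java.util.List;

final class AccuracyReport {
    /**
     * Returns the fraction of utterances whose recognized word matches the
     * reference word, ignoring case and surrounding whitespace.
     */
    static double accuracy(List<String> references, List<String> hypotheses) {
        if (references.size() != hypotheses.size()) {
            throw new IllegalArgumentException("lists must have the same length");
        }
        if (references.isEmpty()) {
            return 0.0;   // nothing to score
        }
        int correct = 0;
        for (int i = 0; i < references.size(); i++) {
            String ref = references.get(i).trim();
            String hyp = hypotheses.get(i).trim();
            if (ref.equalsIgnoreCase(hyp)) {
                correct++;
            }
        }
        return (double) correct / references.size();
    }
}
```

Running the same recordings through this after each configuration change shows whether a setting actually helped, instead of judging from a few live attempts.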

       
