Using ITESM Spanish model with sphinx4

Help
2008-05-09
2012-09-22
  • Héctor Delgado Flores

    Hello,

    I'm trying to use sphinx4 with the ITESM h4 model for Spanish. I'm modifying the "wavfile" demo to recognize 3 keywords in WAV files.

    The models are here: http://www.speech.cs.cmu.edu/sphinx/models/hub4spanish_itesm/

    But these models are in SphinxTrain format. For sphinx4 I have to package the model as a .jar file, which I did by following this guide: http://cmusphinx.sourceforge.net/sphinx4/doc/UsingSphinxTrainModels.html

    I have changed some parameters in config.xml, but I don't know whether I did it correctly.

    When I run the program, it keeps running for a long time and never returns a result.

    What am I doing wrong?

    These are my files: http://www.megaupload.com/?d=BP3M2CLG

    Thanks a lot

     
    • Claudia Ocampo

      Claudia Ocampo - 2008-11-01

      Hello, I'm working with Sphinx-4 and need to configure it for Spanish. When I run my program, I get this error:

      Loading Recognizer...

      Exception in thread "main" java.lang.NullPointerException
      at edu.cmu.sphinx.util.props.SaxLoader.load(SaxLoader.java:64)
      at edu.cmu.sphinx.util.props.ConfigurationManager.loader(ConfigurationManager.java:383)
      at edu.cmu.sphinx.util.props.ConfigurationManager.<init>(ConfigurationManager.java:115)
      at demo.sphinx.wavfile.WavFile.main(WavFile.java:60)

      Could someone help me?

      Thank you very much.
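
Nobody in the thread diagnosed this trace, so the following is only a guess at a common cause. The demo appears to look up config.xml with Class.getResource and pass the resulting URL to ConfigurationManager; if the file is not on the classpath next to the class, that lookup returns null and the loader then fails with a NullPointerException. A minimal sketch of a guard, assuming that is what happens here (the class name ConfigCheck and the helper requireResource are hypothetical, not part of sphinx4):

```java
import java.net.URL;

public class ConfigCheck {

    /**
     * Looks up a classpath resource relative to the given class and fails
     * with a readable message instead of returning null.
     */
    static URL requireResource(Class<?> owner, String name) {
        URL url = owner.getResource(name);
        if (url == null) {
            throw new IllegalStateException(
                "Resource '" + name + "' not found on the classpath next to "
                + owner.getName());
        }
        return url;
    }

    public static void main(String[] args) {
        try {
            // Assumption: the demo would pass this URL on to ConfigurationManager.
            URL configUrl = requireResource(ConfigCheck.class, "config.xml");
            System.out.println("Configuration found at " + configUrl);
        } catch (IllegalStateException e) {
            System.err.println(e.getMessage());
        }
    }
}
```

If the message is printed, the usual fix is to make sure config.xml is copied into the same output directory or jar package as the demo class.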

       
    • Nickolay V. Shmyrev

      I was unable to download the files. Could you use another host, say mediafire.com, instead?

      The biggest problem is that the Spanish models use the s3_1x39 feature set, so you have to use a different feature extraction class in the frontend (S3FeatureExtractor). The rest should be quite standard.

      About your question on 256M: well, that's quite standard. Remember that there is always a swap file, and you can even pass -Xmx512m; that doesn't mean Java will actually use 512m. After all, it's Java.

      About your task, I'm not quite sure why you want to set up sphinx4; I don't think it will bring you anything new.
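
The heap-size remark above can be checked directly: -Xmx sets a ceiling, not the amount of memory the JVM takes at startup. A small sketch using only the standard Runtime API (the class name HeapInfo is made up for this illustration):

```java
public class HeapInfo {

    /** Converts a byte count to whole mebibytes. */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // Upper bound the heap may grow to (controlled by -Xmx).
        System.out.println("max heap (MiB):      " + toMiB(rt.maxMemory()));
        // Heap currently reserved by the JVM.
        System.out.println("reserved heap (MiB): " + toMiB(rt.totalMemory()));
        // Part of the reserved heap that is actually occupied.
        System.out.println("used heap (MiB):     "
            + toMiB(rt.totalMemory() - rt.freeMemory()));
    }
}
```

Running it once with `java -Xmx256m HeapInfo` and once with `java -Xmx512m HeapInfo` should change the first figure, while the other two depend on what the program actually allocates.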

       
      • Santiago Brandi

        Santiago Brandi - 2008-06-27

        > The biggest problem is that the Spanish models use the s3_1x39 feature set, so you have to use a different feature extraction class in the frontend (S3FeatureExtractor). The rest should be quite standard.

        Hi, I'm sorry to keep asking for help...

        I have my application running with the ITESM Spanish models, but it recognizes nothing at all. To begin with, I couldn't find any information on how to use the H4.arpa.Z.DMP file, and I also had no idea about the s3_1x39 feature...

        How do I get or create that different feature extraction class in the frontend?

        Sorry for my ignorance.

        thanks for your help!
        santiago

         
        • Nickolay V. Shmyrev

          > I have my application running with the ITESM Spanish models, but it recognizes nothing at all. To begin with, I couldn't find any information on how to use the H4.arpa.Z.DMP file, and I also had no idea about the s3_1x39 feature...

          There must be a different problem. First of all, don't use H4.arpa.Z.DMP, because it's most probably not suitable for your task. Second, to use s3_1x39, choose S3FeatureExtractor in the frontend. If you still have trouble, please give a link to your file and its transcription.
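
For readers wondering why the feature type matters: the acoustic model expects the 39 values of each frame in a fixed order, and a front end that emits them in another order produces garbage scores. The sketch below shows the s3_1x39 layout as it is implemented in the Sphinx-3 feature code, to the best of my knowledge: 12 cepstra (without c0), 12 deltas, 3 power terms (c0, delta c0, delta-delta c0), then 12 delta-deltas. It is an illustration only, not the sphinx4 S3FeatureExtractor source, and the class name S3Layout is hypothetical.

```java
public class S3Layout {

    static final int CEP = 13;      // cepstral coefficients per frame: c0..c12
    static final int CONTEXT = 3;   // frames of context needed on each side

    /** Builds the 39-dimensional vector for frame t; needs cep[t-3..t+3]. */
    static float[] feature(float[][] cep, int t) {
        float[] f = new float[3 * CEP];
        int j = 0;
        // 12 cepstra, skipping c0
        for (int k = 1; k < CEP; k++) {
            f[j++] = cep[t][k];
        }
        // 12 delta cepstra: c[t+2] - c[t-2]
        for (int k = 1; k < CEP; k++) {
            f[j++] = cep[t + 2][k] - cep[t - 2][k];
        }
        // 3 power terms: c0, delta c0, delta-delta c0
        f[j++] = cep[t][0];
        f[j++] = cep[t + 2][0] - cep[t - 2][0];
        f[j++] = (cep[t + 3][0] - cep[t - 1][0]) - (cep[t + 1][0] - cep[t - 3][0]);
        // 12 delta-delta cepstra: (c[t+3] - c[t-1]) - (c[t+1] - c[t-3])
        for (int k = 1; k < CEP; k++) {
            f[j++] = (cep[t + 3][k] - cep[t - 1][k]) - (cep[t + 1][k] - cep[t - 3][k]);
        }
        return f;
    }

    public static void main(String[] args) {
        // Synthetic cepstra, just enough frames for one output vector.
        float[][] cep = new float[2 * CONTEXT + 1][CEP];
        for (int t = 0; t < cep.length; t++) {
            for (int k = 0; k < CEP; k++) {
                cep[t][k] = t * t + k;
            }
        }
        System.out.println("feature length: " + feature(cep, CONTEXT).length);
    }
}
```

A front end that simply concatenates all 13 cepstra, all 13 deltas, and all 13 delta-deltas produces the same numbers in a different order, which would explain recognition that returns nothing useful with these models.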

           
    • Héctor Delgado Flores

      Thank you for your answer!

      I expect the results will be similar. The reason is that I think it's easier for me to write an application with a simple user interface with sphinx4 than in C. I don't have much time for my project, and I should have something even if the results aren't perfect.

      My files: http://www.mediafire.com/?hkxkatyjbty

      Thank you again.

       
    • Héctor Delgado Flores

      Nickolay,

      I've been trying, but I don't get any results. Can you provide a config.xml file that works for my test?

      Thank you very much.

       
      • Nickolay V. Shmyrev

        Well, I managed to make it work. On the way I had to fix a bug in sphinx4. Check my files here:

        http://www.mediafire.com/?nizfvxxesg9

        You have to check out the latest sphinx4 svn and apply the attached patch. It's still very slow and not optimal for keyword spotting; as I said, we ought to try another search algorithm.

         
    • Héctor Delgado Flores

      Sorry

      Which svn subcommand should I use to apply the patch? Is the patch file sphinx4_noloop.diff?

      Thanks

       
      • Nickolay V. Shmyrev

        cp sphinx4_noloop.diff sphinx4
        cd sphinx4
        patch -p0 < sphinx4_noloop.diff

        Alternatively, you can just open the patch in a text editor and make the changes by hand. man patch can also be helpful.

         
    • Santiago Brandi

      Santiago Brandi - 2008-06-12

      Hi, I downloaded the acoustic models from the same link and also trained the models following the steps in the other link you mentioned. When I try to run the application, an error pops up, something about a bad URL in the config.xml file, in the dictionary configuration. It seems it doesn't recognize the JAR I created, or something like that; I really don't know.
      If someone has an idea of what may be happening, or has managed to make sphinx4 run with Spanish words, I would really appreciate a hand.

      Thanks a lot!
      Excuse my English...

      Santiago

       
      • Nickolay V. Shmyrev

        > When I try to run the application, an error pops up, something about a bad URL in the config.xml file, in the dictionary configuration. It seems it doesn't recognize the JAR I created, or something like that; I really don't know.

        First, learn to paste the errors when you report them. It's a trivial thing you must understand. We'll translate it for you if you can't do it yourself.

         
        • Santiago Brandi

          Santiago Brandi - 2008-06-17

          Hi, this is what I get when I try to run the application:

          Problem configuring HelloDigits: Property Exception component:'dictionary' property:'dictionaryPath' - Bad URL resource:/edu.cmu.sphinx.model.acoustic.ESPAÑOL_H4.Model!/edu/cmu/sphinx/model/acoustic/ESPAÑOL_H4/dict/cmudict.0.6dunknown protocol: resource
          Property Exception component:'dictionary' property:'dictionaryPath' - Bad URL resource:/edu.cmu.sphinx.model.acoustic.ESPAÑOL_H4.Model!/edu/cmu/sphinx/model/acoustic/ESPAÑOL_H4/dict/cmudict.0.6dunknown protocol: resource

          These lines refer to the config.xml file. When I use the WSJ model instead of this Spanish acoustic model, it runs perfectly...

          I trained the model following the steps from the link http://cmusphinx.sourceforge.net/sphinx4/doc/UsingSphinxTrainModels.html

          Any ideas?

          thanks
          Santiago
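
A note on reading this error. The text "unknown protocol: resource" is the standard message java.net.URL produces for a scheme it has no handler for. My understanding, which is an assumption rather than something confirmed in this thread, is that sphinx4 first treats "resource:/<class>!<path>" specially by loading the named class and resolving the path relative to it, and only falls back to a plain URL when that class cannot be loaded. If so, the message usually means the model class named in config.xml is not on the classpath. The sketch below separates the two parts so each can be checked; ResourceUrlCheck and the example location are hypothetical.

```java
import java.net.MalformedURLException;
import java.net.URL;

public class ResourceUrlCheck {

    /** Splits "resource:/<class>!<path>" into {class, path}; null for any other shape. */
    static String[] split(String location) {
        String prefix = "resource:/";
        int bang = location.indexOf('!');
        if (!location.startsWith(prefix) || bang < prefix.length()) {
            return null;
        }
        return new String[] {
            location.substring(prefix.length(), bang),
            location.substring(bang + 1)
        };
    }

    /** Reports whether the named class can be loaded from the current classpath. */
    static boolean classIsLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Returns the error java.net.URL gives for the string, or null if it parses. */
    @SuppressWarnings("deprecation")
    static String plainUrlError(String location) {
        try {
            new URL(location);
            return null;
        } catch (MalformedURLException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Hypothetical location; pass the value from your own config.xml as an argument.
        String location = args.length > 0
            ? args[0]
            : "resource:/com.example.acoustic.Model!/dict/words.dict";
        String[] parts = split(location);
        if (parts == null) {
            System.out.println("Not a resource:/class!path location");
            return;
        }
        System.out.println("class:     " + parts[0]);
        System.out.println("path:      " + parts[1]);
        System.out.println("loadable:  " + classIsLoadable(parts[0]));
        System.out.println("URL error: " + plainUrlError(location));
    }
}
```

Run it with the model jar on the classpath; if "loadable" prints false, the jar is missing from the classpath or the class name in config.xml does not match the one inside the jar.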

           
          • Santiago Brandi

            Santiago Brandi - 2008-06-17

            Well, I repeated the whole process and now it works; I was obviously doing something wrong...

            Now the problem is that the recognition accuracy is really poor. I read something about some parameters needing to be changed...

            If someone has worked that out, I would appreciate a hint!

            Thanks a lot!
            santiago

             
    • Santiago Brandi

      Santiago Brandi - 2008-06-28

      Hi! It works, thanks a lot! Now the application recognizes Spanish words with remarkable accuracy!

      Now I've encountered a new kind of problem: using this S3FeatureExtractor, the application recognizes only one word at a time. For example, if I say "abrir puerta", it only returns "abrir", and if I say the same word twice, it only returns it once...

      I've been looking through the code of DeltasFeatureExtractor and S3FeatureExtractor, guessing the problem was in the time window size, but I extended it as much as I could and the results are the same. Moreover, when I try to impose some grammar rules, like the ones in the HelloWorld demo, in which words must follow a determined order, the program keeps loading and loading and never starts...

      Do you know something about this?

      thanks again!
      Santiago

       
      • Nickolay V. Shmyrev

        It's a restriction of your grammar or language model. It's not related to features at all.

         
