Hello,
I'm trying to use sphinx4 with the ITESM h4 model for Spanish. I'm modifying the "wavfile" demo to recognize 3 keywords in wav files.
The models are here: http://www.speech.cs.cmu.edu/sphinx/models/hub4spanish_itesm/
But these models are in sphinxTrain format. For sphinx4 I have to make a .jar file with the model. I did this following this link: http://cmusphinx.sourceforge.net/sphinx4/doc/UsingSphinxTrainModels.html
I have changed some parameters in config.xml, but I don't know if I'm doing it right.
When I run the program, it keeps running for a long time and no result is returned.
What am I doing wrong?
These are my files: http://www.megaupload.com/?d=BP3M2CLG
Thanks a lot
Hi, I'm working with Sphinx-4 and I need to configure it for Spanish. When I run my program I get this error:
Loading Recognizer...
Exception in thread "main" java.lang.NullPointerException
at edu.cmu.sphinx.util.props.SaxLoader.load(SaxLoader.java:64)
at edu.cmu.sphinx.util.props.ConfigurationManager.loader(ConfigurationManager.java:383)
at edu.cmu.sphinx.util.props.ConfigurationManager.<init>(ConfigurationManager.java:115)
at demo.sphinx.wavfile.WavFile.main(WavFile.java:60)
Could someone help me?
Thanks a lot.
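A note on the trace above: the thread never states the cause, but a NullPointerException inside SaxLoader.load is consistent with a null configuration URL being passed to ConfigurationManager, which is what Class.getResource returns when config.xml is not on the classpath next to the class. That diagnosis is an assumption. The sketch below uses only the JDK; ConfigLocator and requireResource are hypothetical names, not part of Sphinx-4.

```java
import java.net.URL;

// Hypothetical helper, not part of Sphinx-4: fail early with a readable
// message instead of passing a null URL further down.
class ConfigLocator {

    // Looks up a classpath resource relative to 'owner' and rejects null.
    static URL requireResource(Class<?> owner, String name) {
        URL url = owner.getResource(name);
        if (url == null) {
            throw new IllegalStateException(
                "Resource '" + name + "' not found on the classpath relative to "
                + owner.getName());
        }
        return url;
    }

    public static void main(String[] args) {
        try {
            URL configUrl = requireResource(ConfigLocator.class, "config.xml");
            // In the demo, this URL would then be handed to ConfigurationManager.
            System.out.println("Found configuration at " + configUrl);
        } catch (IllegalStateException e) {
            System.err.println(e.getMessage());
        }
    }
}
```

If the check fails, the usual things to verify are that config.xml was copied next to the compiled class (or into the jar) and that the file name matches exactly.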
I was unable to download the files. Could you use another host, say mediafire.com, instead?
The biggest problem is that the Spanish models use the s3_1x39 feature set, so you have to use a different feature extraction class in the frontend (S3FeatureExtractor). The rest should be quite standard.
About your question on 256M: well, it's quite standard. Remember that there is always a swap file, and you can even pass -Xmx512m; that doesn't mean Java will actually use 512m. After all, it's Java.
About your task, I'm not quite sure why you want to set up sphinx4; I don't think it will bring you anything new.
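To make the -Xmx point concrete, here is a small sketch using only the standard Runtime API. It prints the heap ceiling next to what the JVM has reserved and what is in use. HeapReport and toMiB are hypothetical names chosen for this illustration.

```java
// Hypothetical demo class: compares the heap ceiling with actual usage.
class HeapReport {

    // Converts a byte count to whole mebibytes.
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long max = rt.maxMemory();               // ceiling, controlled by -Xmx
        long committed = rt.totalMemory();       // heap currently reserved by the JVM
        long used = committed - rt.freeMemory(); // part of it that holds objects

        System.out.println("max heap (MiB):       " + toMiB(max));
        System.out.println("committed heap (MiB): " + toMiB(committed));
        System.out.println("used heap (MiB):      " + toMiB(used));
    }
}
```

Running it with java -Xmx512m typically shows a maximum near 512 MiB while the committed and used figures stay far lower for a small program, since -Xmx sets a limit rather than an allocation.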
> The biggest problem is that the Spanish models use the s3_1x39 feature set, so you have to use a different feature extraction class in the frontend (S3FeatureExtractor). The rest should be quite standard.
Hi, I'm sorry to keep asking for help...
I have my application running with the ITESM Spanish models but the recognition is totally null. In the first place, I couldn't find info about how to use the H4.arpa.Z.DMP file, and I also had no idea about the s3_1x39 feature...
How do I get or create that different feature extraction class in the frontend?
sorry for my ignorance..
thanks for your help!
santiago
> I have my application running with the ITESM Spanish models but the recognition is totally null. In the first place, I couldn't find info about how to use the H4.arpa.Z.DMP file, and I also had no idea about the s3_1x39 feature...
There must be a different problem. First of all, don't use H4.arpa.Z.DMP, because it's most probably not suitable for your task. Second, to use s3_1x39, choose S3FeatureExtractor in the frontend. If you still have trouble, please give a link to your file and its transcription.
Thank you for your answer!
I expect the results will be similar. The reason is that I think it's easier for me to write an application with a simple user interface with sphinx4 than in C. I don't have much time for my project and I should have something working even if the results aren't perfect.
My files: http://www.mediafire.com/?hkxkatyjbty
Thank you again.
Nickolay,
I'm trying but I get nothing. Can you provide a config.xml file that works for my test?
Thank you very much.
Well, I managed to make it work. On the way I had to fix a bug in sphinx4. Check my files here:
http://www.mediafire.com/?nizfvxxesg9
You have to check out the latest sphinx4 svn and apply the attached patch. It's still very slow and not optimal for keyword spotting; as I said, we ought to try another search algorithm.
Sorry
Which svn subcommand should I use to apply the patch? Is the patch file sphinx4_noloop.diff?
Thanks
cp sphinx4_noloop.diff sphinx4
cd sphinx4
patch -p0 < sphinx4_noloop.diff
Alternatively, you can just open the patch in a text editor and make the changes by hand. man patch can also be helpful.
Hi, I downloaded the acoustic models from the same link and also trained the models following the steps in the other link you mentioned. When I try to run the application an error pops up, something about a bad URL in the config.xml file, in the dictionary configuration. It seems it doesn't recognize the JAR I created, or something like that; I really don't know.
If someone has an idea of what may be happening, or has managed to make sphinx 4 run with Spanish words, I would really appreciate a hand.
thanks a lot!
excuse me for my english...
Santiago
> When I try to run the application an error pops up, something about a bad URL in the config.xml file, in the dictionary configuration. It seems it doesn't recognize the JAR I created, or something like that; I really don't know.
Learn to paste the errors when you report them first. It's a trivial thing you must understand first. We'll translate it for you if you can't do it yourself.
Hi, this is what I get when I try to run the application:
Problem configuring HelloDigits: Property Exception component:'dictionary' property:'dictionaryPath' - Bad URL resource:/edu.cmu.sphinx.model.acoustic.ESPAÑOL_H4.Model!/edu/cmu/sphinx/model/acoustic/ESPAÑOL_H4/dict/cmudict.0.6dunknown protocol: resource
Property Exception component:'dictionary' property:'dictionaryPath' - Bad URL resource:/edu.cmu.sphinx.model.acoustic.ESPAÑOL_H4.Model!/edu/cmu/sphinx/model/acoustic/ESPAÑOL_H4/dict/cmudict.0.6dunknown protocol: resource
These lines refer to the config.xml file. When I use the wsj model instead of this Spanish acoustic model, it runs perfectly...
I trained the model following the steps from the link http://cmusphinx.sourceforge.net/sphinx4/doc/UsingSphinxTrainModels.html
any ideas ?
thanks
Santiago
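One detail in the error above can be checked independently of Sphinx-4: the trailing text "unknown protocol: resource" is the message java.net.URL produces for a scheme it has no handler for. The sketch below only demonstrates that JDK behaviour. Why the resource:/ location reached plain URL parsing in this setup is not stated in the thread, so no root cause is claimed here. ResourceSchemeDemo and parseError are hypothetical names, and the paths are placeholders.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical demo class: shows where "unknown protocol: resource" comes from.
class ResourceSchemeDemo {

    // Returns the parse error for 'spec', or null if java.net.URL accepts it.
    // new URL(String) is deprecated in recent JDKs but still behaves this way.
    @SuppressWarnings("deprecation")
    static String parseError(String spec) {
        try {
            new URL(spec);
            return null;
        } catch (MalformedURLException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // "resource" is not a scheme the JDK knows, so parsing fails.
        System.out.println(parseError("resource:/some/path/dict"));
        // "file" is a built-in scheme, so parsing succeeds and null is printed.
        System.out.println(parseError("file:/some/path/dict"));
    }
}
```

A resource:/ location therefore has to be resolved by something other than java.net.URL, for example through a class loader lookup, before the file behind it can be opened.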
Well, I repeated the whole process again and now it works; I was obviously doing something wrong...
Now the problem I have is that the recognition accuracy is really poor. I read something about some parameters that need to be changed...
If someone has worked that out I would appreciate a hint!
Thanks a lot!
santiago
Hi! It works, thanks a lot! Now the application recognizes Spanish words with remarkable accuracy!
Now I've encountered a new kind of problem: using this S3FeatureExtractor, the application recognizes only one word at a time. For example, if I say "abrir puerta", it only returns "abrir", or if I say the same word twice, it only returns it once...
I've been looking at the code of deltaFeatureExtractor and S3FeatureExtractor, guessing the problem was in the time window size, but I extended it as much as I could and the results are the same. Moreover, when I try to impose some grammar rules, like the ones in the HelloWorld demo, in which words must follow a given order, the program keeps loading and loading and never starts...
Do you know something about this?
thanks again!
Santiago
It's the restriction of your grammar or language model. It's not related to features at all.