Hello all,
Thanks for all the wonderful work!
I've downloaded and built sphinx3 0.6.3 on Ubuntu 6.06. I trimmed the dictionary down to just a few words and I get great accuracy when running the sphinx3-simple script (this runs the sphinx3-livedecode executable). My plan is to write up a step-by-step tutorial so that students can install and use this for robotics input. It looks like I want to run the sphinx3_continuous app - but I don't understand the command line args it needs. I've dug into the documentation and searched the web, but I can't find an example of what the cfg, raw, and ctrl files should be set to.
Could anyone provide or point me to a command line example for sphinx3_continuous that uses the files provided in the sphinx3-0.6.3.tar.gz download?
Any help is greatly appreciated!
Thanks,
Ed
Dear Sir/Madam,
I am a student of ECE at Delhi College of Engineering.
I intend to develop voice command recognition hardware based on the BF531.
I am trying to use PocketSphinx as the platform for my speech recognition, but I am having difficulty finding proper documentation or a tutorial.
Any kind of help would be welcome.
You can visit http://ketanbj.googlepages.com for my previous work.
regards
Ketan bhardwaj
ketanbj@gmail.com
Take a look at livepretend. It will give you some idea of how this works.
An example for livepretend can be found in the test directory. Running make test-livepretend will do a lot of the work for you. -a
Thanks for the quick reply, Arthur!
It looks like the 3 command line args are:
ctrlfile - file with list of input files (batch)
rawdir - directory where the files in the above list are to be found.
cfgfile - file with config params.
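For anyone following along, here is a rough Python sketch of how those three arguments fit together. It only writes a control file and assembles the argument list; it does not run the decoder. The argument order is taken from the description above rather than from the sphinx3 documentation, so treat it as an assumption, and the an4.cfg path is a placeholder for whatever config file you actually use.

```python
import tempfile
from pathlib import Path


def write_control_file(ctl_path, utterance_ids):
    """Write one utterance ID per line, without a file extension."""
    Path(ctl_path).write_text("".join(f"{u}\n" for u in utterance_ids))


def build_command(executable, ctl_path, raw_dir, cfg_path):
    """Assemble argv in the order described in the post:
    ctrlfile, rawdir, cfgfile (an assumption, not checked against the docs)."""
    return [executable, str(ctl_path), str(raw_dir), str(cfg_path)]


# Demo in a scratch directory; nothing is executed, only printed.
work = Path(tempfile.mkdtemp())
ctl = work / "an4.ctl"
write_control_file(ctl, ["pittsburgh.littleendian"])
cmd = build_command("sphinx3_livepretend", ctl, work, work / "an4.cfg")
print(" ".join(cmd))
```

The printed line can be compared against what the test script in the distribution actually runs before relying on it.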
So first I tried just changing the executable from sphinx3_livepretend to sphinx3_continuous in the script file test-livecontinuous.sh. This runs - but without the high accuracy of the livepretend program - not sure why there is a difference - but some chars of P I T T S B U R G H are decoded correctly.
But I'm trying to get mic input fed into sphinx3_continuous. So I assumed I want to specify /dev/dsp instead of pittsburgh.littleendian in the an4.ctl file. I tried this, but of course sphinx3_continuous appends '.raw', so it can't find /dev/dsp.raw to open it! So I created a symbolic link file 'x.raw' that points to /dev/dsp, and I edited an4.ctl so now it just has 'x' in it.
This runs, but I get no decoding when I speak into the microphone. I get great decoding when I run sphinx3-simple, however.
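For reference, the symlink workaround just described can be reproduced with a few lines of Python. This is only a sketch of the setup (a link named x.raw pointing at /dev/dsp, plus a control file containing 'x'); the helper name link_as_raw is made up for illustration, and, as noted above, this setup ran but did not produce any decodes.

```python
import os
import tempfile
from pathlib import Path


def link_as_raw(raw_dir, utterance_id, target):
    """Create raw_dir/<utterance_id>.raw as a symlink to target; return the link path."""
    link = Path(raw_dir) / f"{utterance_id}.raw"
    # Replace any earlier link or file with the same name.
    if link.is_symlink() or link.exists():
        link.unlink()
    os.symlink(target, link)
    return link


# Demo in a scratch directory: x.raw -> /dev/dsp, control file lists 'x'.
work = Path(tempfile.mkdtemp())
link = link_as_raw(work, "x", "/dev/dsp")
(work / "an4.ctl").write_text("x\n")
print(link, "->", os.readlink(link))
```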
Am I on the right track? Any ideas what I'm missing here?
Sure would be nice to have a script that would run sphinx3_continuous and have it take mic input and output strings to stdout. I assume that is the way sphinx3_continuous is supposed to work. (?)
Sorry for the wordy post, but I'm trying to detail where I'm at and what I've tried so far.
Any additional input is appreciated!
Thanks,
Ed
What happens is that the language model in the test directory is a small model made specifically for alphabet recognition. The model assumes an alphabet-loop structure, which is very inaccurate in general. (The alphabet is confusing even for humans.)
So my suggestion is to first decide what kinds of things you want to say to the recognizer. Then try to constrain the recognizer to search for those types of words.
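As a small illustration of constraining the vocabulary, here is a Python sketch that trims a CMU-style pronunciation dictionary (one "WORD  PHONES..." entry per line, with alternates written as WORD(2)) down to a chosen word list. The robot command words and the sample entries are made up for the example, and the language model would presumably also need to be restricted to the same words; this sketch only handles the dictionary.

```python
import re


def base_word(entry):
    """Strip an alternate-pronunciation suffix such as '(2)' from a headword."""
    return re.sub(r"\(\d+\)$", "", entry)


def trim_dictionary(lines, keep_words):
    """Return only the dictionary lines whose headword is in keep_words."""
    keep = {w.upper() for w in keep_words}
    kept = []
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if base_word(parts[0]).upper() in keep:
            kept.append(line.rstrip("\n"))
    return kept


# Hypothetical sample entries; a real run would read the dictionary file instead.
sample = [
    "FORWARD  F AO R W ER D",
    "LEFT  L EH F T",
    "PITTSBURGH  P IH T S B ER G",
    "RIGHT  R AY T",
    "STOP  S T AA P",
    "THE  DH AH",
    "THE(2)  DH IY",
]
for entry in trim_dictionary(sample, ["left", "right", "stop", "forward"]):
    print(entry)
```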
If you just want to use a generic LM, check out the cmusphinx.org page for resources for building a speech recognizer.
Another note: even with the resources we provide, building highly accurate speech recognition (especially an interactive one) is still very difficult in practice. So be careful when setting your expectations. ;-)
-tgj