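Hello speech recognizers
I have successfully built Sphinx 3 0.8, sphinxbase 0.6, and SphinxTrain 1.0 on
my Mac OS X 10.6 machine. The tutorial with the an4 database works. Now, I'm
trying to train/decode my own database. First off, I thought it would be a
good idea to modify the tutorial accordingly, instead of starting from
scratch.
Things seem to go well until I encounter the same problem during decode as in
https://sourceforge.net/projects/cmusphinx/forums/forum/5471/topic/3757326: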
Can't open /Users/me/Documents/project1/result/project1-1-1.match
word_align.pl failed with error code 65280 at scripts_pl/decode/slave.pl line
173.
Indeed, that file doesn't exist. So I opened the project1.html file to see
what went wrong, and it says:
Decoding 3 segments starting at 0 (part 1 of 1)
sphinx3_decode Log File
This step had 1 ERROR messages and 0 WARNING messages. Please check the log
file for details.
So I checked the log file, which reads:
SYSTEM_ERROR: "lm_3g_dmp.c", line 1270:
fopen(/Users/me/Documents/project1/etc/project1.ug.lm.DMP,rb) failed
; No such file or directory
which is true, because that file doesn't exist either. So I realized that the
language model files an4.ug.lm* actually readily ship with the an4 tarball.
Hence, I thought, I need to build my own language model files for my database.
So I went to http://www.speech.cs.cmu.edu/sphinxman/fr4.html to see what the general procedure would be
for training from scratch, and I found that a model needs to be built with a
tool called "mk_model_def". Curiously, neither its binary nor source code
exists on my system. In the SphinxTrain binary folder, all tools are there,
except mk_model_def.
What the hell is wrong with my installation? How do I build the language model
files?
Reading an outdated manual was the wrong decision.
There is nothing wrong with your installation, I suppose. You need to build a
language model or download an existing one.
See documentation:
http://cmusphinx.sourceforge.net/wiki/languagemodelhowto
So I can choose between outdated and incomplete documentation?
As I wrote, I'm using Sphinx 3, not PocketSphinx or Sphinx4 (which is what the
"documentation" you linked is written for, it says).
Also, I need to be able to build models offline, hence lmtool is not an
option.
We are trying to keep our documentation up to date, but it's a hard task. The
most reliable sources are, of course, our website
http://cmusphinx.sourceforge.net and the
source tarballs. All other sources are often outdated and cannot be trusted.
Sphinx 3 is not the best choice either. You can read about the versions here:
http://cmusphinx.sourceforge.net/wiki/versions
In short, if you don't know how to build a language model, sphinx3 is not for
you.
The page above lists several options for building a model offline. Read it
carefully.
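For readers wondering what "building a language model offline" amounts to in the simplest case, here is a minimal sketch in Python that writes a unigram model in the ARPA text format from a list of transcripts. It is not the procedure from the linked page. The function name build_unigram_arpa, the toy transcripts, and the choice to count the sentence markers <s> and </s> as ordinary words are all assumptions made for illustration; check them against what your decoder expects.

```python
import math
from collections import Counter


def build_unigram_arpa(transcripts):
    """Return a unigram language model in ARPA text format.

    transcripts: iterable of strings, one utterance each, without
    sentence markers. Probabilities are plain relative frequencies
    (no smoothing), written as log10 values.
    """
    counts = Counter()
    for line in transcripts:
        # Assumption: sentence markers are counted like ordinary words.
        counts.update(["<s>"] + line.split() + ["</s>"])

    total = sum(counts.values())
    lines = ["\\data\\", "ngram 1=%d" % len(counts), "", "\\1-grams:"]
    for word in sorted(counts):
        lines.append("%.4f %s" % (math.log10(counts[word] / total), word))
    lines += ["", "\\end\\"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical transcripts: one class label per recording.
    print(build_unigram_arpa(["BEAR", "BIRD", "BEAR"]))
```

The log above shows the decoder looking for a binary file named project1.ug.lm.DMP, so a text model like this would still need to be converted to the DMP format with the converter shipped with your Sphinx installation. Which converter that is depends on the version, so verify it locally.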
Hm, interesting. Let me ask a more fundamental question then.
I've got hundreds of animal sound recordings. Each audio file contains one
sound with trailing silences. Bears, birds, horses, cows. No lingual phones,
no grammar logic, no continuous speech, just hundreds of separated animal
sounds from a few species. How would I (how would YOU?) go about telling them
apart with Sphinx? Using Sphinx 4? The time/work required to get acquainted
with S4 seems unreasonable, that's why I chose Sphinx 3 initially.
The CMUSphinx project provides two types of recognizer: a fast one written in C
(pocketsphinx) and an extensible one written in Java (sphinx4). However, both
are highly optimized for recognizing speech.
I don't see how the CMUSphinx toolkit can help you classify animal sounds. I
would rather use a custom feature extractor with a GMM classifier or, even
better, an SVM classifier. You could look at MARF
(http://marf.sourceforge.net) or the LIUM
spkDiarization toolkit.
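To make the "feature extractor plus GMM classifier" suggestion concrete, here is a rough sketch in plain Python. It is deliberately reduced: each class is modelled by a single diagonal-covariance Gaussian, which is the one-component special case of a GMM, and equal class priors are assumed. The feature vectors are hypothetical placeholders standing in for whatever a real feature extractor would produce per recording; none of this comes from MARF or the LIUM toolkit.

```python
import math
from collections import defaultdict


def train(samples, var_floor=1e-3):
    """Fit one diagonal Gaussian per class.

    samples: list of (label, feature_vector) pairs.
    Returns {label: (means, variances)}.
    """
    by_label = defaultdict(list)
    for label, vec in samples:
        by_label[label].append(vec)

    models = {}
    for label, vecs in by_label.items():
        n = len(vecs)
        dim = len(vecs[0])
        means = [sum(v[d] for v in vecs) / n for d in range(dim)]
        # Floor the variances so a constant feature cannot cause division by zero.
        variances = [
            max(sum((v[d] - means[d]) ** 2 for v in vecs) / n, var_floor)
            for d in range(dim)
        ]
        models[label] = (means, variances)
    return models


def log_likelihood(vec, means, variances):
    """Log density of vec under a diagonal Gaussian."""
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
        for x, mu, var in zip(vec, means, variances)
    )


def classify(models, vec):
    """Return the label whose Gaussian gives vec the highest likelihood."""
    return max(models, key=lambda label: log_likelihood(vec, *models[label]))


if __name__ == "__main__":
    # Hypothetical 2-D features, e.g. averaged spectral measurements.
    training = [
        ("bear", (1.0, 0.2)), ("bear", (1.2, 0.1)), ("bear", (0.9, 0.3)),
        ("bird", (5.0, 3.0)), ("bird", (5.5, 2.8)), ("bird", (4.8, 3.1)),
    ]
    models = train(training)
    print(classify(models, (1.1, 0.2)))
    print(classify(models, (5.1, 3.0)))
```

A real GMM would use several Gaussians per class fitted with EM, and would typically score a recording frame by frame and sum the log-likelihoods; the decision rule at the end stays the same.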
I see. However, I would still like to try it with Sphinx. Just to see how well
it competes with other HMM speech recognizers, for my specific task. I have
successfully classified the sounds with others, but I'm looking for
alternatives.
Could you point me to a PocketSphinx
manual/documentation/tutorial/walkthrough? I can't seem to find any.
You can find all required materials and links on our website:
http://cmusphinx.sourceforge.net/wiki
I know those pages. It's just that the tutorials given there are VERY
sparse. They merely explain how to run a simplistic recognition pass based on
predefined and pretrained models, which is of, well, limited use. But how to
actually use PocketSphinx is just not documented... and let's face it,
guessing what methods to call when, in what order, and why, just from the
Doxygen pages is an impossible task, at least if you want/need to know what
you're doing.
Thanks, your comments are important to us. We understand that our
documentation is far from perfect, and we are working on it.
If you have any specific question, feel free to ask.