Hello all,
I am not sure if this is the right place for this topic, so please tell me if
it is not.
I've been playing around with pocketsphinx recently. My goal is to learn how
to use it so that I can perform some model adaptation for speech recognition
in noisy conditions. I started with these models:
pocketsphinx-extra/model/hmm/en_US/hub4_wsj_sc_3s_8k.cd_semi_5000/
which I downloaded from SourceForge as recommended on the web.
Then I tested the models on the cmu_arctic_bdl database downloaded from
festvox.
I trained a language model from the same database using text2wfreq,
text2idngram, idngram2lm and sphinx_lm_convert,
and I used pocketsphinx_batch to run recognition over the whole arctic
database. I evaluated the results using sclite from SCTK.
In the end I got a 25% WER on the database. This is quite a high value, in my
opinion, for a clean database with a tuned language model. Then, when I
perform MLLR+MAP adaptation with 200 sentences, the WER only goes down to 22%
on the same database, which is not the kind of improvement I was expecting.
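To make the numbers above concrete: WER is the number of word substitutions, deletions and insertions needed to turn the hypothesis into the reference, divided by the number of reference words. The sketch below is only a minimal Python illustration of that definition, not what sclite does internally; the function names `wer` and `relative_reduction` are made up for this example.

```python
def wer(reference, hypothesis):
    """Word error rate: (substitutions + deletions + insertions) / reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr[j] = min(
                prev[j] + 1,         # reference word missing from hypothesis (deletion)
                curr[j - 1] + 1,     # extra word in hypothesis (insertion)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(wer_before, wer_after):
    """Relative WER reduction, e.g. from a baseline to an adapted model."""
    return (wer_before - wer_after) / wer_before
```

With these definitions, going from 25% to 22% WER is a 3-point absolute drop, or `relative_reduction(25, 22)`, about 12% relative.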
Has anyone here run a similar experiment who can tell me whether these WER
values are reasonable compared to their own experience? I've been
trying to find out whether I made a mistake, and it seems that I didn't, but I
still find it hard to believe that MAP adaptation gives only this improvement.
Regards,
Jordi Adell
You are welcome to provide the data that will let us reproduce your problem.
Hello,
I'm sorry, I thought I had already given enough information.
For more detail:
* The data I used to test the models can be downloaded from here:
http://festvox.org/cmu_arctic/dbs_bdl.html
I performed MLLR+MAP adaptation with the first 200 sentences and tested on the
whole database (1132 sentences).
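The split described above can be sketched as follows. This is only an illustration under assumptions: the function names, the output file names and the `arctic_a0001`-style utterance IDs are placeholders and are not taken from the scripts in this thread. It writes one utterance ID per line, which is the layout of the control files that pocketsphinx_batch reads through its -ctl option.

```python
def split_fileids(fileids, n_adapt=200):
    """Return (adaptation ids, test ids) for the setup described above."""
    ordered = sorted(fileids)
    adapt = ordered[:n_adapt]  # first n_adapt utterances, used for MLLR+MAP
    test = list(ordered)       # the whole database, so it also contains `adapt`
    return adapt, test


def write_ctl(path, fileids):
    """Write one utterance ID per line (control-file layout)."""
    with open(path, "w") as f:
        for fid in fileids:
            f.write(fid + "\n")


if __name__ == "__main__":
    # Assumed ID pattern: 593 "a" prompts plus 539 "b" prompts = 1132 utterances.
    ids = [f"arctic_a{i:04d}" for i in range(1, 594)]
    ids += [f"arctic_b{i:04d}" for i in range(1, 540)]
    adapt_ids, test_ids = split_fileids(ids)
    write_ctl("adapt.fileids", adapt_ids)  # hypothetical file names
    write_ctl("test.fileids", test_ids)
```

Note that with this setup the 200 adaptation sentences are also part of the 1132 test sentences.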
I downloaded the models from here:
http://cmusphinx.svn.sourceforge.net/viewvc/cmusphinx/trunk/pocketsphinx-extra/?view=tar
as suggested in:
http://cmusphinx.sourceforge.net/wiki/tutorialadapt
Regards
Jordi, the data to reproduce the problem also includes the commands and
scripts you were trying to invoke, the temporary files you created, the output
logs and the result files. It's not only the data files.
If you upload something that I can run with a single script and that shows
what the problem is, that will greatly increase your chances of getting help.
Ah, and SCTK can be downloaded from here:
http://www.itl.nist.gov/iad/mig/tools/
Dear Nickolay,
At first I did not expect this much help; I was just looking for an
opinion on whether the results are reasonable or not. But if you
are kind enough to offer this sort of help, I will provide all the
information. I'll see if I can upload the bash scripts and the code, or at
least the log files.
Thanks in advance.
Here you can download all the data and scripts that I am using:
http://dl.dropbox.com/u/818449/forgehelp.tgz
If the sphinxtrain and pocketsphinx binaries are available you can run the command:
run.sh
and the whole process runs. At the end it will come up with a couple of files
named resultsSource and resultsAdapted.
If you don't have sclite installed, you can have a look at the hypSource and
hypAdapted files.
The logs are published in the log directory.
Regards.