How do I integrate the Voxforge Spanish model with pocketsphinx for Spanish voice recognition?
Hi Brayan,
To use models for another language you just need to point the decoder configuration to them. You need to point it to both the acoustic model and the dictionary.
You can do the same from source code when you create the configuration object.
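A sketch of what creating that configuration object looks like with the classic pocketsphinx C API (the paths are the Spanish Voxforge files discussed in this thread; adjust them to wherever you extracted the model):

```c
#include <pocketsphinx.h>

int main(void)
{
    /* Point the decoder at the acoustic model directory, language model,
     * and pronunciation dictionary. Paths here are illustrative. */
    cmd_ln_t *config = cmd_ln_init(NULL, ps_args(), TRUE,
        "-hmm",  "voxforge_es_sphinx.cd_cont_1500",           /* acoustic model dir */
        "-lm",   "voxforge_es_sphinx.transcription.test.lm",  /* language model */
        "-dict", "voxforge_es_sphinx.dic",                    /* dictionary */
        NULL);
    if (config == NULL)
        return 1;

    ps_decoder_t *ps = ps_init(config);
    if (ps == NULL)
        return 1;

    /* ... feed audio with ps_process_raw() and read results with ps_get_hyp() ... */

    ps_free(ps);
    cmd_ln_free_r(config);
    return 0;
}
```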
voxforge_es_sphinx.cd_cont_1500 is a directory, right?
When I execute pocketsphinx_continuous -hmm voxforge_es_sphinx.cd_cont_1500 -dict voxforge_es_sphinx.dic -jsgf your.gram I get the following error:
ERROR "fe_interface.c", line 109: FFT: Number of points must be greater or equal to frame size (409 samples)
Remove the -nfft 256 line from the feat.params file in the model folder.
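One way to do that from the shell (the model path is the one used in the command above; keeping a backup is a precaution, not part of the original advice):

```shell
# Path to the extracted acoustic model folder (adjust to your install)
MODEL=voxforge_es_sphinx.cd_cont_1500

# Keep a backup, then delete the "-nfft 256" line from feat.params
cp "$MODEL/feat.params" "$MODEL/feat.params.bak"
sed -i '/-nfft/d' "$MODEL/feat.params"
```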
Thank you!
Hi,
Is the Voxforge model trained on all of the data mentioned at this path http://www.repository.voxforge1.org/downloads/es/Trunk/Audio/Main/8kHz_16bit (in build.sh),
or just on some test database, meaning we need to run build.sh as mentioned in the readme? (voxforge-es-0.1.1)
"To setup the files use build.sh script as a base. It should download
required files from Voxforge, setup structure and extract features.
Scripts are located in scripts subfolder."
Also, this model only provides a test language model, i.e. voxforge_es_sphinx.transcription.test.lm.
To build a serious language model, should I use the transcription file (voxforge_es_sphinx.transcription), or is the language model stored somewhere else?
The model is trained on the data which was available a few years ago. You can find the exact list of utterances in etc/voxforge_es_sphinx.fileids.
If you start the training again, it makes sense to download the additional data added recently. Like I said, you can also use other data besides Voxforge; you need far more data for a good model.
For the language model you need additional data, which you can crawl. For example, you can get some data from Wikipedia or crawl Spanish subtitles. Subtitles are usually a good source of spoken language.
Hi Nickolay,
Thanks for your reply.
There is no file named "etc/voxforge_es_sphinx.fileids", although there is a file named "etc/voxforge_es_sphinx.fileids.train",
so I suppose you are referring to that file.
etc/voxforge_es_sphinx.fileids.train contains 3783 utterances, whereas 13448 utterances are now available for download.
So, is it better to train a model from scratch, or should I adapt the existing one?
Also, how much data is sufficient to build a good language model,
and what should the duration of each utterance be?
Last edit: Avee 2014-06-30
It's better to train from scratch.
These issues are covered in the acoustic model training tutorial:
http://cmusphinx.sourceforge.net/wiki/tutorialam
Hi Nickolay,
This link does talk about the data required for AM training, but not about language model training.
Can you please share some stats on how much data is required for a good language model?
It depends on the complexity of the language you want to recognize. For small domains (10k words in the vocabulary) it's enough to have 1 GB of text. For generic speech (1M words in the vocabulary) people use up to several TB of text.
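As a toy illustration (not from the thread) of why vocabulary size drives how much text you need: one common check is the out-of-vocabulary rate, the fraction of test tokens that never occurred in the language model's training text. The larger the vocabulary you must cover, the more text it takes to push this rate down.

```python
def oov_rate(train_text: str, test_text: str) -> float:
    """Fraction of test tokens that never appear in the training text."""
    vocab = set(train_text.split())
    test_tokens = test_text.split()
    return sum(t not in vocab for t in test_tokens) / len(test_tokens)

# Tiny Spanish example: "perro" was never seen in training,
# so 1 of the 4 test tokens is out of vocabulary.
train = "el gato come pescado el gato duerme"
test = "el perro come pescado"
print(oov_rate(train, test))  # 0.25
```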
OK, thanks Nickolay.
Is it a good idea to use the transcription text from AM training for language model creation?
The Voxforge text is artificial and not very useful. It's better to crawl subtitles.