How do I integrate the Voxforge Spanish model with pocketsphinx for Spanish voice recognition?
Hi Brayan,
To use models for another language you just need to point the decoder configuration to them. You need to point it to both the acoustic model and the dictionary.
You can do the same from source code when you create the configuration object.
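A sketch of what creating that configuration object looks like with the classic pocketsphinx C API (the paths are the Spanish Voxforge files discussed in this thread; adjust them to wherever you extracted the model):

```c
#include <pocketsphinx.h>

int main(void)
{
    /* Point the decoder at the acoustic model directory, language model,
     * and pronunciation dictionary. Paths here are illustrative. */
    cmd_ln_t *config = cmd_ln_init(NULL, ps_args(), TRUE,
        "-hmm",  "voxforge_es_sphinx.cd_cont_1500",           /* acoustic model dir */
        "-lm",   "voxforge_es_sphinx.transcription.test.lm",  /* language model */
        "-dict", "voxforge_es_sphinx.dic",                    /* dictionary */
        NULL);
    if (config == NULL)
        return 1;

    ps_decoder_t *ps = ps_init(config);
    if (ps == NULL)
        return 1;

    /* ... feed audio with ps_process_raw() and read results with ps_get_hyp() ... */

    ps_free(ps);
    cmd_ln_free_r(config);
    return 0;
}
```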
voxforge_es_sphinx.cd_cont_1500 is a directory, right?
When I execute pocketsphinx_continuous -hmm voxforge_es_sphinx.cd_cont_1500 -dict voxforge_es_sphinx.dic -jsgf your.gram I get the following error:
ERROR "fe_interface.c", line 109: FFT: Number of points must be greater or equal to frame size (409 samples)
Remove the -nfft 256 line from the feat.params file in the model folder.
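One way to do that from the shell (the model path is the one used in the command above; keeping a backup is a precaution, not part of the original advice):

```shell
# Path to the extracted acoustic model folder (adjust to your install)
MODEL=voxforge_es_sphinx.cd_cont_1500

# Keep a backup, then delete the "-nfft 256" line from feat.params
cp "$MODEL/feat.params" "$MODEL/feat.params.bak"
sed -i '/-nfft/d' "$MODEL/feat.params"
```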
Thank you!
Hi,
Is the Voxforge model trained on all of the data mentioned at this path http://www.repository.voxforge1.org/downloads/es/Trunk/Audio/Main/8kHz_16bit (in build.sh),
or just on some test database, meaning we need to run build.sh as mentioned in the readme? (voxforge-es-0.1.1)
"To setup the files use build.sh script as a base. It should download
required files from Voxforge, setup structure and extract features.
Scripts are located in scripts subfolder."
Also, this model only provides a test language model, i.e. voxforge_es_sphinx.transcription.test.lm.
To build a serious language model, should I use the transcription file (voxforge_es_sphinx.transcription), or is the language model stored somewhere else?
The model is trained on the data which was available a few years ago. You can find the exact list of utterances in etc/voxforge_es_sphinx.fileids.
If you start the training again, it makes sense to download the additional data added recently. Like I said, you can also use other data besides Voxforge; you need far more data for a good model.
For the language model you need additional data, which you can crawl. For example, you can get some data from Wikipedia or crawl Spanish subtitles. Subtitles are usually a good source of spoken language.
Hi Nickolay,
Thanks for your reply.
There is no file named "etc/voxforge_es_sphinx.fileids", although there is a file named "etc/voxforge_es_sphinx.fileids.train",
so I suppose you are referring to that file.
etc/voxforge_es_sphinx.fileids.train contains 3783 utterances, whereas 13448 utterances are now available for download.
So, is it better to train a model from scratch, or should I adapt the existing one?
Also, how much data is sufficient to build a good language model,
and what should the duration of each utterance be?
Last edit: Avee 2014-06-30
It's better to train from scratch.
These issues are covered in the acoustic model training tutorial:
http://cmusphinx.sourceforge.net/wiki/tutorialam
Hi Nickolay,
This link does talk about the data required for AM training, but not about language model training.
Can you please share some stats on how much data is required for a good language model?
It depends on the complexity of the language you want to recognize. For small domains (10k words in the vocabulary) it's enough to have 1 GB of text. For generic speech (1M words in the vocabulary) people use up to several TB of text.
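As a toy illustration (not from the thread) of why vocabulary size drives how much text you need: one common check is the out-of-vocabulary rate, the fraction of test tokens that never occurred in the language model's training text. The larger the vocabulary you must cover, the more text it takes to push this rate down.

```python
def oov_rate(train_text: str, test_text: str) -> float:
    """Fraction of test tokens that never appear in the training text."""
    vocab = set(train_text.split())
    test_tokens = test_text.split()
    return sum(t not in vocab for t in test_tokens) / len(test_tokens)

# Tiny Spanish example: "perro" was never seen in training,
# so 1 of the 4 test tokens is out of vocabulary.
train = "el gato come pescado el gato duerme"
test = "el perro come pescado"
print(oov_rate(train, test))  # 0.25
```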
OK, thanks Nickolay.
Is it a good idea to use the transcription text from AM training for language model creation?
The Voxforge text is artificial and not very useful. It's better to crawl subtitles.