
pocketsphinx and Voxforge Spanish Model Released

Help
2014-05-03
2014-07-04
  • brayanpuerta

    brayanpuerta - 2014-05-03

    How do I integrate the Voxforge Spanish model with pocketsphinx for Spanish voice recognition?

     
  • Nickolay V. Shmyrev

    Hello Brayan,

    To use a model for another language, you just need to point the decoder configuration at it. You need to specify both the acoustic model and the dictionary:

      pocketsphinx_continuous -hmm voxforge_es_sphinx.cd_cont_1500 -dict voxforge_es_sphinx.dic -jsgf your.gram
    

    You can do the same from source code when you create the configuration object.
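
A minimal sketch of launching the decoder from a script with the same three options. It does not use the pocketsphinx library API or its configuration object; it only assembles the command line shown above so that it can be handed to `subprocess.run`. The helper name `build_decoder_args` is hypothetical.

```python
def build_decoder_args(model_dir, dictionary, grammar):
    """Assemble the pocketsphinx_continuous command line as an argument list.

    model_dir  -- acoustic model folder, passed with -hmm
    dictionary -- pronunciation dictionary, passed with -dict
    grammar    -- JSGF grammar file, passed with -jsgf
    """
    return [
        "pocketsphinx_continuous",
        "-hmm", model_dir,
        "-dict", dictionary,
        "-jsgf", grammar,
    ]


args = build_decoder_args(
    "voxforge_es_sphinx.cd_cont_1500",
    "voxforge_es_sphinx.dic",
    "your.gram",
)
print(" ".join(args))
```

With pocketsphinx installed and the model files in place, `subprocess.run(args)` would start the decoder with this list.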

     
  • brayanpuerta

    brayanpuerta - 2014-05-04

    voxforge_es_sphinx.cd_cont_1500 is a directory, right?

     
  • brayanpuerta

    brayanpuerta - 2014-05-04

    When I execute pocketsphinx_continuous -hmm voxforge_es_sphinx.cd_cont_1500 -dict voxforge_es_sphinx.dic -jsgf your.gram, I get the following error:

    ERROR "fe_interface.c", line 109: FFT: Number of points must be greater or equal to frame size (409 samples)

     
  • Nickolay V. Shmyrev

    Remove the -nfft 256 line from the feat.params file in the model folder.
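
The edit can be done by hand in any text editor. As a rough sketch, the same change can also be scripted. The helper name `drop_option` is hypothetical, and the demo file below uses made-up contents (only the `-nfft 256` line comes from this thread); a real feat.params will contain whatever options the model was trained with.

```python
import tempfile
from pathlib import Path


def drop_option(params_path, option="-nfft"):
    """Delete every line that sets `option` in a feat.params-style file.

    Each line is expected to look like "-name value". Returns the removed lines.
    """
    path = Path(params_path)
    kept, removed = [], []
    for line in path.read_text().splitlines():
        fields = line.split()
        if fields and fields[0] == option:
            removed.append(line)
        else:
            kept.append(line)
    path.write_text("\n".join(kept) + "\n")
    return removed


# Demo on a temporary file with hypothetical contents.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "feat.params"
    demo.write_text("-lowerf 130\n-nfft 256\n-upperf 3700\n")
    removed = drop_option(demo)
    remaining = demo.read_text()

print(removed)
```

Keeping a backup copy of feat.params before rewriting it is a sensible precaution.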

     
  • brayanpuerta

    brayanpuerta - 2014-05-05

    thank you!

     
  • Avee

    Avee - 2014-06-27

    Hi,

    Is the Voxforge model trained on the whole database at this path (from build.sh): http://www.repository.voxforge1.org/downloads/es/Trunk/Audio/Main/8kHz_16bit

    Or was it trained only on some test data, so that we need to run build.sh as mentioned in the readme (voxforge-es-0.1.1)?

    "To setup the files use build.sh script as a base. It should download
    required files from Voxforge, setup structure and extract features.
    Scripts are located in scripts subfolder."

    Also, this model only provides a test language model, i.e. voxforge_es_sphinx.transcription.test.lm.

    To build a serious language model, should I use the transcription file, i.e. voxforge_es_sphinx.transcription, or is a language model stored somewhere else?

     
  • Nickolay V. Shmyrev

    Is the Voxforge model trained on the whole database at this path

    The model was trained on the data that was available a few years ago. You can find the exact list of utterances in etc/voxforge_es_sphinx.fileids.

    If you start the training again, it makes sense to download the data that has been added recently. As I said, you can also use other data besides Voxforge; you need far more data for a good model.

    To build a serious language model, should I use the transcription file, i.e. voxforge_es_sphinx.transcription, or is a language model stored somewhere else?

    For a language model you need additional text, which you can crawl. For example, you can get some data from Wikipedia or crawl Spanish subtitles. Subtitles are usually a good source of spoken language.
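
As an illustration of preparing crawled subtitles, here is a rough sketch that turns SubRip (.srt) text into plain lowercase lines, one per subtitle line. The choice of the .srt format, the helper name `srt_to_lines`, and the normalization rules (drop markup and punctuation, lowercase everything) are assumptions made for this example, not something this thread prescribes.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:03,000".
TIMESTAMP = re.compile(r"\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}")
# Matches simple inline markup such as <i> ... </i>.
TAG = re.compile(r"<[^>]+>")


def srt_to_lines(srt_text):
    """Return the spoken text of an SRT document as normalized lines."""
    lines = []
    for raw in srt_text.splitlines():
        line = raw.strip()
        # Skip blanks, cue counters and timing lines.
        # Note: a subtitle line made only of digits is dropped as well.
        if not line or line.isdigit() or TIMESTAMP.match(line):
            continue
        line = TAG.sub("", line)
        # Replace punctuation with spaces; accented letters are kept.
        line = re.sub(r"[^\w\s]", " ", line).lower()
        line = " ".join(line.split())
        if line:
            lines.append(line)
    return lines


sample = (
    "1\n00:00:01,000 --> 00:00:03,000\n<i>¿Cómo estás?</i>\n\n"
    "2\n00:00:03,500 --> 00:00:05,000\nMuy bien, gracias.\n"
)
cleaned = srt_to_lines(sample)
print(cleaned)
```

The resulting lines could then be collected into a text corpus for whichever language modeling toolkit is used.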

     
  • Avee

    Avee - 2014-06-30

    Hi Nickolay,

    Thanks for your reply.

    There is no file named "etc/voxforge_es_sphinx.fileids", although there is a file named "etc/voxforge_es_sphinx.fileids.train".
    So, I suppose you are referring to that file.

    etc/voxforge_es_sphinx.fileids.train contains 3783 utterances, whereas there are now 13448 utterances available for download.

    So, is it better to train a model from scratch, or should I adapt the existing one?

    Also, how much data is sufficient to build a good language model?

    And what should the duration of each utterance be?

     

    Last edit: Avee 2014-06-30
  • Nickolay V. Shmyrev

    So, is it better to train a model from scratch, or should I adapt the existing one?

    It's better to train from scratch.

    Also, how much data is sufficient to build a good language model?
    And what should the duration of each utterance be?

    These issues are covered in the acoustic model training tutorial:

    http://cmusphinx.sourceforge.net/wiki/tutorialam

     
  • Avee

    Avee - 2014-07-01

    Hi Nickolay,

    This link does talk about the data required for AM training, but not about language model training.
    Can you please share some stats about how much data is required for a good language model?

     
    • Nickolay V. Shmyrev

      It depends on the complexity of the language you want to recognize. For small domains (10k words in the vocabulary), it's enough to have 1 GB of text. For generic speech (1M words in the vocabulary), people use up to several TB of text.
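
To see where a text collection stands relative to these figures, its size and vocabulary can be measured directly. The sketch below assumes plain whitespace tokenization and lowercase folding, which is a simplification; the helper name `corpus_stats` is hypothetical.

```python
from collections import Counter


def corpus_stats(lines):
    """Count UTF-8 bytes, tokens and distinct words in an iterable of text lines."""
    counts = Counter()
    n_bytes = 0
    for line in lines:
        n_bytes += len(line.encode("utf-8"))
        counts.update(line.lower().split())
    return {
        "bytes": n_bytes,
        "tokens": sum(counts.values()),
        "vocabulary": len(counts),
    }


stats = corpus_stats(["hola mundo", "hola a todos"])
print(stats)
```

For a real corpus, an open file object can be passed instead of a list, so the text is read line by line rather than loaded into memory at once.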

       
  • Avee

    Avee - 2014-07-04

    OK thanks Nickolay.

    Is it a good idea to use the transcription text from AM training for language model creation?

     
    • Nickolay V. Shmyrev

      Voxforge text is artificial and not very useful. It's better to crawl subtitles.

       
