CMU Sphinx / Forums / Help: PocketSphinx loading time

Anonymous - 2012-02-07

Hello, my English is not very good, sorry for that. I will try to explain
myself. First, a short summary:

Platform: Android + PocketSphinx.
Language Model Parameters: voxforge_es_sphinx.cd_cont_1500 - v.0.1.1.
Dictionary: 244 Words.
Grammar: JSGF.

When I started working with Android + PocketSphinx, the Decoder took about
one to three seconds to load everything using "hub4wsj_sc_8k", and that was
great. Now I am using the VoxForge Language Model Parameters for Spanish and
the Decoder takes about five to seven seconds to load. That is too much time.

I know that my dictionary and JSGF file load very quickly, so the problem must
be in the Language Model Parameters.

The question is: why is there such a difference in loading time?

Thanks in advance.

Regards,

Lilo.

eliasmajic - 2012-02-07

Not really an answer, but you only need to incur those 5-7 seconds once. One
thing I did was hide that delay behind the loading screen users initially see
when running the app.

The Spanish VoxForge audio is also available to train your own model.
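
As a rough illustration of hiding the load time behind a loading screen, here is a minimal sketch in Python (not Android code). `load_decoder` and `DecoderHolder` are hypothetical names, and the `time.sleep` call merely simulates the slow model load; the idea is only that loading starts immediately in the background and the caller blocks only if it asks for the decoder before it is ready.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def load_decoder(delay_s=0.2):
    """Hypothetical stand-in for the slow decoder setup."""
    time.sleep(delay_s)  # simulates the multi-second model load
    return {"ready": True}


class DecoderHolder:
    """Starts loading right away and hands out the decoder once it is ready."""

    def __init__(self, loader=load_decoder):
        self._executor = ThreadPoolExecutor(max_workers=1)
        self._future = self._executor.submit(loader)

    def is_ready(self):
        # True once the background load has finished.
        return self._future.done()

    def get(self, timeout=None):
        # Blocks only if the load has not finished yet.
        return self._future.result(timeout=timeout)

    def close(self):
        self._executor.shutdown(wait=True)


holder = DecoderHolder()
# ... the application would show its loading screen here ...
decoder = holder.get()
holder.close()
```

On Android the same pattern would be expressed with the platform's own background-work facilities; the sketch only shows the control flow.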

Anonymous - 2012-02-07

@eliasmajic thank you very much for your reply. Because of the nature of the
application, I cannot afford the initial seven-second wait. I know that I
could move all the heavy loading into a service started at system boot, to
avoid loading on each launch of the application (this will be my last resort).

About training a new model, I have doubts. I am currently creating the
dictionary and the JSGF grammar dynamically, so I don't know how to train a
model without static content. My knowledge of this is very limited, partly
because of my poor English. That is why I would like to know why one model
loads faster than the other.

Once again, sorry for my bad English. I am not a smart monkey! I am an
average human :)

At the moment, just for testing purposes, I am trying to train a new acoustic
model using the VoxForge data for Spanish. But I'm getting the following
warnings and errors:

...
WARNING: This word: Té was in the transcript file, but is not in the dictionary (<s> TODO ESTABA LO MISMO QUE UNA HORA ANTES CUANDO EL Té HUMEABA EN LA TAZA DE OJEDA AHORA VACíA Y BLANQUEABAN SOBRE LA MESA LOS PLIEGOS </s> ). Do cases match?
WARNING: This word: VACíA was in the transcript file, but is not in the dictionary (<s> TODO ESTABA LO MISMO QUE UNA HORA ANTES CUANDO EL Té HUMEABA EN LA TAZA DE OJEDA AHORA VACíA Y BLANQUEABAN SOBRE LA MESA LOS PLIEGOS </s> ). Do cases match?
WARNING: This word: SALóN was in the transcript file, but is not in the dictionary (<s> DEL FONDO DEL SEGUNDO SALóN LLEGABAN CONFUNDIDOS CON RISAS DE MUJERES Y CHOQUE DE BANDEJAS </s> ). Do cases match?
WARNING: This word: FRíO was in the transcript file, but is not in the dictionary (<s> Y EL FRíO RESPLANDOR DE LAS AMPOLLAS ELéCTRICAS DESCENDíAN GORJEOS DE PáJAROS </s> ). Do cases match?
...





...
MODULE: 40 Build Trees
    Phase 1: Cleaning up old log files...
    Phase 2: Make Questions
    Phase 3: Tree building
        Processing each phone with each state
        A 0 
        A 1 
        A 2 
        B 0 
        B 1 
        B 2 
        CH 0 
        CH 1 
        CH 2 
        D 0 
        D 1 
        D 2 
        E 0 
        E 1 
        E 2 
        F 0 
        F 1 
        F 2 
        G 0 
        G 1 
        G 2 
        GN 0 
This step had 1 ERROR messages and 1 WARNING messages.  Please check the log file for details.
        GN 1 
This step had 1 ERROR messages and 1 WARNING messages.  Please check the log file for details.
        GN 2 
This step had 1 ERROR messages and 1 WARNING messages.  Please check the log file for details.
        I 0 
        I 1 
        I 2 
        J 0 
        J 1 
        J 2 
        K 0 
        K 1 
        K 2 
        L 0 
        L 1 
        L 2 
        LL 0 
        LL 1 
        LL 2 
        M 0 
        M 1 
        M 2 
        N 0 
        N 1 
        N 2 
        O 0 
        O 1 
        O 2 
        P 0 
        P 1 
        P 2 
        R 0 
        R 1 
        R 2 
        RR 0 
        RR 1 
        RR 2 
        S 0 
        S 1 
        S 2 
        T 0 
        T 1 
        T 2 
        U 0 
        U 1 
        U 2 
        X 0 
        X 1 
        X 2 
        Y 0 
        Y 1 
        Y 2 
        Z 0 
^C        Skipping SIL
MODULE: 45 Prune Trees
    Phase 1: Tree Pruning
This step had 1 ERROR messages and 0 WARNING messages.  Please check the log file for details.
    Phase 2: State Tying
This step had 1 ERROR messages and 0 WARNING messages.  Please check the log file for details.
MODULE: 50 Training Context dependent models
    Phase 1: Cleaning up directories:
    accumulator...logs...qmanager...
    Phase 2: Copy CI to CD initialize
This step had 1 ERROR messages and 0 WARNING messages.  Please check the log file for details.
    Phase 3: Forward-Backward
        Baum welch starting for 1 Gaussian(s), iteration: 1 (1 of 1)
        0% 
This step had 2 ERROR messages and 0 WARNING messages.  Please check the log file for details.
Only 0 parts of 1 of Baum Welch were successfully completed
Parts 1 failed to run!
Training failed in iteration 1
MODULE: 60 Lattice Generation
Skipped:  $ST::CFG_MMIE set to 'no' in sphinx_train.cfg
MODULE: 61 Lattice Pruning
Skipped:  $ST::CFG_MMIE set to 'no' in sphinx_train.cfg
MODULE: 62 Lattice Format Conversion
Skipped:  $ST::CFG_MMIE set to 'no' in sphinx_train.cfg
MODULE: 65 MMIE Training
Skipped:  $ST::CFG_MMIE set to 'no' in sphinx_train.cfg
MODULE: 90 deleted interpolation
Skipped for continuous models
...

Initially, the dictionary and transcription files were in UTF-8, but I was
getting "broken" characters in the log output, so I converted both files to
ISO-8859-15. After that, characters like áéíóú were displayed correctly, but I
still get the same errors.

Nickolay V. Shmyrev - 2012-02-08

WARNING: This word: VACíA was in the transcript file, but is not in the dictionary (<s> TODO ESTABA LO MISMO QUE UNA HORA ANTES CUANDO EL Té HUMEABA EN LA TAZA DE OJEDA AHORA VACíA Y BLANQUEABAN SOBRE LA MESA LOS PLIEGOS </s> ). Do cases match?

The dictionary might need to be re-sorted after you convert it to another
encoding. It's recommended to use UTF-8, not ISO-8859-15.

The warning just tells you that the word is missing. All you need to do is
make sure the word is there.
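
A quick way to see which transcript words are missing is to compare the two files directly. The following is a minimal sketch, assuming the dictionary has one `WORD PHONE PHONE ...` entry per line and the transcript has one `<s> ... </s> (utterance_id)` line per utterance; the function names and the sample data are made up for illustration and are not part of the training scripts.

```python
def load_dictionary_words(lines):
    """Collect the headword of every dictionary line."""
    words = set()
    for line in lines:
        parts = line.split()
        if parts:
            # Drop an alternate-pronunciation suffix such as WORD(2).
            words.add(parts[0].split("(")[0])
    return words


def missing_words(transcript_lines, dictionary_words):
    """Return the sorted transcript words that have no dictionary entry."""
    missing = set()
    for line in transcript_lines:
        for token in line.split():
            # Skip sentence markers and the trailing (utterance_id).
            if token in ("<s>", "</s>") or token.startswith("("):
                continue
            if token not in dictionary_words:
                missing.add(token)
    return sorted(missing)


# Sample data for illustration only. With real files, read both with the
# same encoding, e.g. open(path, encoding="utf-8").
dictionary = ["SEñALA S E GN A L A", "CON K O N"]
transcript = ["<s> SEñALA CON EL </s> (utt_78)"]

words = load_dictionary_words(dictionary)
print(missing_words(transcript, words))
```

The comparison is an exact string match, so a word that differs only in case or in encoding will be reported as missing, which is what the warning is hinting at.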

Anonymous - 2012-02-08

Thank you for your reply. Initially I was using the original file (encoded
with UTF-8) but I got the following warnings:

...
WARNING: This word: SEÃ±ALA was in the transcript file, but is not in the dictionary (<s> SEÃ±ALA CON EL DEDO LO QUE MÃ¡S TE GUSTE </s> ). Do cases match?
...

It should be:

...
SEñALA instead of SEÃ±ALA
MáS instead of MÃ¡S
...

Dictionary excerpt:

...
SEñALA S E GN A L A
...

Transcription excerpt:

...
<s> SEñALA CON EL DEDO LO QUE MáS TE GUSTE </s> (ciego-10122009/wav/78)
...
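
The broken forms in the warning are what UTF-8 text looks like when its bytes are interpreted as a single-byte Latin encoding. A minimal sketch that reproduces this (the helper names are invented for illustration, and Latin-1 is used as the example single-byte encoding):

```python
def mojibake(text):
    """Encode as UTF-8, then (wrongly) decode the bytes as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def repair(text):
    """Undo the mistake above: recover the bytes, decode them as UTF-8."""
    return text.encode("latin-1").decode("utf-8")


for word in ("SEñALA", "MáS"):
    broken = mojibake(word)
    print(word, "->", broken, "->", repair(broken))
```

This suggests that the files themselves may be fine and that some step in the pipeline, or the terminal showing the log, is reading UTF-8 bytes with the wrong encoding.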

Nickolay V. Shmyrev - 2012-02-08

This seems to be a mistake in the original VoxForge data. I think it's easy to
fix in the archive itself before training.

Anonymous - 2012-02-08

I think that neither file (dictionary or transcription) is wrong in the
original VoxForge data; I can read them without broken characters. How can I
double-check it?

Sorry for my bad English, maybe I misunderstood something. Thank you!
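
One way to double-check a file's encoding, independent of what an editor or terminal displays, is to try decoding its raw bytes as UTF-8. This is a minimal sketch; `check_utf8` is a made-up helper and the sample bytes stand in for real file contents.

```python
def check_utf8(data: bytes):
    """Return (is_valid_utf8, has_non_ascii) for raw file bytes."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        # Pure ASCII always decodes, so a failure implies non-ASCII bytes.
        return False, True
    return True, any(ord(ch) > 127 for ch in text)


# With a real file: check_utf8(open(path, "rb").read())
print(check_utf8("SEñALA".encode("utf-8")))
print(check_utf8("SEñALA".encode("iso-8859-15")))
print(check_utf8(b"HOLA"))
```

A file that is valid UTF-8 and contains non-ASCII characters is very likely genuine UTF-8; a file that fails the check is in some other encoding, such as ISO-8859-15.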

Nickolay V. Shmyrev - 2012-02-08

Maybe the training script that creates prompts using awk breaks the text. You
might want to set the locale on your machine to en_US.UTF-8 or to C. It could
affect awk's behaviour.
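
A locale setting only affects a tool such as awk if it is actually present in that process's environment. As a minimal sketch, the locale can be passed explicitly to a child process instead of being changed system-wide; `run_with_locale` is a hypothetical helper, and the probe command below simply prints what the child sees (a real run would pass the training command instead).

```python
import os
import subprocess
import sys


def run_with_locale(cmd, locale_name="en_US.UTF-8"):
    """Run cmd with LC_ALL and LANG set in the child's environment."""
    env = dict(os.environ)
    env["LC_ALL"] = locale_name
    env["LANG"] = locale_name
    return subprocess.run(cmd, env=env, capture_output=True, text=True, check=True)


# Probe: a child Python process reports the LC_ALL value it inherited.
probe = [sys.executable, "-c", "import os; print(os.environ.get('LC_ALL'))"]
result = run_with_locale(probe)
print(result.stdout.strip())
```

In a shell, the equivalent is to `export` the variable (or prefix the command with it), since a plain assignment is not necessarily inherited by child processes.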

Same result as before. But don't worry, I don't want to bother you.

Just one thing: the reason I wanted to create a new acoustic model is that I
am getting a "Decoder" load time of seven seconds using PocketSphinx + Android
(please check my posts number 1 and 3). But I think the result will be the
same, since it would only be a new build from the same base. So, is there any
way to reuse the current Spanish VoxForge model so that it behaves like
"hub4wsj_sc_8k", which takes only 3 seconds to load? Maybe I am asking silly
questions...

~$ locale -a
C
C.UTF-8
de_CH.utf8
en_AG
en_AG.utf8
en_AU.utf8
en_BW.utf8
en_CA.utf8
en_DK.utf8
en_GB.utf8
en_HK.utf8
en_IE.utf8
en_IN
en_IN.utf8
en_NG
en_NG.utf8
en_NZ.utf8
en_PH.utf8
en_SG.utf8
en_US.utf8
en_ZA.utf8
en_ZM
en_ZM.utf8
en_ZW.utf8
es_AR.utf8
es_BO.utf8
es_CL.utf8
es_CO.utf8
es_CR.utf8
es_DO.utf8
es_EC.utf8
es_ES.utf8
es_GT.utf8
es_HN.utf8
es_MX.utf8
es_NI.utf8
es_PA.utf8
es_PE.utf8
es_PR.utf8
es_PY.utf8
es_SV.utf8
es_US.utf8
es_UY.utf8
es_VE.utf8
POSIX
zh_CN.utf8
zh_SG.utf8
~$ LC_ALL="en_US.utf8"
~$ locale
LANG=en_US.UTF-8
LANGUAGE=en_US:en
LC_CTYPE="en_US.utf8"
LC_NUMERIC="en_US.utf8"
LC_TIME="en_US.utf8"
LC_COLLATE="en_US.utf8"
LC_MONETARY="en_US.utf8"
LC_MESSAGES="en_US.utf8"
LC_PAPER="en_US.utf8"
LC_NAME="en_US.utf8"
LC_ADDRESS="en_US.utf8"
LC_TELEPHONE="en_US.utf8"
LC_MEASUREMENT="en_US.utf8"
LC_IDENTIFICATION="en_US.utf8"
LC_ALL=en_US.utf8
~$ cat /etc/default/locale
LANG="en_US.UTF-8"
~$ reboot

Thank you Mr. Nickolay.

Nickolay V. Shmyrev - 2012-02-08

Just one thing: the reason I wanted to create a new acoustic model is that I
am getting a "Decoder" load time of seven seconds using PocketSphinx + Android
(please check my posts number 1 and 3). But I think the result will be the
same, since it would only be a new build from the same base. So, is there any
way to reuse the current Spanish VoxForge model so that it behaves like
"hub4wsj_sc_8k", which takes only 3 seconds to load? Maybe I am asking silly
questions...

As said above, you need to train a semi-continuous model from the VoxForge
data. You can also convert the mdef file to binary format with
pocketsphinx_mdef_convert; it will then load faster.

Anonymous - 2012-02-08

Thank you! I am already using an mdef file in binary format. The difference
from the text mdef file is about one second. So I will try to create a new
model using a different computer.

Really, thank you very much for your time.

I was trying to edit my first post to put everything in order, but I cannot.
I will keep you updated on my progress. If you need something from me, just
poke me :)
