
training & decoding question, missing xx.ug.lm

Help
UF grad
2008-08-12
2012-09-22
  • UF grad

    UF grad - 2008-08-12

    I think I have successfully built a model using the data I have, mostly digit utterances.

    Looking in the log file, I found the following warning, but I am not sure what it means. Does anyone know?
    The log message is from 02.falign_ci_hmm/boca1.1.1-1.bw.log

    ===========================================================================================
    INFO: cvt2triphone.c(199): no multiphones defined, no conversion done
    92 88 36 71 3.927145e-11 5.601433e+00 4.968471e+03
    utt> 1 enus.7 882 0WARNING: "corpus.c", line 1986: LSN utt id, enus.7.raw, does not match ctl utt id, enus.7.
    ============================================================================================

    For decoding, I do not have enus1.ug.lm and enus1.ug.lm.DMP, and I am not sure how to create them.
    What I did was copy an4.ug.lm and an4.ug.lm.DMP, and it seems to work only for the
    words that are in common with the an4 dataset. Is this a language model? Do I need to use lm tool to build it?

     
    • UF grad

      UF grad - 2008-08-14

      It looks like the sphinx_jsgf2fsg version (from sphinxbase) that I downloaded does not support the "-fsg digits.fsg -op_mode 2" options. However, I was able to create the FSG using just

      $ sphinx_jsgf2fsg digits.jsgf > digit.fsg

      Now, how do I use it with the Sphinx3 decoder? I am using the example environment.
      Looking into etc/sphinx_decode.cfg, I see only LM parameters and nothing for the FSG, i.e.,

      $DEC_CFG_LANGUAGEMODEL_DIR = "$DEC_CFG_BASE_DIR/etc";
      $DEC_CFG_LANGUAGEMODEL = "$DEC_CFG_LANGUAGEMODEL_DIR/an4.ug.lm.DMP";

      What setup do I need to make the decoder work with the FSG file?

       
      • Nickolay V. Shmyrev

        "-fsg digits.fsg -op_mode 2" are options of sphinx3_decode, not options of sphinx_jsgf2fsg. Now that you have the FSG, edit the scripts_pl/decode/s3decode.pl script. Delete

        "-lm \"$DEC_CFG_LANGUAGEMODEL\" " .
        

        add

        "-fsg your_fsg " .
        "-op_mode 2 " .
        
         
    • Nickolay V. Shmyrev

      > utt id, enus.7.raw, does not match ctl utt id, enus.7.

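      To expand on this message: the utterance id (the word in parentheses at the end of a line in your enus1.transcription file) doesn't match the utterance id on the same line of your control file, enus1.fileids. Note that "line 1986" in the warning refers to the source file corpus.c, not to a line of your transcription. Check the entry for enus.7; you probably forgot to delete the .raw suffix.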
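
The check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `find_id_mismatches` and the sample lines are hypothetical, each transcription line is assumed to end with the utterance id in parentheses, and each control entry is compared by its final path component (adjust this if your setup expects the full path).

```python
import re
from pathlib import PurePosixPath

# Matches a trailing "(utt_id)" at the end of a transcription line.
UTT_ID_RE = re.compile(r"\(([^()\s]+)\)\s*$")


def find_id_mismatches(transcription_lines, fileid_lines):
    """Return (line_no, transcription_id, ctl_id) for every pair that differs."""
    mismatches = []
    pairs = zip(transcription_lines, fileid_lines)
    for line_no, (trans, ctl) in enumerate(pairs, start=1):
        match = UTT_ID_RE.search(trans)
        trans_id = match.group(1) if match else None
        # Assumption: only the last path component of the ctl entry is compared.
        ctl_id = PurePosixPath(ctl.strip()).name
        if trans_id != ctl_id:
            mismatches.append((line_no, trans_id, ctl_id))
    return mismatches


if __name__ == "__main__":
    # Hypothetical sample data modelled on the warning in this thread.
    transcription = [
        "<s> ONE TWO THREE </s> (enus.6)",
        "<s> FOUR FIVE </s> (enus.7.raw)",
    ]
    fileids = ["enus.6", "enus.7"]
    for line_no, trans_id, ctl_id in find_id_mismatches(transcription, fileids):
        print(f"line {line_no}: transcription id {trans_id!r} != ctl id {ctl_id!r}")
```

To run it on real data, read the lines of enus1.transcription and enus1.fileids into the two lists; every reported line is one where the ids need to be made identical.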

      > do I need to use lm tool to build it?

      yes
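
To give a rough idea of what a unigram file such as xx.ug.lm contains, here is a minimal Python sketch that writes unigram probabilities in ARPA text format. It is not the LM tool itself and makes several assumptions: the function name `unigram_arpa` is hypothetical, the probabilities are plain relative frequencies with no smoothing, the input lines hold only the words (no markers or utterance ids), and the `<s>` and `</s>` sentence markers are assumed to be needed in the model. The resulting text file would still have to be converted to the binary .DMP form with the LM tools.

```python
import math
from collections import Counter


def unigram_arpa(transcripts):
    """Build unigram ARPA-format text from whitespace-tokenized transcripts."""
    counts = Counter()
    for line in transcripts:
        # Assumption: sentence markers are counted once per utterance.
        counts.update(["<s>"] + line.split() + ["</s>"])
    total = sum(counts.values())

    lines = ["\\data\\", f"ngram 1={len(counts)}", "", "\\1-grams:"]
    for word in sorted(counts):
        # ARPA files store base-10 log probabilities.
        logprob = math.log10(counts[word] / total)
        lines.append(f"{logprob:.4f}\t{word}")
    lines += ["", "\\end\\"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical digit-string transcripts.
    print(unigram_arpa(["ONE TWO THREE", "TWO TWO FIVE"]))
```

A real toolkit also applies discounting, so its numbers will differ from this sketch even on the same data.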

       
    • UF grad

      UF grad - 2008-08-13

      Instead of building an LM, can I build the vocabulary using some sort of grammar rule?
      My data are only number strings, so I want to create a rule like
      grammar = <digit> <digit> | <digit> <digit> <digit> ...

      Is there a tool to build a vocabulary or LM from this grammar?

       
      • Nickolay V. Shmyrev

        Yes, you can use a JSGF grammar instead of an LM. Write the JSGF, convert it to FSG with sphinx_jsgf2fsg, and pass it to the decoder with the options "-fsg digits.fsg -op_mode 2". If you use the pocketsphinx decoder, you can use the JSGF directly.
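
As an illustration of the first step, the following Python sketch generates a small JSGF grammar that accepts a string of one or more digits. The function name `build_digit_jsgf`, the rule names, and the word list are assumptions made for this example; the words must match the entries in your own pronunciation dictionary.

```python
def build_digit_jsgf(words, grammar_name="digits"):
    """Return JSGF text that accepts one or more words taken from `words`."""
    alternatives = " | ".join(words)
    return (
        "#JSGF V1.0;\n"
        f"grammar {grammar_name};\n"
        # The public rule is the entry point: a sequence of one or more digits.
        "public <digits> = <digit>+;\n"
        f"<digit> = {alternatives};\n"
    )


# Hypothetical vocabulary; replace with the words used in your transcripts.
DIGIT_WORDS = [
    "ZERO", "OH", "ONE", "TWO", "THREE", "FOUR",
    "FIVE", "SIX", "SEVEN", "EIGHT", "NINE",
]

if __name__ == "__main__":
    print(build_digit_jsgf(DIGIT_WORDS), end="")
```

Saving the printed text as digits.jsgf gives a file that can then be converted with sphinx_jsgf2fsg as described in this thread.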

         
