
ARPA Format Syntax

2018-04-25
2018-04-28
  • Scott Guthery

    Scott Guthery - 2018-04-25

    In the ARPA LM files generated by the CMU lmtool, <s>, </s>, and <UNK> are not surrounded by apostrophes ('), but in the examples of the ARPA format on the pocketsphinx help page regarding language models they are. Is there any difference in how they are treated?

     
    • Nickolay V. Shmyrev

      I fixed the wiki page, thanks for the notification.

       
  • Scott Guthery

    Scott Guthery - 2018-04-25

    I'll note in passing that lm_trie.c seems to enforce rules about the ARPA format that go beyond the format specification. For example, if the file declares zero 1-, 2-, 3-, and 4-grams and one 5-gram, then recursive_insert in lm_trie.c fails the assertion priority_queue_size(ngrams)==0. If I put only the 5-gram in the LM file, then ngram_model_trie.c, line 458, fails with "Wrong magic header size number".
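
A pre-flight check along the following lines could show whether a file's header counts agree with its sections before the file is handed to pocketsphinx. This is a minimal Python sketch and not part of pocketsphinx. The function name `check_arpa_counts` is hypothetical, and the sketch assumes the commonly documented ARPA layout: `ngram N=count` lines under `\data\`, one `\N-grams:` section per order, and a closing `\end\`.

```python
import re

def check_arpa_counts(text):
    """Compare counts declared in the header with entries found per section.

    Returns a list of human-readable problems (empty if none were found).
    """
    declared, found = {}, {}
    order = None  # order of the section currently being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "\\data\\":
            continue
        if line == "\\end\\":
            break
        header = re.fullmatch(r"ngram\s+(\d+)\s*=\s*(\d+)", line)
        if header and order is None:
            declared[int(header.group(1))] = int(header.group(2))
            continue
        section = re.fullmatch(r"\\(\d+)-grams:", line)
        if section:
            order = int(section.group(1))
            found.setdefault(order, 0)
            continue
        if order is not None:
            found[order] += 1  # any other non-blank line counts as one entry

    problems = []
    for n in sorted(set(declared) | set(found)):
        want, have = declared.get(n), found.get(n, 0)
        if want is None:
            problems.append(f"{n}-gram section has no header count")
        elif want != have:
            problems.append(f"{n}-grams: header says {want}, file has {have}")
    return problems
```

An empty result only means the counts are consistent. It says nothing about whether lm_trie.c will accept the file.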

     
  • Scott Guthery

    Scott Guthery - 2018-04-25

    Further to ARPA grammars:

    ERROR: "ngrams_raw.c", line 84: Format error; 6-gram ignored at line 25
    INFO: lm_trie.c(474): Training quantizer
    Error in `pocketsphinx_continuous': double free or corruption (out): 0x0000000032cae150

    \data\
    ngram 1=1
    ngram 2=1
    ngram 3=1
    ngram 4=1
    ngram 5=1
    ngram 6=1

    \1-grams:
    -10.000 foo

    \2-grams:
    -10.0 foo foo

    \3-grams:
    -10.0 foo foo foo

    \4-grams:
    -10.0 foo foo foo foo

    \5-grams:
    0.0000 when you hear <unk>

    \6-grams:
    0.0000 when you hear <unk>

    \end\
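
One detail in the test file above may matter for the "Format error" message: the entries under \5-grams: and \6-grams: each contain four words. The following Python sketch flags entries whose word count differs from the order of their section. It is an illustration only, not pocketsphinx code. The name `find_order_mismatches` is hypothetical, and the sketch assumes each entry is a log probability, then N words, then an optional numeric backoff weight.

```python
import re

def _is_number(token):
    """True if the token parses as a float (used to spot a backoff weight)."""
    try:
        float(token)
        return True
    except ValueError:
        return False

def find_order_mismatches(text):
    """Return (line_number, section_order, word_count) for each suspect entry."""
    mismatches = []
    order = None  # order of the section currently being read, if any
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if line == "\\end\\":
            break
        section = re.fullmatch(r"\\(\d+)-grams:", line)
        if section:
            order = int(section.group(1))
            continue
        if order is None or not line:
            continue  # header lines and blank lines are not entries
        words = line.split()[1:]  # drop the leading log probability
        # Treat one extra trailing numeric token as the backoff weight.
        if len(words) == order + 1 and _is_number(words[-1]):
            words = words[:-1]
        if len(words) != order:
            mismatches.append((lineno, order, len(words)))
    return mismatches
```

Run on the file above, this would flag both the 5-gram and the 6-gram entry, since each has four words.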

     
  • Scott Guthery

    Scott Guthery - 2018-04-25

    I.e. pocketsphinx_continuous doesn't do 6-grams, right?
    It does seem to handle 5-grams, however. At least it eats them and doesn't crash.
    Are you also signalling that pocketsphinx_continuous doesn't do <unk>?

     
    • Nickolay V. Shmyrev

      Thanks for the great tests, Scott. It would be nice to get these fixed one day. Anything besides 3-grams is not useful with pocketsphinx. <unk> is not supported either.
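
The two limitations in this reply can be checked for before a model is loaded. The Python sketch below is an illustration only. The name `arpa_compat_warnings` is hypothetical, and the default threshold of 3 rests on reading the reply as "orders above 3 are not useful", which is an assumption.

```python
import re

def arpa_compat_warnings(text, max_useful_order=3):
    """Return warnings for declared orders above the threshold and for <unk>."""
    warnings = []
    # Orders declared in the header, e.g. "ngram 5=1" -> 5.
    orders = [int(n) for n in re.findall(r"\bngram[ \t]+(\d+)[ \t]*=[ \t]*\d+", text)]
    if orders and max(orders) > max_useful_order:
        warnings.append(
            f"model declares {max(orders)}-grams; "
            f"orders above {max_useful_order} were described as not useful"
        )
    # Match <unk> in any letter case as a whitespace-delimited token.
    if re.search(r"(?<!\S)<unk>(?!\S)", text, flags=re.IGNORECASE):
        warnings.append("model contains <unk>, which was described as unsupported")
    return warnings
```

The function only reports. Whether to trim or rebuild the model is left to the caller.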

       
