
idngram2lm problem with table size.

2015-05-24
  • Savins Puertas Martin

    Hi,

    I'm creating a new language model for Spanish because the existing one is missing some words that I need: https://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/Spanish%20Voxforge/

    I'm now on step 5 of http://cmusphinx.sourceforge.net/wiki/tutoriallm, but when I run the command: idngram2lm -vocab_type 0 -idngram weather.idngram -vocab weather.vocab -arpa weather.lm, I get the error below, and I don't know how to change the table size it refers to.

    This is the console output. The error is the last line.

    n : 3
    Input file : a.idngram (binary format)
    Output files :
    ARPA format : a.lm
    Vocabulary file : a.vocab
    Cutoffs :
    2-gram : 0 3-gram : 0
    Vocabulary type : Closed
    Minimum unigram count : 0
    Zeroton fraction : 1
    Counts will be stored in two bytes.
    Count table size : 65535
    Discounting method : Good-Turing
    Discounting ranges :
    1-gram : 1 2-gram : 7 3-gram : 7
    Memory allocation for tree structure :
    Allocate 100 MB of memory, shared equally between all n-gram tables.
    Back-off weight storage :
    Back-off weights will be stored in four bytes.
    Reading vocabulary.
    ...................
    read_wlist_into_siht: a list of 19996 words was read from "a.vocab".
    read_wlist_into_array: a list of 19996 words was read from "a.vocab".
    Allocated space for 3571428 2-grams.
    Allocated space for 8333333 3-grams.
    table_size 19997
    Allocated 57142848 bytes to table for 2-grams.
    Allocated (2+33333332) bytes to table for 3-grams.
    Processing id n-gram file.
    20,000 n-grams processed for each ".", 1,000,000 for each line.
    Warning : id n-gram stream contains OOV's (n-grams will be ignored).
    ..................................................
    ..................................................
    ..................................................
    ..................................................
    ..................................................
    ..................................................
    ..................................................
    ..................................................
    ................
    More than 8333333 3-grams needed to be stored. Rerun with a higher table size.
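
    The table capacities in this log appear to follow directly from the 100 MB memory budget. As a rough sketch, the Python below reproduces them under three assumptions: 1 MB means 10^6 bytes, the budget is split evenly between the 2-gram and 3-gram tables, and each entry is budgeted at 14 bytes (2-grams) or 6 bytes (3-grams). The per-entry sizes are inferred from the reported numbers, not taken from the toolkit's source, and the helper names are hypothetical.

```python
# Figures copied from the console output in this post.
BUFFER_BYTES = 100 * 1_000_000   # "Allocate 100 MB of memory" (assumes 1 MB = 10^6 bytes)
N_TABLES = 2                     # 2-gram and 3-gram tables share the budget equally
REPORTED_2GRAMS = 3_571_428      # "Allocated space for 3571428 2-grams."
REPORTED_3GRAMS = 8_333_333      # "Allocated space for 8333333 3-grams."

# ASSUMPTION: per-entry budget inferred from the log, not from the toolkit source.
BYTES_PER_2GRAM = 14
BYTES_PER_3GRAM = 6


def capacity(buffer_bytes, bytes_per_entry, n_tables=N_TABLES):
    """Number of entries that fit in one table's equal share of the buffer."""
    return (buffer_bytes // n_tables) // bytes_per_entry


def buffer_mb_for(n_entries, bytes_per_entry, n_tables=N_TABLES):
    """Smallest whole-MB budget whose per-table share holds n_entries."""
    needed_bytes = n_entries * bytes_per_entry * n_tables
    return -(-needed_bytes // 1_000_000)  # ceiling division


if __name__ == "__main__":
    # Both values match the capacities reported in the log.
    print(capacity(BUFFER_BYTES, BYTES_PER_2GRAM) == REPORTED_2GRAMS)
    print(capacity(BUFFER_BYTES, BYTES_PER_3GRAM) == REPORTED_3GRAMS)
    # Budget needed for a hypothetical 10,000,000 3-grams.
    print(buffer_mb_for(10_000_000, BYTES_PER_3GRAM))
```

    Under these assumptions, storing a hypothetical 10,000,000 3-grams would need a budget of about 120 MB. The toolkit's usage text lists a -buffer option for this budget, but treat that as something to confirm with idngram2lm -help on your build.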

    Thanks in advance.

     
    • Nickolay V. Shmyrev

      Use SRILM.

       
      • Savins Puertas Martin

        Thanks, Nickolay.
        One question, though: is there a tutorial for SRILM?

         
