cmuclmtk test.sh fails - Segmentation fault

Help
2010-11-27
2012-09-22
  • Robert Twomey

    Robert Twomey - 2010-11-27

    Hi,

    I downloaded cmusphinx src from svn and am trying to build cmuclmtk on OSX
    10.6.4. My end goal is to train SLMs for data locally rather than using the
    web-based lm-tool. My problem is that I get segmentation faults when I run
    text2idngram.

    I had this error first processing my own input text and now with the test
    script in cmuclmtk/test/. See output below:

    dxtwomey:test rtwomey$ ./test.sh
    V3 32BITS text2wfreq
    V3 32BITS text2wfreq PASSED
    V3 32BITS wfreq2vocab
    V3 32BITS wfreq2vocab PASSED
    V3 32BITS text2idngram BIN IDNGRAM
    ./test.sh: line 25: 31478 Done                    cat English/emma11.txt.filtered
         31479 Segmentation fault      | $BIN/text2idngram -vocab English/emma11.vocab -idngram ./English/emma11.txt.filtered.idngram.32bits 2> text2idngram.log
    

    As far as I could tell, the build went fine (no errors). Has anyone built
    this successfully in OS X? Is it a 32 bit/64bit kind of problem?

    Has anyone had problems with text2idngram built on other systems recently?

    Thanks!

    Robert

     
  • Robert Twomey

    Robert Twomey - 2010-11-27

    Oh, and FYI, text2idngram.log doesn't give much info:

    text2idngram
    Vocab : English/emma11.vocab
    Output idngram : ./English/emma11.txt.filtered.idngram.32bits
    N-gram buffer size : 100
    Hash table size : 2000000
    Temp directory : cmuclmtk-3UTDpW
    Max open files : 20
    FOF size : 10
    n : 3
    Initialising hash table...
    Reading vocabulary...
    Allocating memory for the n-gram buffer...
    Reading text into the n-gram buffer...
    20,000 n-grams processed for each ".", 1,000,000 for each line.
    ........
    Sorting n-grams...

     
  • Nickolay V. Shmyrev

    Hello

    Has anyone built this successfully in OS X?

    Nobody here uses OS X as a development platform.

    Is it a 32 bit/64bit kind of problem?

    No, I don't think so

    Has anyone had problems with text2idngram built on other systems recently?

    No, everything works as expected

    My problem is that I get segmentation faults when I run text2idngram.

    It must be a bug. To help us solve it, you need to collect a stack trace with
    gdb. See

    http://live.gnome.org/GettingTraces/Details

    For more detailed information, see the gdb documentation:

    http://sourceware.org/gdb/current/onlinedocs/gdb/Backtrace.html#Backtrace

     
  • Robert Twomey

    Robert Twomey - 2010-11-27

    Thank you for your reply. This is my first time using gdb, so I apologize for
    my ignorance.
    Here are the results of my gdb backtrace:

    gdb /usr/local/bin/text2idngram
    GNU gdb 6.3.50-20050815 (Apple version gdb-1472) (Wed Jul 21 10:53:12 UTC 2010)
    Copyright 2004 Free Software Foundation, Inc.
    GDB is free software, covered by the GNU General Public License, and you are
    welcome to change it and/or distribute copies of it under certain conditions.
    Type "show copying" to see the conditions.
    There is absolutely no warranty for GDB.  Type "show warranty" for details.
    This GDB was configured as "x86_64-apple-darwin"...Reading symbols for shared libraries .
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/libs.a/ac_hash.o" - no debug information available for "ac_hash.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/libs.a/ac_lmfunc_impl.o" - no debug information available for "ac_lmfunc_impl.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/libs.a/ac_parsetext.o" - no debug information available for "ac_parsetext.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/libs.a/parse_line.o" - no debug information available for "parse_line.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/libs.a/pc_comline.o" - no debug information available for "pc_comline.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/libs.a/pc_message.o" - no debug information available for "pc_message.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/libs.a/quit.o" - no debug information available for "quit.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/libs.a/rd_wlist_arry.o" - no debug information available for "rd_wlist_arry.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/libs.a/read_voc.o" - no debug information available for "read_voc.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/libs.a/read_wlist_si.o" - no debug information available for "read_wlist_si.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/libs.a/rr_calloc.o" - no debug information available for "rr_calloc.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/libs.a/rr_feof.o" - no debug information available for "rr_feof.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/libs.a/rr_fexists.o" - no debug information available for "rr_fexists.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/libs.a/rr_filesize.o" - no debug information available for "rr_filesize.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/libs.a/rr_fopen.o" - no debug information available for "rr_fopen.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/libs.a/rr_fread.o" - no debug information available for "rr_fread.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/libs.a/rr_fseek.o" - no debug information available for "rr_fseek.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/libs.a/rr_fwrite.o" - no debug information available for "rr_fwrite.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/libs.a/rr_iopen.o" - no debug information available for "rr_iopen.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/libs.a/rr_malloc.o" - no debug information available for "rr_malloc.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/libs.a/rr_oopen.o" - no debug information available for "rr_oopen.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/libs.a/salloc.o" - no debug information available for "salloc.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/libs.a/sih.o" - no debug information available for "sih.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/libs.a/win32compat.o" - no debug information available for "win32compat.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/liblmest.a/arpa_bo_ng_prob.o" - no debug information available for "arpa_bo_ng_prob.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/liblmest.a/bo_ng_prob.o" - no debug information available for "bo_ng_prob.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/liblmest.a/calc_mem_req.o" - no debug information available for "calc_mem_req.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/liblmest.a/calc_prob_of.o" - no debug information available for "calc_prob_of.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/liblmest.a/compute_back_off.o" - no debug information available for "compute_back_off.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/liblmest.a/compute_discount.o" - no debug information available for "compute_discount.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/liblmest.a/compute_unigram.o" - no debug information available for "compute_unigram.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/liblmest.a/counts.o" - no debug information available for "counts.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/liblmest.a/decode_bo_case.o" - no debug information available for "decode_bo_case.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/liblmest.a/disc_meth.o" - no debug information available for "disc_meth.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/liblmest.a/disc_meth_absolute.o" - no debug information available for "disc_meth_absolute.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/liblmest.a/disc_meth_good_turing.o" - no debug information available for "disc_meth_good_turing.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/liblmest.a/disc_meth_linear.o" - no debug information available for "disc_meth_linear.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/liblmest.a/disc_meth_witten_bell.o" - no debug information available for "disc_meth_witten_bell.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/liblmest.a/disc_method_linear.o" - no debug information available for "disc_method_linear.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/liblmest.a/gen_fb_list.o" - no debug information available for "gen_fb_list.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/liblmest.a/generate.o" - no debug information available for "generate.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/liblmest.a/genrand.o" - no debug information available for "genrand.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/liblmest.a/get_ngram.o" - no debug information available for "get_ngram.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/liblmest.a/guess_mem.o" - no debug information available for "guess_mem.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/liblmest.a/idngram2lm.o" - no debug information available for "idngram2lm.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/liblmest.a/increment_context.o" - no debug information available for "increment_context.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/liblmest.a/load_lm.o" - no debug information available for "load_lm.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/liblmest.a/lookup_index_of.o" - no debug information available for "lookup_index_of.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/liblmest.a/miscella.o" - no debug information available for "miscella.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/liblmest.a/ngram.o" - no debug information available for "ngram.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/liblmest.a/num_of_types.o" - no debug information available for "num_of_types.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/liblmest.a/parse_comline.o" - no debug information available for "parse_comline.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/liblmest.a/perplexity.o" - no debug information available for "perplexity.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/liblmest.a/short_indices.o" - no debug information available for "short_indices.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/liblmest.a/stats.o" - no debug information available for "stats.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/liblmest.a/two_byte_alphas.o" - no debug information available for "two_byte_alphas.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/liblmest.a/validate.o" - no debug information available for "validate.c".
    
    
    warning: Could not find object file "/Users/rtwomey/code/cmusphinx/cmuclmtk/src/.libs/libcmuclmtk.lax/liblmest.a/write_lms.o" - no debug information available for "write_lms.c".
    
    .. done
    
    (gdb) run -vocab English/emma11.vocab -idngram English/emma11.txt.filtered.idngram.32bits < English/emma11.txt.filtered
    Starting program: /usr/local/bin/text2idngram -vocab English/emma11.vocab -idngram English/emma11.txt.filtered.idngram.32bits < English/emma11.txt.filtered
    Reading symbols for shared libraries ++. done
    text2idngram
    Vocab                  : English/emma11.vocab
    Output idngram         : English/emma11.txt.filtered.idngram.32bits
    N-gram buffer size     : 100
    Hash table size        : 2000000
    Temp directory         : cmuclmtk-aFUQGV
    Max open files         : 20
    FOF size               : 10
    n                      : 3
    Initialising hash table...
    Reading vocabulary... 
    Allocating memory for the n-gram buffer...
    Reading text into the n-gram buffer...
    20,000 n-grams processed for each ".", 1,000,000 for each line.
    ........
    Sorting n-grams...
    
    Program received signal EXC_BAD_ACCESS, Could not access memory.
    Reason: KERN_INVALID_ADDRESS at address: 0x000000005fbff160
    0x00007fff83506180 in strlen ()
    (gdb) bt
    #0  0x00007fff83506180 in strlen ()
    #1  0x00007fff83511a7c in __vfprintf ()
    #2  0x00007fff83544a07 in vsnprintf ()
    #3  0x00007fff83544886 in __sprintf_chk ()
    #4  0x0000000100006d31 in read_txt2ngram_buffer ()
    #5  0x0000000100002635 in main (argc=1, argv=<value temporarily unavailable, due to optimizations>) at text2idngram.c:196
    (gdb) bt full
    #0  0x00007fff83506180 in strlen ()
    No symbol table info available.
    #1  0x00007fff83511a7c in __vfprintf ()
    No symbol table info available.
    #2  0x00007fff83544a07 in vsnprintf ()
    No symbol table info available.
    #3  0x00007fff83544886 in __sprintf_chk ()
    No symbol table info available.
    #4  0x0000000100006d31 in read_txt2ngram_buffer ()
    No symbol table info available.
    #5  0x0000000100002635 in main (argc=1, argv=<value temporarily unavailable, due to optimizations>) at text2idngram.c:196
        vocab_filename = 0x100100080 "English/emma11.vocab"
        idngram_filename = <value temporarily unavailable, due to optimizations>
        outfile = (FILE *) 0x7fff70629ec0
        tempfiles_directory = "cmuclmtk-aFUQGV", '\0' <repeats 489 times>, "???_?\000\000p?_?\000\000\006\003???\000\000??_?\000\000?&?_?", '\0' <repeats 18 times>, "???_?\000\000??_?\000\000`??_?\000\000?\005?_?\000\000?N?L\000\000\000\000?\024\n\000?\000\000G?\000\000d#҃?#\027\031^?A?\034W=OT???e??W?D\027??-ċj\021??\031?\\\024\022#)\030??U?~\005??\001%ӧ?U??\037??\034Oϡ޵???x\t?"...
        verbosity = 2
        buffer_size = 8333300
        max_files = 20
        fof_size = 10
        temp_file_root = 0x5fbff160 <Address 0x5fbff160 out of bounds>
        temp_file_ext = 0x1001000d0 ""
        help_flag = <value temporarily unavailable, due to optimizations>
        vocabulary = {
      size = 2000003, 
      chain = 0x100200000
    }
    (gdb)
    
     
  • Robert Twomey

    Robert Twomey - 2010-11-27

    OK. So looking through the code a bit, it seems the problem is at line 163 in
    text2idngram.c, the mkdtemp() call:

      /* If the last charactor in the directory name isn't a / then add one. */
      strcpy (tempfiles_directory, "cmuclmtk-XXXXXX");
      temp_file_root = mkdtemp(tempfiles_directory);
    

    If I change

    temp_file_root = mkdtemp(tempfiles_directory);
    

    to

    temp_file_root = salloc(".");
    

    and recompile, the tests all complete fine.

    Is this a bug or a problem with permissions on making the tmp directory?

     
  • Nickolay V. Shmyrev

    Hello.

    That information is a great help. I've just committed a small update to check
    the mkdtemp result and print an error message. Please update and run it again;
    I'm curious why it fails.

     
  • Robert Twomey

    Robert Twomey - 2010-11-27

    Hmmmm... neat. I have updated, compiled and run, but I don't get any error
    message dumped to the terminal.

    dxtwomey:test rtwomey$ ../src/programs/text2idngram -vocab English/emma11.vocab -idngram ./English/emma11.txt.filtered.idngram.32bits < English/emma11.txt.filtered
    text2idngram
    Vocab                  : English/emma11.vocab
    Output idngram         : ./English/emma11.txt.filtered.idngram.32bits
    N-gram buffer size     : 100
    Hash table size        : 2000000
    Temp directory         : cmuclmtk-yrF0ox
    Max open files         : 20
    FOF size               : 10
    n                      : 3
    Initialising hash table...
    Reading vocabulary... 
    Allocating memory for the n-gram buffer...
    Reading text into the n-gram buffer...
    20,000 n-grams processed for each ".", 1,000,000 for each line.
    ........
    Sorting n-grams...
    Segmentation fault
    

    I had actually inserted a check for temp_file_root == NULL earlier as well. It
    seems mkdtemp isn't returning NULL, but what it returns isn't a valid pointer
    either...

     
  • Robert Twomey

    Robert Twomey - 2010-11-27

    If I move the quit(-1, "Failed to create temporary folder: %s\n",
    strerror(errno)) out of the if statement so it gets called immediately after
    mkdtemp, I get:

    Failed to create temporary folder: No such file or directory
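
One caveat about the message above: errno is only meaningful right after a call has reported failure. Calling strerror(errno) unconditionally after mkdtemp can therefore print a stale value left over from some earlier call, even when mkdtemp itself succeeded. A minimal sketch of the check is below. make_temp_dir is a hypothetical helper, not cmuclmtk code, and the change actually committed to trunk is not shown in this thread.

```c
#define _DEFAULT_SOURCE /* asks glibc to expose mkdtemp(); harmless elsewhere */
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h> /* glibc declares mkdtemp() here */
#include <string.h>
#include <unistd.h> /* OS X documents mkdtemp() under this header */

/* Hypothetical helper, not part of cmuclmtk.
 * Copies the template into buf, creates the directory, and returns buf.
 * On failure it prints the reason and returns NULL. */
char *make_temp_dir(char *buf, size_t buflen)
{
    static const char template_name[] = "cmuclmtk-XXXXXX";
    char *dir;

    assert(buf != NULL);
    if (buflen < sizeof template_name) {
        fprintf(stderr, "Buffer too small for temporary folder name\n");
        return NULL;
    }
    memcpy(buf, template_name, sizeof template_name);

    dir = mkdtemp(buf); /* rewrites the XXXXXX suffix in place */
    if (dir == NULL) {
        /* errno is valid here because mkdtemp just reported failure. */
        fprintf(stderr, "Failed to create temporary folder: %s\n",
                strerror(errno));
        return NULL;
    }
    return dir;
}
```

Including both headers matters here: without a visible prototype, an older compiler would not know that mkdtemp returns a pointer.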
    
     
  • Nickolay V. Shmyrev

    Hello.

    Thanks for your help in debugging this issue.

    I don't think the cause is in mkdtemp; it looks like it works as expected and
    returns a non-null result. The cause might be memory corruption somewhere
    later.

    I think you need to continue debugging and try to get a more meaningful
    backtrace. Yours was created without debugging symbols and lacks important
    information. You need to compile the application with the -g -O0 options to
    get a proper backtrace. See the docs linked above for more details.

     
  • Nickolay V. Shmyrev

    Hm, I was able to reproduce this issue on a 64-bit machine. Will fix it soon!
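
The thread does not say what the fix in trunk was. One reading consistent with the backtrace, offered only as an assumption: temp_file_root = 0x5fbff160 looks like the low 32 bits of a 64-bit stack address. That is what happens in pre-C99 C when a pointer-returning function such as mkdtemp is called without a visible prototype, because the compiler assumes it returns int. The sketch below only simulates that truncation. The function names and the full address are hypothetical, with the address chosen to match the bad pointer in the backtrace.

```c
#include <assert.h>
#include <stdint.h>

/* Simulates a 64-bit pointer passing through an implicit `int` return:
 * the low 32 bits are kept and then sign-extended back to 64 bits. */
uint64_t through_implicit_int(uint64_t address)
{
    uint64_t low = address & UINT64_C(0xFFFFFFFF);
    if (low & UINT64_C(0x80000000)) {
        low |= UINT64_C(0xFFFFFFFF00000000); /* sign extension of a negative int */
    }
    return low;
}

/* Usage: a hypothetical full stack address reproduces the bad pointer
 * value that gdb reported for temp_file_root. */
void check_against_backtrace(void)
{
    assert(through_implicit_int(UINT64_C(0x00007fff5fbff160))
           == UINT64_C(0x5fbff160));
}
```

A truncated pointer like this is non-NULL, which would also explain why the NULL check added earlier never fired.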

     
  • Nickolay V. Shmyrev

    Fixed in trunk, please update.

     
  • Robert Twomey

    Robert Twomey - 2010-11-30

    That fixes it for me--thanks!

     
