#17 core dump during test

Thomas Klausner

With the patches from 1586031 applied, I get a core
dump when I try to run tesseract on phototest.tif on

The backtrace is:
Program terminated with signal 11, Segmentation fault.
#0 0x00000000004c1c70 in reverse32 ()
(gdb) bt
#0 0x00000000004c1c70 in reverse32 ()
#1 0x00000000004aed12 in read_squished_dawg ()
#2 0x00000000004aaded in init_permute ()
#3 0x0000000000485779 in program_editup ()
#4 0x0000000000485869 in start_recog ()
#5 0x0000000000403d04 in init_tesseract ()
#6 0x000000000040309b in main ()

I don't know yet what causes this problem.


  • Armin

    This particular problem is due to ccutil/host.h having
    inconsistent definitions for 32bit ints:

    typedef long INT32;
    typedef unsigned int UINT32;

    This is plainly wrong, as C does not guarantee that int and
    long have the same size on a given platform (on LP64
    systems, long is 64 bits and int is 32). As a temporary fix,
    changing long to int would make this particular crash go
    away (and be replaced by compile errors and other crashes).
    A better solution IMHO would be, since configure checks for
    the C99 header stdint.h anyway, to do something like:

    /* HAVE_STDINT_H is defined or not in the generated
       config_auto.h */
    #ifdef HAVE_STDINT_H
    #include <stdint.h>

    typedef int8_t INT8;
    typedef uint8_t UINT8;
    typedef int16_t INT16;
    typedef uint16_t UINT16;
    typedef int32_t INT32;
    typedef uint32_t UINT32;

    /* same for pointers and const pointers */
    #else
    /* original defines, for old compilers that do not have
       stdint.h */
    #endif

    and let the platform-specific C library figure out how to
    define the sizes.
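
    As a rough illustration of the mismatch described above, the
    sketch below compares the two quoted typedefs with their
    stdint.h-based replacements. The OLD_/NEW_ type names, the
    BITS_OF macro and the three helper functions are
    hypothetical and not taken from the tesseract sources.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical names for illustration; the real header uses INT32/UINT32. */
typedef long         OLD_INT32;   /* definition quoted in this thread */
typedef unsigned int OLD_UINT32;  /* definition quoted in this thread */
typedef int32_t      NEW_INT32;   /* proposed stdint.h-based definition */
typedef uint32_t     NEW_UINT32;  /* proposed stdint.h-based definition */

/* Width of a type in bits. */
#define BITS_OF(T) ((int)(sizeof(T) * CHAR_BIT))

/* Returns 1 if the quoted signed/unsigned pair has matching widths. */
static int old_pair_is_consistent(void)
{
    return BITS_OF(OLD_INT32) == BITS_OF(OLD_UINT32);
}

/* Returns 1 if the stdint.h-based pair is exactly 32 bits wide. */
static int new_pair_is_32_bits(void)
{
    return BITS_OF(NEW_INT32) == 32 && BITS_OF(NEW_UINT32) == 32;
}

/* Aborts (unless NDEBUG is defined) if the 32-bit assumption fails. */
static void check_int_widths(void)
{
    assert(new_pair_is_32_bits());
}
```

    On an LP64 platform old_pair_is_consistent() is expected to
    return 0, because long is 64 bits there while unsigned int
    is 32. A check like check_int_widths() could be called once
    at startup to catch a bad configuration early.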

    Note that making INT32 a 32-bit integer will break some
    pointer-to-INT32 conversions (gcc flags those as errors),
    typically where lengths are passed through pointers. Making
    the length a long or using an explicit cast
    pointer->long->INT32 fixes that problem.
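
    A minimal sketch of that two-step cast follows. Note the
    swap: it uses intptr_t from stdint.h as the intermediate
    type rather than the long suggested above, since long is
    not pointer-sized on every platform (64-bit Windows, for
    example). The helper names are hypothetical, and the round
    trip relies on implementation-defined integer/pointer
    conversions that behave as expected on mainstream
    compilers.

```c
#include <stdint.h>

typedef int32_t INT32;

/* Hypothetical helper: stash a small length in a pointer-sized slot. */
static void *length_to_ptr(INT32 len)
{
    /* Widen to a pointer-sized integer first, then convert to a pointer. */
    return (void *)(intptr_t)len;
}

/* Hypothetical helper: recover the length from the pointer-sized slot. */
static INT32 ptr_to_length(void *p)
{
    /* pointer -> pointer-sized integer -> INT32: the narrowing is explicit. */
    return (INT32)(intptr_t)p;
}
```

    Casting the pointer straight to INT32 is what the compiler
    complains about; the intermediate pointer-sized integer
    makes the narrowing a deliberate, visible step.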

    And please, please, maintainers, fix the bloody name
    spellings - s/case_sensative/case_sensitive/p and so on. The
    code is spaghetti enough without having to parse Engrish as