#30 xchm crashes with signal 11 (SIGSEGV)

CHM access (5)
manuel wolfshant

While trying to open the chm file available at http://www.springframework.net/doc-latest/reference/htmlhelp/spring-net-reference.chm, xchm crashes.
The problem was initially reported at https://bugzilla.redhat.com/show_bug.cgi?id=701827 but can always be reproduced on at least F14 and Scientific Linux 6 (a clone of RHEL 6), even when using xchm-1.19-1.
Test rpm builds are available at http://koji.fedoraproject.org/koji/taskinfo?taskID=3048888 (for Fedora 14) and http://koji.fedoraproject.org/koji/taskinfo?taskID=3048897 (for RHEL 6).


  • The CHM document is properly displayed, xCHM works fine

  • This is most likely a problem specific to the Fedora build. I don't have a Fedora system, so I can't check it out, but I did load your CHM file in my xCHM 1.19 build on a Gentoo Linux with wxGTK-2.8.11 (both Unicode and non-Unicode versions of the library). As you can see in the attached screenshot, it works just fine.

    You should report this to the Fedora xCHM package maintainers, or at least obtain a backtrace of the crash using gdb and debug versions of xCHM and wxGTK.

    • status: open --> closed-works-for-me
  • I am the maintainer :) As for the build, there is nothing special: it's just a usual configure/make + create rpm process.

    The backtrace for the segfault in 1.17.1 is available at https://bugzilla.redhat.com/attachment.cgi?id=496675 ( part of https://bugzilla.redhat.com/show_bug.cgi?id=701827 ). In 1.19.1 the failure is similar.

    If you need more details I'll be happy to provide, just teach me how to obtain them as my gdb skills are close to zero.

  • What version of wxGTK are you using? I see that you're using 2.8 from the backtrace, but 2.8.what? Also, are you building xCHM for a 64bit platform? I see in the backtrace that xchm is linked with libraries from /lib64. My builds (that work fine) are 32bit builds. I've tested on Windows as well, and it works just fine.

    So, a couple of suggestions:

    1. make sure all your libraries are either 32-bit or 64-bit, but not mixed. I'm especially talking about wxGTK here, but libstdc++ is also important.

    2. when obtaining backtraces, try to work with an xchm binary that hasn't been aggressively optimized by the compiler. What this means is, you need to run "./configure --enable-debug" for xchm (without --enable-optimize). There are an awful lot of <value optimized out>s in there, and it makes it very hard to fully understand the problem.

    Again, I'm pretty sure this is something particular to your build, because 1. I've never received a similar bug report before, and 1.17 has been out for years, and 2. I've tested with your CHM now on both Linux and Windows and it works fine.

    If all else fails, comment out these lines in chmfile.cpp, starting at line 434:

    // toBuild->Freeze();
    // bool btoc = BinaryTOC(toBuild);
    // toBuild->Thaw();
    // if(btoc)
    // return true;

    then recompile xchm and let me know if it still crashes.
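Suggestion 1 above (no mixed 32-bit and 64-bit libraries) can be checked directly on the installed binary. This is a minimal sketch, assuming the `file` and `ldd` utilities are installed and that xchm lives at /usr/bin/xchm; the path and the library name filter are assumptions for illustration, not taken from the package.

```shell
#!/bin/sh
# Sketch only: the path to the xchm binary is an assumption; override it if needed.
XCHM_BIN=${XCHM_BIN:-/usr/bin/xchm}

if [ -x "$XCHM_BIN" ]; then
    # Reports whether the executable itself is a 32-bit or 64-bit ELF binary.
    file "$XCHM_BIN"
    # Lists the shared libraries the dynamic linker would load, filtered to
    # the ones discussed in this thread (wxGTK, libstdc++, chmlib).
    ldd "$XCHM_BIN" | grep -E 'wx|stdc\+\+|chm' || true
else
    echo "$XCHM_BIN not found; nothing to check" >&2
fi
```

On a consistent 64-bit build, the libraries listed would be expected to resolve to 64-bit locations such as /lib64 or /usr/lib64.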

  • Both RHEL6 and Fedora 14 use the packages wxBase-2.8.11-3 and wxGTK-2.8.11-3
    All tests seem to have been performed on 64-bit platforms. At least those made last night with xchm-1.19.1 were performed on 64-bit platforms, for sure.
    As for your suggestions:
    1. Fedora's build system ensures that proper libs are used. There is no way to make errors here unless you really do something very stupid.
    2. The build log for the F14 packages says that the following options were used:
    ./configure --build=x86_64-redhat-linux-gnu --host=x86_64-redhat-linux-gnu --program-prefix= --disable-dependency-tracking --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc --datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64 --libexecdir=/usr/libexec --localstatedir=/var --sharedstatedir=/var/lib --mandir=/usr/share/man --infodir=/usr/share/info --disable-dependency-tracking

    --enable-optimize does not seem to be present. Do you see any other flag which might create issues?

    You mentioned that all your tests were performed with 32-bit versions of the binary. In the past I've seen programs which were not 64-bit friendly. Could there be a difference in the way your program behaves, triggered by the different architecture?

    3. I'll give it a spin and see what comes out.


  • I'm not sure what you mean by "F14 packages". I was referring to xCHM only, and the proper way to compile it for testing is after a "./configure --enable-debug". --enable-optimize needs to be omitted as a configure parameter, and --enable-debug needs to be present.

    I'm not going to say that it's impossible that the 64 bit version might have some problem that escaped my attention due to the fact that I'm only using 32-bit operating systems, but that would have to be quite unlikely since 1. where it would matter, I use explicit types such as uint32_t, uint64_t and so on, so that's not platform-dependent, and 2. the latest Gentoo Linux ebuild for xCHM has been tested on both PPC and x86_64 systems and has been deemed stable.

    The wxGTK version you're using should be OK. I've tested with 2.8.11 on Gentoo and it works fine (2.8.12 too).

    I assume the problem only occurs with this particular CHM file, is that correct? Other CHM documents work fine?

  • Sorry, I forgot to answer your other question: no, the other configure flags, assuming they've been applied to xCHM, should not affect it in any bad way. But --enable-debug should be present.

  • Unlike other distros, Fedora and RHEL use more or less the same compiler flags for all the packages available. The build system ensures that. It can be overridden, but this is done only in special cases and with a very good justification. In this case, the defaults are used. And by "F14 packages" I meant "the wxGTK packages used by all Fedora users, which were used both by the build system at compile time and by users at run time".

    And yes, you are correct. This is the only CHM file that was brought to my attention as creating problems.

    I'll see what happens after adding the --debug flag that you have recommended.

    thank you

  • Steps to get a backtrace:

    1. run "gdb ./xchm"
    2. open the CHM file
    3. after the segmentation fault message, simply type "bt" in gdb
    4. copy/paste the output

    That's all you need to know to get a backtrace. If the binary you're using has been compiled without aggressive optimization (i.e. flags such as -O2), and with the '-g' flag, the backtrace will look more helpful.
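The same steps can be scripted so that the backtrace lands in a file that is easy to attach. This is a minimal sketch, assuming it is run from the directory containing the freshly built xchm binary, that xchm accepts the file to open as a command-line argument, and that the CHM file has been saved locally; the names bt.gdb and backtrace.txt are placeholders chosen for illustration.

```shell
#!/bin/sh
# Sketch only: binary location, CHM file name, and output names are assumptions.
XCHM_BIN=./xchm
CHM_FILE=spring-net-reference.chm

# gdb commands: start the program, then print the call stack once it stops
# (for example, on SIGSEGV).
cat > bt.gdb <<'EOF'
run
bt
EOF

# Run gdb non-interactively only when gdb and the binary are actually available.
if command -v gdb >/dev/null 2>&1 && [ -x "$XCHM_BIN" ]; then
    gdb -batch -x bt.gdb --args "$XCHM_BIN" "$CHM_FILE" > backtrace.txt 2>&1
else
    echo "gdb or $XCHM_BIN not found; skipping" >&2
fi
```

If the program does not crash, gdb simply waits until the xCHM window is closed, and the "bt" command then has no stack to print.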

  • The flag is not --debug, it is --enable-debug. Please make sure you're using the correct flag. It should be listed if you run xCHM's ./configure --help.

  • I do know the difference between --debug and --enable-debug. I was just using a shortcut :)

  • I've just noticed that the Gentoo .ebuild file for chmlib says this:

    KEYWORDS="alpha amd64 hppa ~ia64 ppc ppc64 x86"

    The '~' character before 'ia64' means that the Gentoo chmlib maintainers don't think it quite works as it should on ia64 platforms. This may be because they haven't managed to compile it with the proper flags, or that there's some fault with the code itself.

    Not sure if this helps you or not. Maybe tinker with chmlib as well, on your platform?

  • Could you also paste this portion of the output from configure:

    checking for inttypes.h... yes
    checking for stdint.h... yes
    checking for unistd.h... yes
    checking for int32_t... yes
    checking for int16_t... yes
    checking for uint16_t... yes
    checking for uint32_t... yes
    checking for uint64_t... yes

    or attach the config.log file to this bug report, so we can make 100% sure those types are properly matched?

  • checking for inttypes.h... yes
    checking for stdint.h... yes
    checking for unistd.h... yes
    checking for int32_t... yes
    checking for int16_t... yes
    checking for uint16_t... yes
    checking for uint32_t... yes
    checking for uint64_t... yes
    checking chm_lib.h usability... yes
    checking chm_lib.h presence... yes
    checking for chm_lib.h... yes
    checking for chm_open in -lchm... yes
    configure: creating ./config.status

    The whole log is available from Fedora's build system: http://koji.fedoraproject.org/koji/getfile?taskID=3048889&name=build.log

    Note: IA64 is not supported by Fedora. x86_64, on the other hand, is.
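Since the full build log is long, the relevant checks can be pulled out with grep. This is a minimal sketch, assuming the configure output or build log has been saved to a local file; the file name configure.out and the variable PATTERN are placeholders, and the sample contents are a few of the lines quoted above standing in for the real log.

```shell
#!/bin/sh
# Stand-in for the real log: lines in the format configure prints.
cat > configure.out <<'EOF'
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking for int32_t... yes
checking for uint64_t... yes
checking for chm_lib.h... yes
configure: creating ./config.status
EOF

# Match only the header and fixed-width integer type checks.
PATTERN='^checking for (inttypes\.h|stdint\.h|unistd\.h|u?int(16|32|64)_t)\.\.\.'

grep -E "$PATTERN" configure.out
```

Any of these checks ending in "no" would be worth reporting, since it would mean the corresponding header or fixed-width type was not detected on that platform.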

  • You're putting -O2 in all relevant flags before running configure:

    + CFLAGS='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic'
    + export CFLAGS
    + CXXFLAGS='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic'
    + export CXXFLAGS
    + FFLAGS='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -I/usr/lib64/gfortran/modules'
    + export FFLAGS

    That forces a build with -O2 on even when --enable-optimize is not present as a configure flag. Now, I'm interested in this:

    1. Does the problem go away if -O2 is not present when compiling xCHM? If not, obtain a backtrace and proceed to step 2.

    2. Does the problem go away if you comment out the code I told you about earlier?

    Let me know.
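Step 1 above (rebuilding without -O2) can be tried by rewriting the optimization level in the exported variables before running configure, leaving the other flags alone. This is a minimal sketch, assuming a POSIX shell and using the CXXFLAGS value from the build log quoted above; substituting -O0 is an assumption made for the experiment, not something the thread prescribes.

```shell
#!/bin/sh
# Flags as they appear in the Fedora build log quoted above.
CXXFLAGS='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic'

# Replace the optimization level with -O0 and keep everything else,
# including -g so that debug symbols are still generated.
CXXFLAGS=$(printf '%s\n' "$CXXFLAGS" | sed 's/-O2/-O0/')
CFLAGS=$CXXFLAGS
export CFLAGS CXXFLAGS

echo "CXXFLAGS=$CXXFLAGS"

# Then, from the xCHM source directory:
#   ./configure --enable-debug && make
```

Because configure picks up CFLAGS and CXXFLAGS from the environment, the rewritten values take effect for that build only and do not change the distribution defaults.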

  • -O2 is standard for all Fedora packages.

    I'll let you know once I do the tests. I am at work now, with no time for compiling and such, unfortunately.
    Thank you for your support.

  • I understand that, I'm not suggesting that you stop using -O2 in production builds. But for now, we need to narrow the possibilities that lead to the issue.

  • Patch for chmfile.cpp

  • Actually, before trying anything else, please patch src/chmfile.cpp using the chmfile.diff file I've attached here. To apply the patch, download the chmfile.diff file into your xchm/src/ directory, and then run "patch < chmfile.diff".

    This should, at the very least, get rid of this compiler warning from your build:

    chmfile.cpp:1120:42: warning: dereferencing type-punned pointer will break strict-aliasing rules

    and maybe it will even help fix the whole issue. So apply that patch, compile like you normally would, and if that doesn't fix things, go ahead with the two steps outlined earlier (no -O2, chmfile.cpp code commented out).

  • I just did a test with your patch included and a normal Fedora build. There is no difference, the program still crashes.
    I'll try later to do the "remove -O2 / add debug options / remove content starting with line 434" dance.

  • The program still crashes, but at least it now compiled without any warnings, right?

  • The build log for the patched version is available at http://wolfy.fedorapeople.org/xchm/build.log in case you want to take a look. As you intended, the "dereferencing type-punned pointer will break strict-aliasing rules" warning no longer appears.

  • Thanks. I'll fix the comparison warnings, but I am sure they can't possibly cause the crash so we don't need to bother with patches for that too.

    There was a very slim chance that the strict aliasing warning, combined with the -O2 flag could have produced malfunctioning binary code, but apparently that is not the case. Onwards and forwards.

  • Is there somewhere I could SSH maybe, where I could compile xCHM and debug for myself? On a Fedora x86_64 system, with X forwarding via SSH?

    If it's possible and you prefer it that way let me know, but don't post user access data here.
