#182 ecl should by default not handle signals in different thread

Milestone: 12.2.2
Status: open
Owner: nobody
Labels: signals (1)
Priority: 5
Updated: 2014-09-23
Created: 2012-05-12
Creator: Paulo Andrade
Private: No

I am working on porting sagemath to fedora, and in previous work it was
necessary to patch sagemath's ecl initialization, see
http://trac.sagemath.org/sage_trac/ticket/11752

Now, when preparing a request for enhancement in fedora to
build maxima with ecl support, I ran into the problem
again, as maxima would deadlock in make check. It is probably related to
http://www.mail-archive.com/ecls-list@lists.sourceforge.net/msg00644.html
or at least shows the same symptoms.

The attached patch corrects the problem for me, and allows building maxima
with ecl enabled in fedora.

Discussion

  • I am afraid this is not a patch, but a hack which is probably hiding some other problem. POSIX does not allow us to run any useful code in a signal handler and in the future all signals will be handled in separate threads.

     
  • Paulo Andrade
    2012-05-13

    Do you have a suggestion of an existing one, or of how to create a very small,
    preferably self-contained test case to verify the problem is not happening?

    If I understand correctly, at least in my current test system with fedora 16

    $ rpm -q glibc
    glibc-2.14.90-24.fc16.6.x86_64

    it is showing the behaviour that was expected to exist only in older glibc.
    Such a test case would be filed as a bug report to glibc, and hopefully added
    to its regression tests.

     
  • Paulo Andrade
    2012-05-13

    You may also want to experiment with semaphores. I have a toy,
    work-in-progress C/C++-like language, where I just adapted the
    example from "David R. Butenhof. Programming with POSIX Threads.
    Addison-Wesley. ISBN 0-201-63392-2".
    It uses sigsuspend, but I also wrote a version using semaphores
    because helgrind understands the semantics of semaphores, which
    makes it a lot easier to have it detect race conditions.
    But I do not enable the semaphore version other than for debug
    builds because it deadlocks from time to time. Apparently the behavior
    varies depending on kernel and glibc version, and on the mix of
    semaphores and mutexes.

     
  • We cannot use POSIX semaphores in ECL because they are not available everywhere (for instance, in older versions of OS X). Also, past experiences with interrupts and POSIX threads have made me remove all dependencies on them.

     
  • Paulo Andrade
    2012-05-14

    I tried to adapt the patch to ecl-12.2.1 but it got too confusing after
    things like s/ecl_option_values\[(.*)\]/ecl_get_option(\1)/, among a few
    others...
    Examples:
    /home/pcpa/rpmbuild/BUILD/ecl-12.2.1/src/c/unixint.d:208:1: note: expected 'void (*)(int, struct siginfo_t *, void *)' but argument is of type 'void (*)(int)'
    /home/pcpa/rpmbuild/BUILD/ecl-12.2.1/src/c/unixint.d:588:4: error: too few arguments to function 'si_wait_for_all_processes'
    /home/pcpa/rpmbuild/BUILD/ecl-12.2.1/src/c/unixint.d:593:20: error: 'struct cl_core_struct' has no member named 'known_signals'

     
  • Sorry, I assumed you were using a more recent version of ECL (it would have been nice to test the patch with that version, outside sage, just to ensure things will be ok). In any case, I would be very grateful if you could test the new patch that I produced with ecl-12.2.2.

     
  • Paulo Andrade
    2012-05-17

    Sorry for the delay in responding. I updated my fedora system from
    fedora 16 to rawhide and have now resumed work on porting sagemath
    to fedora. But now I get link errors:

    /home/pcpa/rpmbuild/BUILD/ecl-12.2.1/src/c/alloc_2.d:1273: undefined reference to `GC_push_conditional'

    Looking at the gc-7.2 sources, I see that these functions are not supposed
    to be defined, i.e. GC_INNER is defined to static, or, if not defined,
    in private/gc_priv.h it has the check #if defined(__GNUC__) and
    #if __GNUC__ >= 4
    #define GC_INNER __attribute__((visibility("hidden")))

    Probably, for some reason, those symbols were visible in fedora 16. Now
    I probably should configure with --enable-precisegc, but I think it
    will still just fail, because the first error is in unconditional code:

    #if 1

    if (env->stack) {
        GC_push_conditional((void *)env->stack, (void *)env->stack_top, 1);
        GC_set_mark_bit((void *)env->stack);
    }
    

    ...

    ecl is also actually broken in rawhide:

    ================================================================================
    Package Arch Version Repository Size
    ================================================================================
    Installing:
    ecl x86_64 12.2.1-2.fc18 rawhide 3.7 M

    Transaction Summary

    Install 1 Package

    Total download size: 3.7 M
    Installed size: 18 M
    Is this ok [y/N]: y
    Downloading Packages:
    ecl-12.2.1-2.fc18.x86_64.rpm | 3.7 MB 00:02
    Running Transaction Check
    Running Transaction Test
    Transaction Test Succeeded
    Running Transaction
    Warning: RPMDB altered outside of yum.
    Installing : ecl-12.2.1-2.fc18.x86_64 1/1
    Verifying : ecl-12.2.1-2.fc18.x86_64 1/1

    Installed:
    ecl.x86_64 0:12.2.1-2.fc18

    Complete!
    $ ecl
    ecl: symbol lookup error: /usr/lib64/libecl.so.12.2: undefined symbol: GC_push_other_roots

     
  • Sorry, your bug report was not really clear: I did not realize until today that you were discussing a pre-installed copy of the Boehm-Demers-Weiser garbage collector that ships with Fedora.
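
The first reply in this thread mentions handling signals in a separate thread. As a rough illustration only (this is not ECL's code, and the names signal_thread, run_demo and received_signal are made up for the example), one common POSIX pattern is to block the signal in every thread and let one dedicated thread collect it with sigwait(), so that the handling code runs in normal thread context instead of inside an asynchronous handler:

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical names for illustration; none of this is taken from ECL. */
static int received_signal = 0;

/* Dedicated thread: waits synchronously for the signals in the given set. */
static void *signal_thread(void *arg)
{
    sigset_t *set = arg;
    int sig = 0;

    /* sigwait() returns a pending signal as an ordinary function result,
       so the code after it is not restricted to async-signal-safe calls. */
    if (sigwait(set, &sig) == 0)
        received_signal = sig;
    return NULL;
}

/* Blocks SIGUSR1, starts the signal thread, sends the signal to the
   process and returns the signal number the thread observed (-1 on error). */
int run_demo(void)
{
    sigset_t set;
    pthread_t tid;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);

    /* Block before creating threads: new threads inherit this mask, so
       the signal is never delivered asynchronously to any of them. */
    if (pthread_sigmask(SIG_BLOCK, &set, NULL) != 0)
        return -1;
    if (pthread_create(&tid, NULL, signal_thread, &set) != 0)
        return -1;

    /* Process-directed signal; it stays pending until sigwait() takes it. */
    if (kill(getpid(), SIGUSR1) != 0)
        return -1;

    pthread_join(tid, NULL);
    return received_signal;
}
```

Calling run_demo() from a small main() (built with something like cc -pthread) should return SIGUSR1. Whether this matches what ECL does or plans to do internally is not stated in this ticket; the sketch only shows the general technique.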

     

