
#3699 Simultaneous Compilation in parallel maxima processes fails

None
wont-fix
nobody
None
5
2021-01-09
2021-01-06
No

Debugging this one took a lot of time because it was a real heisenbug: it triggered reliably on the CI, but only if I didn't ask it to provide much debug information.

One symptom was:

21:24:54: Debug: Failed to shown notification: Failed to execute child process dbus-launch (No such file or directory)
21:24:54: Failed to shown notification: Failed to execute child process dbus-launch (No such file or directory)

Your C compiler failed to compile the intermediate file.
loadfile: failed to load /usr/share/maxima/5.43.2/share/draw/draw.lisp
 -- an error. To debug this try: debugmode(true);

Another symptom, which showed up only after looking at what felt like 100 CI runs:

;; Note: Tail-recursive call of BIPART was replaced by iteration.
;; Note: Tail-recursive call of BIPART was replaced by iteration.
Message from maxima's stderr stream: /home/runner/work/wxmaxima/wxmaxima/build/.maxima/binary/5_43_2/gcl/GCL_2_6_12/share/draw/gnuplot.c:14028:1: warning: null character(s) ignored
14028 | object V1853;object V1854;
      | ^
/home/runner/work/wxmaxima/wxmaxima/build/.maxima/binary/5_43_2/gcl/GCL_2_6_12/share/draw/gnuplot.c: In function LI61:
/home/runner/work/wxmaxima/wxmaxima/build/.maxima/binary/5_43_2/gcl/GCL_2_6_12/share/draw/gnuplot.c:14028: error: expected declaration specifiers before ) token
14028 | object V1853;object V1854;
      | 
/home/runner/work/wxmaxima/wxmaxima/build/.maxima/binary/5_43_2/gcl/GCL_2_6_12/share/draw/gnuplot.c:14029:2: error: expected declaration specifiers before vs_top
14029 | {  VMB70 VMS70 VMV70
      |  ^~~~~~
/home/runner/work/wxmaxima/wxmaxima/build/.maxima/binary/5_43_2/gcl/GCL_2_6_12/share/draw/gnuplot.c:14030:2: error: expected declaration specifiers before if
14030 |  goto TTL;
      |  ^~
/home/runner/work/wxmaxima/wxmaxima/build/.maxima/binary/5_43_2/gcl/GCL_2_6_12/share/draw/gnuplot.c:14033:2: error: expected declaration specifiers before V1903
14033 |  goto T5158;
      |  ^~~~~
/home/runner/work/wxmaxima/wxmaxima/build/.maxima/binary/5_43_2/gcl/GCL_2_6_12/share/draw/gnuplot.c:14034:2: error: expected declaration specifiers before goto
14034 |  }
      |  ^   
/home/runner/work/wxmaxima/wxmaxima/build/.maxima/binary/5_43_2/gcl/GCL_2_6_12/share/draw/gnuplot.c:14036:2: error: expected declaration specifiers before goto
14036 |  goto T5158;
      |  ^~~~

The cause was: Two maxima processes were trying to compile draw at the same time. As a result the lisp re-generated C source files while a C compiler was starting up, which caused a compilation failure and left a non-loadable file in maxima's binary folder.

Three remedies I can think of:

  1. Add a locking mechanism for maxima's load- and compile-type commands.
  2. Make sure that during compilation all filenames include maxima's pid and are therefore unique. After compilation the result can be moved to its final destination in an atomic operation, which means that the binary is consistent whenever it is present. To support certain virus scanners on MS Windows we have to check manually, after the move operation has claimed to be successful, whether the file has really been moved, and repeat the move if it has not: if a virus scanner is still scanning a file on MS Windows the file cannot be moved, but since this is neither a permission problem nor a missing source or destination, the move command cannot report the failure.
  3. Close this ticket with a "fixme: is a problem of the lisp compiler": I am not sure if it actually is.
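
To illustrate remedy 1, here is a minimal sketch of a lock file, written in Python rather than Lisp purely for illustration. The helper name `compile_lock` and the lock path are hypothetical and nothing like this exists in maxima today. It relies on the fact that creating a file with `O_CREAT | O_EXCL` fails if the file already exists, so only one process at a time can hold the lock.

```python
import os
import time
from contextlib import contextmanager


@contextmanager
def compile_lock(lock_path, timeout=60.0, poll=0.05):
    """Hold an exclusive lock file while compiling (hypothetical helper)."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_EXCL makes creation fail if the lock file already exists.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(poll)
    try:
        # Record the owner's pid so that a human can inspect a stale lock.
        os.write(fd, str(os.getpid()).encode())
        yield
    finally:
        os.close(fd)
        os.unlink(lock_path)
```

A process would wrap its compile step in `with compile_lock(path):`. The sketch does not handle a lock left behind by a crashed process, so a real implementation would need some stale-lock detection on top of it.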

Related

Bugs: #3698

Discussion

  • Leo Butler

    Leo Butler - 2021-01-06

    Here are a few random thoughts:

    1. Wouldn't it make more sense to pre-compile all the packages that need
      it? To avoid duplication of effort?

    2. I wonder if this bug report should be filed with GCL? Based on your
      description, it seems like this collision can/will happen in parallel
      compilation with GCL (e.g. with Axiom).

    3. Do you know why this does not affect other maxima+lisp combinations?

    4. Use of pids to name files uniquely is decidedly inferior to mktemp
      and relatives.

    Leo
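
Combining remedy 2 from the ticket with point 4, a rough sketch could look like the following. It is Python, for illustration only, and the function name `publish_atomically` and its parameters are made up. The output is written to a unique temporary file created by `tempfile.mkstemp` in the destination directory and then moved into place with `os.replace`, with the result of the move checked and the move retried as the ticket suggests for MS Windows.

```python
import os
import tempfile
import time


def publish_atomically(data, final_path, retries=5, delay=0.2):
    """Write data to a unique temp file, then move it to final_path."""
    directory = os.path.dirname(os.path.abspath(final_path))
    # mkstemp creates the file exclusively, so two processes never share a name.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        for _ in range(retries):
            try:
                # Same directory, so this is a rename rather than a copy.
                os.replace(tmp_path, final_path)
            except PermissionError:
                # For example, another program still has the file open.
                pass
            # Do not trust the call alone: check that the move really happened.
            if os.path.exists(final_path) and not os.path.exists(tmp_path):
                return
            time.sleep(delay)
        raise OSError(f"could not move {tmp_path} to {final_path}")
    finally:
        # Never leave a half-finished file behind.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
```

With this pattern a reader either sees no file at `final_path` or a complete one, which is the property the ticket asks for. Whether the compiler's intermediate files can be redirected this way in a given lisp is a separate question.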

     
    • Wolfgang Dautermann

      If every Lisp file were compiled as part of a complete Maxima build, Maxima would not compile on e.g. 32-bit SBCL (I don't know whether other Lisps are affected too; maybe CMUCL, since SBCL is a fork of it).

      The compilation of Lapack requires more memory than is available there, so that package cannot be loaded there.

       
  • Gunter Königsmann

    1. or moving over draw from share into maxima's src directory?
    2. I didn't test this with ecl and sbcl. But you might be entirely right in this point.
    3. see 2.
    4. Good idea.
     
    • Leo Butler

      Leo Butler - 2021-01-06
      1. or moving over draw from share into maxima's src directory?

      I don't think that will necessarily fix the problem you see. For
      example, lapack (and others) are compiled on first loading.

      It may make more sense to create a custom image of maxima that already
      contains all the packages you want to test, before you start the
      testing.

      That may be independently desirable: being able to configure the build
      process to create a "core" image or some "maximal" image might be
      useful. STACK, for example, offers some options to use an "optimized"
      maxima image, but as far as I know there is no testing of that image.

      1. I didn't test this with ecl and sbcl. But you might be entirely
        right in this point.

      Yes, my guess is that the collisions will happen with all lisps that
      compile lisp code, as long as you are running the same lisp in parallel
      jobs.

      Leo

       
      • Gunter Königsmann

        Working around the problem isn't too hard once one has found its cause. The reason I finally reported it was a different aspect of the problem, though: if a user ever triggers it, the remains of the unsuccessful compilation will linger in the user's binary folder. That means the user will never again be able to load this package, unless the binary folder is manually deleted or is invalidated by a maxima update or a lisp change.
        A race condition that isn't easy to trigger, but that permanently breaks a package for a user, isn't nice.

        Fortunately, if sbcl runs out of memory during compilation of a package, no remains of the compilation attempt are left behind. Or at least I have never seen any.

         
        • Gunter Königsmann

          On the other hand, on second thought: the fact that I got an invalid object file with gcl, while with sbcl I got no object file at all in every case where compilation failed, might possibly be caused by a gcl bug... I don't have enough data about the failures to do any statistics, though...

           
  • Kris Katterjohn

    Kris Katterjohn - 2021-01-06

    Is [#3698] the same as this one or is it different? At first glance it looks like you made two essentially identical tickets.

    If they're different then I think you should clearly explain the difference. If they're the same then we should close [#3698] since there is some activity here.

     


    • Gunter Königsmann

      I tried to report this bug twice because Sourceforge told me that I had triggered an internal error, and when I looked at the bug list the bug I had tried to report didn't appear (perhaps it just hadn't appeared yet). Now bug #3698 reliably crashes my Firefox, so if you are able to close #3698 I would beg you to do so for me.

       
      • Kris Katterjohn

        Kris Katterjohn - 2021-01-06

        I've closed [#3698]. That report is huge and that may be the cause of the problems you've seen.

         


        • Gunter Königsmann

          Thanks a lot!

           
  • Raymond Toy

    Raymond Toy - 2021-01-09

    Seems to me that if you want parallel builds, you need to arrange for each build to save its outputs in a different location so that .o's and fasls are stored in different places. Is this not happening?
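
As a sketch of what separate output locations could look like for a test harness that launches several maxima processes: the Python helper below (the name `isolated_maxima_env` is made up) gives each process its own user directory. It assumes that the maxima in use honors the `MAXIMA_USERDIR` environment variable and keeps its binary folder underneath that directory, which should be verified for the version and lisp in question.

```python
import os
import tempfile


def isolated_maxima_env(base_env=None):
    """Return an environment with a private user dir (hypothetical helper)."""
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: compiled share packages are stored below MAXIMA_USERDIR.
    env["MAXIMA_USERDIR"] = tempfile.mkdtemp(prefix="maxima-userdir-")
    return env
```

Each parallel job would then be started with something like `subprocess.run([...], env=isolated_maxima_env())`. The price is that every process compiles draw for itself instead of sharing one compiled copy.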

     
    • Gunter Königsmann

      Building maxima in parallel has worked fine for me until now. This means that both make and defsystem are aware of nearly all interdependencies of the source files. There are some interdependencies at least one of these tools isn't aware of: Sometimes changes to maxima's source lead to test failures unless one attempts a clean build.

      Until a few years ago plotting from one maxima process re-used the temporary files from all other maxima processes that plotted at the same time. But that has been resolved by now => the only place I still ran into problems is the half-way exotic case this bug ticket is about.

       
  • Robert Dodier

    Robert Dodier - 2021-01-09

    I dunno, this doesn't look like an issue for Maxima to fix. If anything, this is a Lisp compiler problem -- if it is supposed to support parallel builds, then it has to name output files / folders accordingly. But parallel builds are outside the scope of Common Lisp, so Maxima can't assume that they are possible. They might be supported by some Common Lisp compilers, or they might not.

    I'm inclined to close this as "won't fix", since the problem is outside of Maxima.

     
    • Gunter Königsmann

      As I tried to state initially I am fine with that solution. Seems like the spell checker has changed the meaning of my initial ticket, though ;-(

       
  • Gunter Königsmann

    • status: open --> wont-fix
     
