
#2467 Tests fail with gnuplot-5.4.2 on arm

None
closed-fixed
nobody
None
2021-12-25
2021-10-17
Sam James
No

Originally reported downstream in Gentoo at https://bugs.gentoo.org/811927.

I'm not sure how to understand the error output, so this is the last part of the log, and I've attached the output in full. Let me know if I can provide more useful debugging information.

    <cr> to continue
    make[3]: *** [Makefile:730: check-noninteractive] Error 139
    make[3]: Leaving directory '/var/tmp/portage/sci-visualization/gnuplot-5.4.2/work/gnuplot-5.4.2/demo'
    make[2]: *** [Makefile:585: check-am] Error 2
    make[2]: Leaving directory '/var/tmp/portage/sci-visualization/gnuplot-5.4.2/work/gnuplot-5.4.2/demo'
    make[1]: *** [Makefile:435: check-recursive] Error 1
    make[1]: Leaving directory '/var/tmp/portage/sci-visualization/gnuplot-5.4.2/work/gnuplot-5.4.2/demo'
    make: *** [Makefile:429: check-recursive] Error 1
     * ERROR: sci-visualization/gnuplot-5.4.2::gentoo failed (test phase):
     *   emake failed

1 Attachments

Discussion

  • Ethan Merritt

    Ethan Merritt - 2021-10-17

    Thanks for the error report.
    Unfortunately there is really no useful information in the build log, other than showing which demo script was in progress when something caused the run to stop. Can you run that one demo again under gdb and/or valgrind? E.g.

    GNUTERM=dumb
    valgrind --tool=memcheck --log-file=valgrind.log gnuplot-5.4.2 isosurface.dem < /bin/yes
    

    That runs here with no errors reported, but I don't have an arm system to test it on. If it fails for you, then the content of valgrind.log may be informative.

    Possibly the same bug as #2450 "Segfault on s390x with 5.4.1 during docs build"?
    https://sourceforge.net/p/gnuplot/bugs/2450/
    That one is also not reproducible on x86, and the failure shown in its valgrind output is in the same code that is exercised by the isosurface demo.

     
  • Sam James

    Sam James - 2021-10-17

    Thanks for the quick reply!

    Yeah, I felt a bit guilty even submitting it, but I was struggling to get more information out because I'm quite new to gnuplot.

    I was able to run that one demo again and it segfaulted outside of Portage (the package manager), which is good. I've also got a backtrace:

    (gdb) bt
    #0  tessellate_one_cube (iz=12514560, iy=<optimized out>, ix=<optimized out>, plot=0x1ebdb68) at vplot.c:320
    #1  vplot_isosurface (plot=plot@entry=0x1ebdb68, downsample=12514176) at vplot.c:213
    #2  0x00acf720 in do_3dplot (plots=0x0, pcount=1, pcount@entry=2, replot_mode=4) at graph3d.c:1107
    #3  0x00b03a50 in eval_3dplots () at plot3d.c:2754
    #4  0x00b048c8 in plot3drequest () at plot3d.c:399
    #5  0x00aa98cc in splot_command () at command.c:2350
    #6  0x00aab1bc in command () at command.c:698
    #7  do_line () at command.c:468
    #8  0x00aef4e0 in load_file (fp=fp@entry=0x1ebc540, name=<optimized out>, calltype=calltype@entry=4) at misc.c:335
    #9  0x00a9a208 in main (argc_orig=<optimized out>, argv=0xfff20288) at plot.c:675
    

    I've attached the 'bt full' output too. It does seem to be the same function as in https://sourceforge.net/p/gnuplot/bugs/2450/ (and the same line?). BTW, I do have access to s390 (and could probably give you access too if it would be helpful to s390 or some other exotic arches), so let me know if I can do something there as well.

    I might be doing Valgrind wrong but this is what I get:

    /var/tmp/portage/sci-visualization/gnuplot-5.4.2/work/gnuplot-5.4.2/demo $ valgrind -s --tool=memcheck --log-file=valgrind.log ../src/gnuplot isosurface.dem < /bin/yes
    Segmentation fault (core dumped)
    
    /var/tmp/portage/sci-visualization/gnuplot-5.4.2/work/gnuplot-5.4.2/demo $ cat valgrind.log
    ==755448== Memcheck, a memory error detector
    ==755448== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
    ==755448== Using Valgrind-3.17.0 and LibVEX; rerun with -h for copyright info
    ==755448== Command: ../src/gnuplot isosurface.dem
    ==755448== Parent PID: 1
    ==755448==
    ==755448== Invalid write of size 4
    ==755448==    at 0x4001458: _dl_start (rtld.c:503)
    ==755448==    by 0x400088F: ??? (in /lib/ld-2.33.so)
    ==755448==  Address 0xfeaf0214 is on thread 1's stack
    ==755448==  72 bytes below stack pointer
    ==755448==
    ==755448==
    ==755448== Process terminating with default action of signal 11 (SIGSEGV): dumping core
    ==755448==  Access not within mapped region at address 0xFEAEFF8C
    ==755448==    at 0x4001EC0: dl_main (rtld.c:1128)
    ==755448==    by 0x4019F83: _dl_sysdep_start (dl-sysdep.c:250)
    ==755448==    by 0x4001747: _dl_start_final (rtld.c:489)
    ==755448==    by 0x4001747: _dl_start (rtld.c:582)
    ==755448==    by 0x400088F: ??? (in /lib/ld-2.33.so)
    ==755448==  If you believe this happened as a result of a stack
    ==755448==  overflow in your program's main thread (unlikely but
    ==755448==  possible), you can try to increase the size of the
    ==755448==  main thread stack using the --main-stacksize= flag.
    ==755448==  The main thread stack size used in this run was 8388608.
    ==755448==
    ==755448== HEAP SUMMARY:
    ==755448==     in use at exit: 0 bytes in 0 blocks
    ==755448==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
    ==755448==
    ==755448== All heap blocks were freed -- no leaks are possible
    ==755448==
    ==755448== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 5 from 5)
    ==755448==
    ==755448== 1 errors in context 1 of 1:
    ==755448== Invalid write of size 4
    ==755448==    at 0x4001458: _dl_start (rtld.c:503)
    ==755448==    by 0x400088F: ??? (in /lib/ld-2.33.so)
    ==755448==  Address 0xfeaf0214 is on thread 1's stack
    ==755448==  72 bytes below stack pointer
    ==755448==
    --755448--
    --755448-- used_suppression:      5 glibc-2.33-on-SUSE-10.3-(x86) /usr/libexec/valgrind/default.supp:1338
    ==755448==
    ==755448== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 5 from 5)
    
     

    Last edit: Sam James 2021-10-17
    • Ethan Merritt

      Ethan Merritt - 2021-10-17

      That valgrind log seems to indicate it segfaulted before it even got to run the program. Does it do the same thing if you use some other innocuous demo like 'arrows.dem'?

      I will stare at the ASAN output and see if I can make any sense of it. It does look like the same thing as the earlier s390 report.

       
  • Sam James

    Sam James - 2021-10-17

    AddressSanitizer has given the following:

                          [t=0:20] '+' using (cos($1)):(sin($1)):($1) ***A***
    
                                 A**A*     +
                            AAAAA*A** *A**A|A**A*
                                           |     *A**A*
                                           |-+         A*A
                                           |              A
                                           |              A
                                           |-+       *A*A*A
                               A*A*A**A*   |    *A**A
                            AAA*A**A*A***A*|A*AA*
                                           |     *A**A*
                                           |     +---+-A*--+
               +------                     |-+---   ----- A
                      --------------     +-|-   ----      A
                                    --------------      A*A
                                 +---------      +--------------
                             +--------                         +------+-+
                     +-- +--------
                        --+--
    
    <cr> to continuevfill from + :
            radius 0.9 gives a brick of 15 voxels on x, 15 voxels on y, 5 voxels on z
            number of points input:          55
            number of voxels modified:    29540
    
    
                        Fill voxel grid around the points shown
    
                                                    voxel grid points    G
                         G GGGGGGGGGG GGGG
                GGGGGGGGGGGGGGGGGGGGGGGGGGG+GGGGGGGGGGGGG
              GGGGGGGGGGGGGGGGGGGGGGGGGGGGG|GGGGGGGGGGGGGGGGGGGG
              GGGGGGGGGGGGGGGGGGGGGGGGGGGGG|GGGGGGGGGGGGGGGGGGGGGGG
               G GGGGGGGGGGGGGGGGGGGGGGGGGG|GGGGGGGGGGGGGGGGGGGGGGGGGGGG
                       G    GGGGGGGG G G G |GGGGGGGGGGGGGGGGGGGGGGGGGGGGG
                                           |GGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
                GGGGGGGGGGGGGGGGGGGGGGGGGGG|GGGGGGGGGGGGGGGGGGGGGGGGGGGG
              GGGGGGGGGGGGGGGGGGGGGGGGGGGGG|GGGGGGGGGGGGGGGGGGGGGGGGG
              GGGGGGGGGGGGGGGGGGGGGGGGGGGGG|GGGGGGGGGGGGGGGGGGGGGGG
                GGGGGGGGGGGGGGGGGGGGGGGGGGG|GGGGGGGGGGGGGGGGGGGGGGGGGGGG
                     G  G GGGGGGGGGGGGGGGGG|GGGGGGGGGGGGG--+GGGGGGGGGGGGGG
               +------++                   |GGGGGGG------GGGGGGGGGGGGGGGG
                      --------------++---+-|GG-----GGGGGGGGGGGGGGGGGGGGGG
                                 +----------------GGGGGGGGGGGGGGGGGGGG G
                             +----+-----      G GG---------------G G   -+
                     +-- +---------                             ------+
                        --+--
    
    <cr> to continueAddressSanitizer:DEADLYSIGNAL
    =================================================================
    ==761556==ERROR: AddressSanitizer: SEGV on unknown address 0x00dbc268 (pc 0x00c9641c bp 0x00d2eed3 sp 0xff9447d8 T0)
    ==761556==The signal is caused by a READ memory access.
        #0 0xc9641c in tessellate_one_cube /var/tmp/portage/sci-visualization/gnuplot-5.4.2/work/gnuplot-5.4.2/src/vplot.c:320
        #1 0xc9641c in vplot_isosurface /var/tmp/portage/sci-visualization/gnuplot-5.4.2/work/gnuplot-5.4.2/src/vplot.c:213
        #2 0xaac38c in do_3dplot /var/tmp/portage/sci-visualization/gnuplot-5.4.2/work/gnuplot-5.4.2/src/graph3d.c:1107
        #3 0xb3ff64 in eval_3dplots /var/tmp/portage/sci-visualization/gnuplot-5.4.2/work/gnuplot-5.4.2/src/plot3d.c:2754
        #4 0xa4d098 in splot_command /var/tmp/portage/sci-visualization/gnuplot-5.4.2/work/gnuplot-5.4.2/src/command.c:2350
        #5 0xa504ac in command /var/tmp/portage/sci-visualization/gnuplot-5.4.2/work/gnuplot-5.4.2/src/command.c:698
        #6 0xa504ac in do_line /var/tmp/portage/sci-visualization/gnuplot-5.4.2/work/gnuplot-5.4.2/src/command.c:468
        #7 0xb0517c in load_file /var/tmp/portage/sci-visualization/gnuplot-5.4.2/work/gnuplot-5.4.2/src/misc.c:335
        #8 0xa25188 in main /var/tmp/portage/sci-visualization/gnuplot-5.4.2/work/gnuplot-5.4.2/src/plot.c:675
        #9 0xf72ffe38 in __libc_start_main /usr/src/debug/sys-libs/glibc-2.33-r7/glibc-2.33/csu/libc-start.c:332
    
    AddressSanitizer can not provide additional info.
    SUMMARY: AddressSanitizer: SEGV /var/tmp/portage/sci-visualization/gnuplot-5.4.2/work/gnuplot-5.4.2/src/vplot.c:320 in tessellate_one_cube
    ==761556==ABORTING
    
     
  • Ethan Merritt

    Ethan Merritt - 2021-10-17

    I have a hypothesis that the issue is not the platform per se (little/big endian) but the compiler behavior when making a signed test against a (static const char) variable promoted to (int).

    If you can do so, would you please test the attached patch for the source files marching_cubes.h and qt_table.h? It explicitly declares these static tables as (static signed char) rather than (static char).

     

    Last edit: Ethan Merritt 2021-10-17
    • Sam James

      Sam James - 2021-10-18

      The patch works! That test no longer segfaults. Let me try the whole test suite now.

      I suspect this is because char defaults to unsigned on arm? I'm a little surprised this didn't fail at compile time. I've seen this sort of failure before in e.g. xmrig (https://github.com/xmrig/xmrig/issues/2527).

      EDIT: Oh, duh, I see now. C isn't C++ and it's just going to wrap around for us.

       

      Last edit: Sam James 2021-10-18
      • Sam James

        Sam James - 2021-10-18

        All tests passed!

         
  • Ethan Merritt

    Ethan Merritt - 2021-10-18

    Yes, defaulting to "unsigned char" was the issue. I am surprised that there were no compiler warnings, either from testing for negative values or from initializing values to -1.
    The fix is commit a2359efd for the development version, and it will be included in 5.4.3.
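
For readers unfamiliar with the underlying C rule, here is a minimal sketch of why a plain `char` table with a -1 terminator can misbehave where `char` is unsigned. The table contents and all function names are hypothetical and are not taken from marching_cubes.h or qt_table.h; the sketch only illustrates the signedness issue described above, not the actual gnuplot code.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical table, NOT gnuplot's data: entries end at a -1 terminator. */
static const signed char demo_table[] = { 0, 3, 8, -1 };

/* Walk the table until the negative terminator. This is safe because the
 * element type is explicitly signed, so -1 stays negative on every target. */
int count_entries(const signed char *table)
{
    int n = 0;
    while (table[n] >= 0)
        n++;
    return n;
}

/* The value an unsigned char holds after being initialized with -1.
 * The conversion is well defined in C and wraps to UCHAR_MAX, so a
 * "less than zero" test on it can never succeed. */
int terminator_as_unsigned_char(void)
{
    unsigned char stored = (unsigned char)-1;
    return stored; /* promoted to int without becoming negative */
}

/* Nonzero when plain char is signed for this compiler and target. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Print a short report; call this from main() to try it out. */
void report(void)
{
    printf("plain char is %s\n",
           plain_char_is_signed() ? "signed" : "unsigned");
    printf("entries before terminator: %d\n", count_entries(demo_table));
    printf("-1 stored in unsigned char reads back as %d\n",
           terminator_as_unsigned_char());
}
```

Calling `report()` from a small `main()` should show which signedness the local compiler picks for plain `char`. With GCC or Clang, building with `-funsigned-char` is one way to approximate the arm default on an x86 machine, although whether that reproduces this particular crash has not been verified here.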

     
  • Ethan Merritt

    Ethan Merritt - 2021-10-18
    • status: open --> pending-fixed
    • Group: -->
    • Priority: -->
     
  • Ethan Merritt

    Ethan Merritt - 2021-12-25
    • Status: pending-fixed --> closed-fixed
     
