Menu

#2797 function block: SEGV occurs due to invalid context call in a function block

closed-fixed
nobody
None
2025-06-04
2025-05-15
No

The following scripts cause a segmentation fault.

(1) "set mark ..."

function $setup_mark() <<end
  set mark 1 "+" using 1:1
end

plot "+" using ($setup_mark(),$1):1 

(2) "set palette file ..."

function $setup_palette() <<end
  set palette file "pal.txt" using 1:1:1
end

plot "+" using ($setup_palette(),$1):1 

These "set" commands may call df_open(), which can conflict with other functions performing data input processing.

(3) function redefinition

function $setup_func() <<end
  f(x) = exp(x)
end

f(x) = (x>0)?$setup_func():cos(x)

plot f(x)

This error depends on the environment. This causes SEGV on my desctop PC, but doesn't on note PC.

(4) exit error

function $setup() <<end
  exit error
end

plot "+" using ($setup(),$1):1 with lines

(5) replot in plot

function $setup() <<end
  replot
end

plot sin(x) with points

plot "+" using ($setup(),$1):1 with lines

This error also depends on the environment. This causes SEGV on my note PC, but I got the following message on desktop PC about once every few times.

gnuplot(52381,0x1f5f18c80) malloc: *** error for object 0x38000000000000: pointer being freed was not allocated
gnuplot(52381,0x1f5f18c80) malloc: *** set a breakpoint in malloc_error_break to debug
Abort trap: 6

My test environments is Mac OS. I didn't try on other OS.

Discussion

  • Ethan Merritt

    Ethan Merritt - 2025-05-15

    Cases 1, 2, and 5 are dumb mistakes on my part. In each case the code sets the interlock flag inside_plot_command correctly. But it stupidly forgets to check whether it was already set, which was the whole point of the interlock. So as written it only protected in one direction; e.g. it would catch an attempt to plot inside set palette but didn't catch the converse set palette inside plot.

    Case 3 - I'm still looking at this one.

    Case 4
    On your machine that segfaults from exit error, does it also segfault if that command is issued directly? I suspect that has nothing to do with function blocks, but rather a failure to handle a null error message string.

    I don't get a segfault here, nor do I catch any problem in valgrind or gdb. However passing a NULL message string to vsnprintf() is very suspect. I'm not sure whether it is sufficient to check for a missing error message in exit_command() or whether it is better to worry about other call sites as well and put an explicit test for NULL in int_error().

     

    Last edit: Ethan Merritt 2025-05-15
    • Hiroki Motoyoshi

      Indeed, Case 4 caused a SEGV independently of the function block, simply due to exit error. It seems the issue is that a NULL value is being passed to "int_error(NO_CARET, try_to_get_string());". I think it would be better if "exit_command()" checked the return value of "try_to_get_string()" and for example, passed an empty string ("") if it is NULL. Also, since the second argument of int_error() is a format string, I believe it should be restricted to a string format such as "%s". Otherwise, unexpected behavior may occur if the message string contains format specifiers—for example, exit error "%p %g %i %lf".

       
      • Ethan Merritt

        Ethan Merritt - 2025-05-15

        It is intentional that the message string can consist of a format specifier with passed paramenters . That's the point of using stdarg and vsnprintf. Here is an example from internal.c

        int_error(NO_CARET, "function %s requires %d variables", 
                                        udf->udf_name, udf->dummy_num);
        

        On my machine the va_start() implementation seems to accept a NULL format with no complaint, but I don't see an explicit statement about that in the man page. So yeah, it's probably best to check in both the caller and in int_error.

         

        Last edit: Ethan Merritt 2025-05-16
        • Hiroki Motoyoshi

          The warn command has also uncontrolled format string problem.
          For example, warn "%p %g %i %lf". This also comes from giving input from the user to the format string in "warn_command()" and should be corrected.

           
          • Ethan Merritt

            Ethan Merritt - 2025-05-16

            I think I understand your point now. The problem is not with int_error() or int_warn() per se, but the commands warn "message" and exit error "message" should act via int_error(caret, "%s", message) rather than int_error(caret, message). Right?

             
            • Hiroki Motoyoshi

              You're right!

              There is another place where the format string for the standard C "*printf()" functions is directly specified by the user:

              • set mouse mouseformat

              For example, on my macOS environment, the following script results in a "Bus error: 10". Since the implementation of "*printf()" may differ between operating systems or development environments, this may not reproduce on other systems.

              NARGS=10000
              
              set term wxt
              fmt = ""; do for [k=1:NARGS] {fmt = fmt."%x" }
              set mouse mouseformat fmt
              plot 0
              pause -1
              

              Letting the user provide raw format strings for "*printf()" introduces a potential vulnerability and should be addressed.

              In the case of set mouse mouseformat, I believe it would be safer — if possible — to internally utilize the f_sprintf() routine. This function is aware of the number of arguments and is more robust against malformed format strings, since the formatting process is kept under control.

               
  • Hiroki Motoyoshi

    Here is the LLDB output from the PC (macOS on arm64) where a SEGV occurs in Case 3.

    Note: On other PC (macOS on intel x86_64 built with MacPort library), case 3 doesn't cause SEGV.

     
    • Hiroki Motoyoshi

      In conclusion, Case 3 is also a bug (accessing freed memory).

      This occurs because f(x) is redefined during its own evaluation, causing the original action table (AT) of f(x) to be deallocated via "free_at()".

      With typical "free(3)" behavior, the memory block is often left unchanged immediately after being freed. As a result, the evaluation of the original f(x)'s AT may continue without immediate issues, even though it has already been deallocated.

      The reason this bug becomes apparent on macOS is:

      The "free(3)" function fully zeroes many blocks immediately after in macOS 13.

      Here is the man page for "free(3)" on macOS 15.4.1:
      https://man.freebsd.org/cgi/man.cgi?query=malloc\&sektion=3\&manpath=macOS+15.4.1
      See the description of "MallocZeroOnFree" in the ENVIRONMENT section.

      You can force this bug to surface (even on platforms other than macOS) by modifying "real_free_at()" in "eval.c" as follows:

      diff --git a/src/eval.c b/src/eval.c
      index cf096447d..0f02641d5 100644
      --- a/src/eval.c
      +++ b/src/eval.c
      @@ -848,6 +848,8 @@ real_free_at(struct at_type *at_ptr)
           for (i=0; i<at_ptr->a_count; i++) {
              struct at_entry *a = &(at_ptr->actions[i]);
              free_action_entry(a);
      +        /* Let's try assigning 0 (PUSH) to a->index, assuming that free(at_ptr) will be called later. */
      +        a->index = 0;
           }
           free(at_ptr);
      

      Although direct function redefinition during evaluation is not normally possible, it becomes feasible when going through a function block. Some form of restriction is needed to prevent this.

       
      • Ethan Merritt

        Ethan Merritt - 2025-05-16

        The thing that is confusing me is that the plot does switch over to the new function for points with x>0 (on a system where it doesn't segfault of course). That means the program is calling the new function, not the old function, once it gets past the point where the substitution occurs. How does that happen? So far as I can see the plot structure stores the function pointer only once at the beginning of the plot (plot2d.c:3762). So it should be using that old function throughout, not the new one.

                /* Used to generate samples and to optimize watchpoints */
                this_plot->plot_function.at = at_ptr;
        

        As to whether this really needs to be fixed, I am not sure. I don't think the user can reasonably expect it to be possible to redefine a function while inside evaluation of that same function. I do think it is useful and reasonable to define functions inside a function block, however, so I don't like the alternative of forbidding that altogether.

         
        • Hiroki Motoyoshi

          Case 3 may have caused some confusion as an example.
          I believe the following simpler reproduction script is easier to understand:

          (3')

          function $setup() <<end
            f(x) = 1
            return 0
          end
          
          f(x) = 1 + $setup()
          
          print f(1)
          

          I hardly imagine anyone else would write such a script, but it is a new behavior brought about by the introduction of the function block.

          It is not necessary to prohibit all function definitions inside a function block.
          What should be prohibited, I believe, is the redefinition of a function with the same name while that function is being executed.

           

          Last edit: Hiroki Motoyoshi 2025-05-16
          • Ethan Merritt

            Ethan Merritt - 2025-05-16

            I can think of two conceptual approaches to deal with this if it really is necessary, but they are both non-trivial.

            option 1) Instead of really freeing the at_type action table structure of a function in routine free_at(), it would save the entire structure somewhere for actual deletion later. That would mean adding a bunch of bookkeeping routines to manage this collection of dead function structures, all for the extremely rare case of a redefinition that could lead to use-after-free.

            option 2) When evaluate_at() is called to execute a function it could walk recursively through the entire list of actions to be performed looking for CALL or CALLN. It could then inspect the argument to that call to find the function name. A model for this is the code in disp_at(). Then what? Set a flag (might have to add a new field somewhere) to show that function must not be redefined will the evaluation proceeds? Then somehow clear all those flags afterwards.

            Both options seem expensive. The cost would be incurred for every function evaluation while the benefit is seen only in handling the error path for something that can be documented as not permitted. Can you think of a simpler approach?

             
            • Hiroki Motoyoshi

              Option 3)

              Although this would expand the scope of restriction, I would like to propose disallowing function definitions inside a function block if that block may be called from within a function.

              To implement this behavior, I suggest introducing two flags (tentative names):

              • A flag in the fblock structure: called_from_function
              • A global variable: prohibit_function_definition

              These names are provisional — please feel free to suggest better, more appropriate names.

              The processing logic using these flags would be as follows:

              1. During function definition
                If a function block is encountered, set the called_from_function flag of that function block.

              2. During function block evaluation
                If the called_from_function flag is set:

                • Before evaluation, set the global variable prohibit_function_definition to true.
                • After evaluation, restore prohibit_function_definition to its original state.
              3. When defining a function via define()
                If prohibit_function_definition is set, terminate with an error.

              The main concern is whether the global variable prohibit_function_definition can be reliably set and restored, especially in the presence of nested or recursive evaluations.

              This approach does not degrade the performance of normal function evaluation.

              As for the scope of restriction, I believe the impact is minimal, since I cannot think of any practical use case where defining a function inside a function block that might be called from within another function would be necessary.

              Whether it is acceptable that the prohibition rules within a function block change dynamically depending on how the block is used (i.e., whether it is called from a function).

               

              Last edit: Hiroki Motoyoshi 2025-05-17
              • Ethan Merritt

                Ethan Merritt - 2025-05-17

                We crossed paths. I came up with a different option 3 and started coding it. I think it's simpler than yours although I haven't yet convinced myself it is 100% foolproof.

                • add a field at->recursion_depth, initialized to zero when the function structure is allocated
                • every time the function is invoked by execute_at increment the depth before stepping through the action table and decrement it afterwards.
                • the only place that actually looks at this field is define. If there is already a function with that name and its depth counter is > 0, then issue an error rather than continuing with the definition.

                This works for the two sample cases you provided.
                It still needs a cleanup routine that resets the flag for all defined functions if there is an error, since the error might have occurred during function evaluation.
                edit: added the cleanup function

                 

                Last edit: Ethan Merritt 2025-05-17
                • Hiroki Motoyoshi

                  Thank you.
                  Your approach is preferable because it uses only the function itself, independent of the function block.

                  I’ve also found that redefining or undefining an array within a function block in the array initialization causes the segmentation error (accessing freed memory) with a similar logic. However, this might need to be reported as a separate ticket.

                   

                  Last edit: Hiroki Motoyoshi 2025-05-17
  • Ethan Merritt

    Ethan Merritt - 2025-06-04
    • Status: open --> closed-fixed
     

Log in to post a comment.

Want the latest updates on software, tech news, and AI?
Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.