#899 the compiler's lack of USING phrase check for the PROC.DIV. leads to a segfault

Group: invalid
Status: not-our-bug
Priority: 5 - default
Updated: 2024-03-17
Created: 2023-07-16
Creator: meerkut
Private: No

The absence of the phrase ‘USING ...’ on the PROCEDURE DIVISION header does not produce a compilation error, but running the program results in a 'Segmentation fault'.

Source:

      * The absence of the phrase ‘USING V1’ (1) appended to
      * PROCEDURE DIVISION does not produce a compilation error,
      * but leads to a segfault.
       IDENTIFICATION DIVISION.
       PROGRAM-ID.          L0.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       77  V1 PICTURE S9.
       PROCEDURE DIVISION.
       MAIN.
           MOVE 7 TO V1.
           CALL "N1" USING V1.
           STOP RUN.
      *
      *--======nested=======--*
       IDENTIFICATION DIVISION.
       PROGRAM-ID.          N1.
       ENVIRONMENT DIVISION.
       DATA DIVISION.
       LINKAGE SECTION.
       77  V1 PICTURE S9.
      *PROCEDURE DIVISION USING V1.
       PROCEDURE DIVISION.
      *                 ^------------------ (1)
       MAIN.
           MOVE 5 TO V1.
       END PROGRAM N1.
       END PROGRAM L0.

checked with:
- 3.1.2.0
- 3.2-rc2.0 r5127

.---
Best regards

Discussion

  • Simon Sobisch

    Simon Sobisch - 2023-07-16

    That's no bug: you can assign addresses to LINKAGE SECTION variables in several ways; CALL USING is just the most common one.

    Compile with -fec=all or --debug and you get a nice runtime error instead of a memory error.

     
    • meerkut

      meerkut - 2023-07-16

      Simon,
      when a bug can be identified without needing to run the compiled program, and a static examination of the code is sufficient, it indicates a compiler failure.

       
  • Simon Sobisch

    Simon Sobisch - 2023-07-16
    • labels: --> cobc, syntax
    • status: open --> not-our-bug
    • assigned_to: Simon Sobisch
    • Group: GC 3.x --> invalid
     
  • Simon Sobisch

    Simon Sobisch - 2023-07-16

    That's a source bug, not a compiler bug.

    I do agree that it is good when the compiler helps you at compile and/or runtime (but of course, that may lead to false-positives or decreased performance) - if you see a useful thing missing, then please create a feature-request for this (but check the warning options and runtime checks first).

    Let's have a check, using GC 3.2 (older versions will not include the stack trace on abort and won't show the context/warning level on diagnostic output):

    # test 1: "as is"
    $> cobc bad.cob -xj
    
    attempt to reference invalid memory address (signal)
    
    
    Last statement of "N1" unknown
    Last statement of "L0" unknown
     Started by ./bad
    Segmentation fault
    
    # test 2: "runtime checks"
    $> cobc bad.cob --debug -xj
    libcob: bad.cob:26: error: BASED/LINKAGE item 'V1' has NULL address
    
     Last statement of "N1" was MOVE
            MAIN at bad.cob:26
            ENTRY N1 at bad.cob:25
     Last statement of "L0" was CALL
            MAIN at bad.cob:12
            ENTRY L0 at bad.cob:10
     Started by ./bad
    
    # test 3: "compiler warnings"
    $> cobc bad.cob -Wextra
    bad.cob: in paragraph 'MAIN':
    bad.cob:12: warning: CALL statement not terminated by END-CALL [-Wterminator]
       10 |        MAIN.
       11 |            MOVE 7 TO V1.
       12 >            CALL "N1" USING V1.
       13 |            STOP RUN.
       14 |       *
    bad.cob:21: warning: LINKAGE item 'V1' is not a PROCEDURE USING parameter [-Wlinkage]
       19 |        DATA DIVISION.
       20 |        LINKAGE SECTION.
       21 >        77  V1 PICTURE S9.
       22 |       *PROCEDURE DIVISION USING V1.
       23 |        PROCEDURE DIVISION.
    

    I'd say that's a really good result (note: if using -Wextra you'll likely suppress some warnings after the initial check, like -Wno-terminator in this case [a warning which comes in handy if you have unexpected code flow]).

    Note: you may want to post to "help getting started" instead of the issue tracker when you are not 99+% sure that it is a bug, as that forum is read and answered by more people.

     

    Last edit: Simon Sobisch 2023-07-16
    • meerkut

      meerkut - 2023-07-17

      Let us demonstrate that statement A: "the phrase 'USING V1' is optional in the syntax of the program for gnuCOBOL" is incorrect.

      1. Assuming the phrase 'USING V1' is optional, similar to words like IS or BY, the gnuCOBOL compiler should consistently assign a valid V1 address based solely on the information in the "LINKAGE" section.
      2. Moreover, if the address of V1 is assigned, it must not be NULL.
      3. However, the compilation result indicates that the address of V1 is NULL.

      If we do not negate 1, 2, and 3, then statement (A) contradicts the result of the experiment, which is the compilation, thus refuting what was intended to be shown.
      Q.E.D. or
      "which was to be demonstrated."

      Consequently, from the above proof, we can deduce the following:
      a) The gnuCOBOL compiler considers the 'USING V1' clause to be mandatory in the syntax.
      b) The compiler does not allocate a valid address and location for V1 from the "LINKAGE" section until it encounters the phrase 'USING V1' in the PROCEDURE section.
      c) Furthermore, although the compiler comes across a reference to address V1, it does not verify the validity of this address. This behavior is reasonable if the preceding code ensures the allocation of an address and location for V1 from the "LINKAGE" section.

      However, point (b) indicates an error in the logic of the compiler. The compiler must always allocate memory at point (b) and then, at point (c), verify the presence of the phrase 'USING V1' as mandatory in the syntax (as proven above). If the syntax is not followed, it should raise an error and refuse to compile the program.

      Simon> That's a source bug, not a compiler bug.

      quote from a book from COBOL's youth:
      "The program should cater for all invalid or unexpected input and produce appropriate and informative error messages before irreparable damage is done. Reliance should not be placed on invalid output being detected after the run. 'Garbage in, garbage out' is not an acceptable philosophy."
      "A Structured Programming Approach to Data", Derek Coleman (auth.), Springer-Verlag New York, Year: 1979, p.3

      .---
      Best regards

       
      • Vincent (Bryan) Coen

        On 17/07/2023 19:16, meerkut wrote:

        Simon> That's a source bug, not a compiler bug.
        

        quote from a book from COBOL's youth:
        "The program should cater for all invalid or unexpected input and
        produce appropriate and informative error messages before irreparable
        damage is done. Reliance should not be placed on invalid output being
        detected after the run. 'Garbage in, garbage out' is not an acceptable
        philosophy."
        "A Structured Programming Approach to Data", Derek Coleman (auth.),
        Springer-Verlag New York, Year: 1979, p.3

        That is hardly anywhere near the youth of COBOL, which was developed in 1958-59.

         
      • Simon Sobisch

        Simon Sobisch - 2023-07-17

        On 17.07.2023 at 20:16, meerkut wrote:

        Let us demonstrate that statement A: "the phrase 'USING V1' is optional in the syntax of the program for gnuCOBOL" is incorrect.

        1. Assuming the phrase 'USING V1' is optional, similar to words like IS or BY, the gnuCOBOL compiler should consistently assign a valid V1 address based solely on the information in the "LINKAGE" section.
        2. Moreover, if the address of V1 is assigned, it must not be NULL.
        3. However, the compilation result indicates that the address of V1 is NULL.

        If we do not negate 1, 2, and 3, then statement (A) contradicts the result of the experiment, which is the compilation, thus refuting what was intended to be shown.
        Q.E.D. or
        "which was to be demonstrated."

        Consequently, from the above proof, we can deduce the following:
        a) The gnuCOBOL compiler considers the 'USING V1' clause to be mandatory in the syntax.
        b) The compiler does not allocate a valid address and location for V1 from the "LINKAGE" section until it encounters the phrase 'USING V1' in the PROCEDURE section.
        c) Furthermore, although the compiler comes across a reference to address V1, it does not verify the validity of this address. This behavior is reasonable if the preceding code ensures the allocation of an address and location for V1 from the "LINKAGE" section.

        However, point (b) indicates an error in the logic of the compiler. The compiler must always allocate memory at point (b) and then, at point (c), verify the presence of the phrase 'USING V1' as mandatory in the syntax (as proven above). If the syntax is not followed, it should raise an error and refuse to compile the program.

        No. LINKAGE just says "this is the structure of a variable I may provide
        memory for at any time later".

        You can do so by:
        * PROCEDURE DIVISION USING (in this - most common - case, it gets the
        address of what the caller passed in this position; which may include
        the cases "nothing", OMITTED, a variable that also has no address, a
        variable that has not enough space or a totally different USAGE)
        * ENTRY xyz USING (similar to the first)
        * SET ADDRESS OF V1 TO ... (no change for the data size or content either)
        * ALLOCATE V1

        So "in general" it is no problem that a field in LINKAGE is not in
        PROCEDURE DIVISION USING.

        If you don't have it, then GnuCOBOL ensures that its address is
        NULL; you can check this at any time in plain COBOL yourself, and you
        can also tell the compiler to include checks for you (-fec=..., included
        in --debug).
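
        A minimal sketch of such a plain-COBOL check, assuming the N1 program from this ticket is compiled on its own. FALLBACK-V1 and the DISPLAY are hypothetical additions for illustration and do not appear anywhere in the thread; the guard simply gives V1 storage via SET ADDRESS OF when no caller supplied any:

        ```cobol
               IDENTIFICATION DIVISION.
               PROGRAM-ID.          N1.
               DATA DIVISION.
               WORKING-STORAGE SECTION.
              * Hypothetical fallback storage, not part of the report.
               77  FALLBACK-V1 PICTURE S9.
               LINKAGE SECTION.
               77  V1 PICTURE S9.
               PROCEDURE DIVISION.
               MAIN.
              * V1 is not a USING parameter here, so its address
              * is expected to be NULL until we assign one.
                   IF ADDRESS OF V1 = NULL
                      SET ADDRESS OF V1 TO ADDRESS OF FALLBACK-V1
                   END-IF
              * From here on V1 refers to valid storage.
                   MOVE 5 TO V1
                   DISPLAY "V1 = " V1
                   GOBACK.
        ```

        Whether to fall back to local storage or to report an error and return is a design choice for the program author; the point is only that the address can be tested before the item is referenced.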

        Simon> That's a source bug, not a compiler bug.

        quote from a book from COBOL's youth:
        "The program should cater for all invalid or unexpected input and produce appropriate and informative error messages before irreparable damage is done. Reliance should not be placed on invalid output being detected after the run. 'Garbage in, garbage out' is not an acceptable philosophy."
        "A Structured Programming Approach to Data", Derek Coleman (auth.), Springer-Verlag New York, Year: 1979, p.3

        This actually tells the COBOL programmer to not assume valid input
        anywhere (ACCEPT, READ, maybe not even CALL); this is not about the
        COBOL runtime or compiler :-)

        Simon

         
        • meerkut

          meerkut - 2023-07-18

          Simon>

          • PROCEDURE DIVISION USING (in this - most common - case, it gets the
            address of what the caller passed in this position; which may include
            the cases "nothing", OMITTED, a variable that also has no address, a
            variable that has not enough space or a totally different USAGE)
          • ENTRY xyz USING (similar to the first)
          • SET ADDRESS OF V1 TO ... (no change for the data size or content either)
          • ALLOCATE V1

          So "in general" it is no problem that a field in LINKAGE is not in
          PROCEDURE DIVISION USING.

          Absolutely not! By incorporating these additional inputs, we are unnecessarily broadening the scope of the original thesis. Let's stay focused on our initial point. The thesis is, in fact, remarkably narrow and straightforward:

          Why does the following program compile at all?

                 IDENTIFICATION DIVISION.
                 PROGRAM-ID.          N1.
                 ENVIRONMENT DIVISION.
                 DATA DIVISION.
                 LINKAGE SECTION.
                 77  V1 PICTURE S9.
                 PROCEDURE DIVISION.
                 MAIN.
                     MOVE 5 TO V1.
                     STOP RUN.
          

          If we acknowledge that:
          a) the compiler possesses all the necessary data about operators using address access to the variable V1, long before the compilation stage into a finished program;
          b) the compiler is obliged to halt its operation once there is evidence, even at the syntactic parsing stage, that the compiled program will result in erroneous memory access and cause a segfault;
          c) our intention is to act in the user's best interest and avoid any harm,

          then the compiler should:
          1. refrain from proceeding to the stage of compiling the finished program (point c);
          2. provide the user with an informative message.

          Nevertheless, it seems that you continue to advocate for the compilation of the source code into a finished program, even though there are clear indications of invalid memory accesses that will result in a segfault. Although the segfault may not occur immediately, it is likely to manifest after a week of running the program under production conditions.

          Then what about point (c)?

           
          • Simon Sobisch

            Simon Sobisch - 2023-07-18

            On 18.07.2023 at 13:47, meerkut wrote:

            The thesis is, in fact, remarkably narrow and straightforward:

            Why does the following program compile at all?

            ```cobol
                   IDENTIFICATION DIVISION.
                   PROGRAM-ID.          N1.
                   ENVIRONMENT DIVISION.
                   DATA DIVISION.
                   LINKAGE SECTION.
                   77  V1 PICTURE S9.
                   PROCEDURE DIVISION.
                   MAIN.
                       MOVE 5 TO V1.
                       STOP RUN.
            ```

            I'm not sure that this was the original thesis, but of course, let's only check this one.

            If we acknowledge that:
            a) the compiler possesses all the necessary data about operators using address access to the variable V1, long before the compilation stage into a finished program;

            yes

            b) the compiler is obliged to halt its operation once there is evidence, even at the syntactic parsing stage, that the compiled program will result in erroneous memory access and cause a segfault;

            no, the standard makes it explicit that the compiler is not obliged to do those checks, and it does not require the compiler to abort the compile if it sees an issue

            c) our intention is to act in the user's best interest and avoid any harm,

            yes; but we also have to follow the COBOL rules and not interfere with undefined behaviour. That is, for example, one reason we cannot enable runtime checks by default and instead teach users to use --debug until there is a real reason not to (note: this will also make "working" programs abort where they would otherwise do "something" that may have no effect in the end, or may even be what people intended, or at least what they tested with).

            then the compiler should:
            1. refrain from proceeding to the stage of compiling the finished program (point c);

            no; the compiler currently has no control flow check, so it does not know whether any statement that references V1 will ever be executed; and even then someone will say "but I can jump over that with the debugger...", or the program may never reach that ENTRY point at all

            2. provide the user with an informative message.

            definitely yes; in an ideal world GnuCOBOL would have the control flow knowledge and then raise errors there by default (patches very welcome). As this information is missing, it should still warn the user that V1 never gets an address assigned (in this specific program) but is still referenced; that would be a warning enabled by default (if a user wants that, any warning can be made an error with -Werror, for example -Werror=linkage).

            Nevertheless, it seems that you continue to advocate for the compilation of the source code into a finished program, even though there are clear indications of invalid memory accesses that will result in a segfault. Although the segfault may not occur immediately, it is likely to manifest after a week of running the program under production conditions.

            ... or never (because programs are normally much bigger and more complex), so we still want to compile that by default

            Then what about point (c)?

            As noted: an FR to add a compile-time check whether LINKAGE and BASED items are ever allocated when either they or one of their child elements are referenced is totally fine. That check should actually be relatively easy to add; patches welcome.

            In any case that's no compiler bug.

            Also note that compiling with -Wlinkage (implied by -Wextra/-W) will do that check (but also raises false positives):

            bad.cob:21: warning: LINKAGE item 'V1' is not a PROCEDURE USING parameter [-Wlinkage]
                19 |        DATA DIVISION.
                20 |        LINKAGE SECTION.
                21 >        77  V1 PICTURE S9.
                22 |       *PROCEDURE DIVISION USING V1.
                23 |        PROCEDURE DIVISION.
            

            ... and the compiler already helps with the relevant checks (as specified by the standard) if you compile with -fec=all (or better --debug, which implies the former, or only the specific -fec option for that check).

            Simon

             

            Last edit: Simon Sobisch 2023-07-18
  • Vince Esparza

    Vince Esparza - 2024-03-17

    This is a user error. I am certain it is in the COBOL Standards, but see IBM's COBOL Language Reference manual:
    https://www.ibm.com/docs/en/cobol-zos/6.4?topic=statements-call-statement
    USING phrase

    The USING phrase specifies arguments that are passed to the target program.

    Include the USING phrase in the CALL statement only if there is a USING phrase in the PROCEDURE DIVISION header or the ENTRY statement through which the called program is run.

    If you check out the PROCEDURE DIVISION header description you will see more supporting evidence that if one uses a CALL USING there MUST BE a corresponding PROCEDURE DIVISION USING matching in order and size for passed parameters. This is true for both a PROCEDURE DIVISION and an ENTRY statement.
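
    As an illustration of that rule, here is a sketch of the program from the report with the commented-out `PROCEDURE DIVISION USING V1.` header restored, so the CALL USING in L0 has a matching USING in N1. The END-CALL, the DISPLAY and the GOBACK are additions made for this sketch and are not in the original source:

    ```cobol
           IDENTIFICATION DIVISION.
           PROGRAM-ID.          L0.
           DATA DIVISION.
           WORKING-STORAGE SECTION.
           77  V1 PICTURE S9.
           PROCEDURE DIVISION.
           MAIN.
               MOVE 7 TO V1
               CALL "N1" USING V1
               END-CALL
          * N1 changed the caller's V1 through the USING parameter.
               DISPLAY "V1 after CALL: " V1
               STOP RUN.
          *
          *--======nested=======--*
           IDENTIFICATION DIVISION.
           PROGRAM-ID.          N1.
           DATA DIVISION.
           LINKAGE SECTION.
           77  V1 PICTURE S9.
          * The USING phrase ties the LINKAGE item to the argument
          * passed by the caller in the same position.
           PROCEDURE DIVISION USING V1.
           MAIN.
               MOVE 5 TO V1
               GOBACK.
           END PROGRAM N1.
           END PROGRAM L0.
    ```

    With the header restored, V1 in N1 refers to the storage of V1 in L0, so the MOVE in N1 no longer references a NULL address.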

     
  • Vince Esparza

    Vince Esparza - 2024-03-17

    Oh, one more thing: despite this being a nested program, to the compiler they are actually two separate programs (I know that from my years at IBM and Micro Focus). I am not aware of any compiler that can tie together multiple programs during its syntax and semantics phases. There comes a point where one has to follow the standards. A terrible example is valid numeric data: at one point IBM flatly stated '...it is your responsibility to ensure valid data ...'. They have since reworded their manuals to give the same message in a less forceful way. If you code a CALL USING, then the called program must have a USING coded at its entry point (main or subordinate).

     
  • Simon Sobisch

    Simon Sobisch - 2024-03-17

    Hi Vince. Thank you for your input. As noted above, the bug report is closed as "not our bug".
    The point of "why can't the compiler detect that" is fine: we would abort the compilation if we could deduce that the address is never set and that the code is always executed. But apart from the most basic cases, that is commonly hard to recognize, as we don't have the full set of possible execution flows.
    The addition of a compiler warning that's on by default is something that should be quite easy; patches welcome.

     
  • Vincent (Bryan) Coen

    Just a wee comment regarding :

    On new developments for any project, when getting ready for testing, it is seriously recommended to compile with -d -g -fdump=ALL so that ALL runtime checks are carried out.

    If you are experiencing an abort and do not know where in the program it happened, also add -ftraceall AND set the environment variables:

    COB_SET_TRACE=1
    and
    COB_TRACE_FILE=${HOME}/trace.log

    This will put the tracing log file in your home folder.

    For my systems I have these set in .bashrc as :

    export COB_SET_TRACE=1
    export COB_TRACE_FILE=${HOME}/trace.log
    export COBCPY=~/cobolsrc/ACAS/copybooks
    export COB_SCREEN_ESC=YES
    export COB_SCREEN_EXCEPTIONS=YES
    export COB_LIBRARY_PATH=~/bin

    along with any others you need such as path, i.e.,
    export PATH=~/bin:.:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/games:/usr/lib64/qt5/bin:/home/vince/bin:/home/vince/.local/bin:/home/vince/mvs380/hercules/linux/64/bin

    Here your paths WILL differ in some areas.

    If you compile the program without -ftraceall and/or -d -g, the above settings will not matter and will be ignored.

    Your instance of missing linkage elements should now indicate the cause of the problem, although not always with a clear indication; the effect of -fdump=ALL will show it when looking closely at the output.

    You cannot expect the compiler to read your mind when calling another module as to what is (or not) included in your calls but there are other switches for cobc that can help so use cobc -h

    These are actually listed in the Programmers Reference or Guide manuals, but cobc -h always shows what is applicable for your version of the compiler, as the GC developers do not always provide updates on changes in this area and therefore updates do not get into the docs as quickly as perhaps they should.

    You should have up to date copies of these manuals as .pdf files although for experienced Cobol programmers the Guide is a bit of teaching you how to suck, etc.

    The Programmers Reference will have most of this teaching stuff removed as soon as I have time to do so but does involve some time to do.

     