Menu

Numerical parameters in User Defined Functions

2014-12-03
2020-06-28
  • László Erdős

    László Erdős - 2014-12-03

    Hi,

    If I use numerical parameters in an UDF, then I get wrong values. See this example.

    :::cobolfree
    
     IDENTIFICATION DIVISION.
     PROGRAM-ID. testudf.
     AUTHOR.     Laszlo Erdos.
    
     ENVIRONMENT DIVISION.
     CONFIGURATION SECTION.
     REPOSITORY.
        FUNCTION J-SETSIZE.
    
     DATA DIVISION.
    
     WORKING-STORAGE SECTION.
    *> function return value 
     01 WS-RET                             BINARY-INT.
    
    *> GUI elements
     01 WS-BUTTON                          BINARY-INT.
    
    *> function args 
     01 WS-WIDTH                           BINARY-INT.
     01 WS-HEIGHT                          BINARY-INT.
    
     PROCEDURE DIVISION.
    
    *>------------------------------------------------------------------------------
     MAIN-TESTUDF SECTION.
    *>------------------------------------------------------------------------------
    
        MOVE 1 TO WS-BUTTON
    
        DISPLAY "first call with variables"    
        MOVE 80 TO WS-WIDTH
        MOVE 20 TO WS-HEIGHT
        MOVE J-SETSIZE(WS-BUTTON, WS-WIDTH, WS-HEIGHT) TO WS-RET
    
        DISPLAY "second call without variables"    
        MOVE J-SETSIZE(WS-BUTTON, 80, 20) TO WS-RET
    
        STOP RUN
    
        .
     MAIN-TESTUDF-EX.
        EXIT.
     END PROGRAM testudf.
    
    
     IDENTIFICATION DIVISION.
     FUNCTION-ID. J-SETSIZE.
     AUTHOR.      Laszlo Erdos.
    
     ENVIRONMENT DIVISION.
     DATA DIVISION.
     WORKING-STORAGE SECTION.
    
     LINKAGE SECTION.
     01 LNK-ARG-0                          BINARY-INT.
     01 LNK-ARG-1                          BINARY-INT.
     01 LNK-ARG-2                          BINARY-INT.
     01 LNK-RET                            BINARY-INT.
    
     PROCEDURE DIVISION USING BY VALUE     LNK-ARG-0
                              BY VALUE     LNK-ARG-1
                              BY VALUE     LNK-ARG-2
                        RETURNING          LNK-RET.
    
     MAIN-J-SETSIZE SECTION.
    
        DISPLAY "LNK-ARG-0: " LNK-ARG-0
        DISPLAY "LNK-ARG-1: " LNK-ARG-1
        DISPLAY "LNK-ARG-2: " LNK-ARG-2
    
    *>    This is a C function, therefore we need "BY VALUE"
    *>    CALL "java_setsize" 
    *>         USING BY VALUE LNK-ARG-0
    *>               BY VALUE LNK-ARG-1
    *>               BY VALUE LNK-ARG-2
    *>         RETURNING OMITTED 
    *>    END-CALL 
    
        MOVE ZEROES TO LNK-RET
    
        GOBACK
    
        .
     MAIN-J-SETSIZE-EX.
        EXIT.
     END FUNCTION J-SETSIZE.
    

    Compile:

    cobc -x -free testudf.cob setsize.cob

    Start:

    $ ./testudf.exe
    first call with variables
    LNK-ARG-0: +0000000001
    LNK-ARG-1: +0000000080
    LNK-ARG-2: +0000000020
    second call without variables
    LNK-ARG-0: +0000000001
    LNK-ARG-1: +0000012344
    LNK-ARG-2: +0000012338

    Question 1: Is there a problem by parameter passing?

    Question 2: This function gives no value back, but I have to define a return variable.
    I would like to write in my program only this:

    "J-SETSIZE(WS-BUTTON, 80, 20)"

    and not this:

    "MOVE J-SETSIZE(WS-BUTTON, 80, 20) TO WS-RET"

    Is it possible?

    I use GnuCOBOL with cygwin (64 bit Windows).

    $ cobc -version
    cobc (GNU Cobol) 2.0.0
    Copyright (C) 2001,2002,2003,2004,2005,2006,2007 Keisuke Nishida
    Copyright (C) 2006-2012 Roger While
    Copyright (C) 2009,2010,2012,2014 Simon Sobisch
    This is free software; see the source for copying conditions. There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
    Built Dec 03 2014 09:52:44
    Packaged Jan 20 2014 07:40:53 UTC
    C version "4.8.3"

    Thx - László

     
  • László Erdős

    László Erdős - 2014-12-05

    Hi,

    Function Argument: "An identifier, a literal, or an arithmetic expression that specifies a value to be used in the evaluation of a function."

    So it can be also an arithmetic expression. I think, a numeric value is also an arithmetic expression.

    Can anybody confirm that?

    Thx - László

     
  • Bill Woodger

    Bill Woodger - 2014-12-05

    I'm wondering if there is something with a post of Simon's from a few weeks ago.

    The BINARY-INT seems to be getting storage allowing 10 digits (from the result of the DISPLAY).

    A literal number should count as an arithmetic expression, yes.

    When your literal number is "passed" to the CALLed program, I don't think the size of the literal is mapping correctly to the BINARY-INT.

    Perhaps have a go with BINARY and a PIC 9(4) instead? Also, what is the LENGTH OF/FUNCTION LENGTH of BINARY-INT?

    I suspect the literal is in two bytes, the BINARY-INT is at least four (from the number of digits in the DISPLAY) and could be even longer!

     
  • László Erdős

    László Erdős - 2014-12-05

    Thank you for the Answer. A BINARY-INT is 4 bytes.

    I changed all data types, here are some results.

    BINARY-INT: not OK

    $ ./testudf.exe
    first call with variables
    LNK-ARG-0: +0000000001
    LNK-ARG-1: +0000000080
    LNK-ARG-2: +0000000020
    second call without variables
    LNK-ARG-0: +0000000001
    LNK-ARG-1: +0000012344
    LNK-ARG-2: +0000012338
    length of WS-BUTTON: 4
    FUNCTION LENGTH: 000000004

    PIC 9(4): OK (but I need negative values for another functions later)

    $ ./testudf.exe
    first call with variables
    LNK-ARG-0: 0001
    LNK-ARG-1: 0080
    LNK-ARG-2: 0020
    second call without variables
    LNK-ARG-0: 0001
    LNK-ARG-1: 80
    LNK-ARG-2: 20
    length of WS-BUTTON: 4
    FUNCTION LENGTH: 000000004

    PIC S9(4): not OK

    $ ./testudf.exe
    first call with variables
    LNK-ARG-0: +0001
    LNK-ARG-1: +0080
    LNK-ARG-2: +0020
    second call without variables
    LNK-ARG-0: +0001

    Attempt to reference unallocated memory (Signal SIGSEGV)
    Abnormal termination - File contents may be incorrect
    LNK-ARG-1:

    PIC S9(8): not OK

    $ ./testudf.exe
    first call with variables
    LNK-ARG-0: +00000001
    LNK-ARG-1: +00000080
    LNK-ARG-2: +00000020
    second call without variables
    LNK-ARG-0: +00000001

    Attempt to reference unallocated memory (Signal SIGSEGV)
    Abnormal termination - File contents may be incorrect
    LNK-ARG-1:

    PIC S9(4) BINARY: not OK

    $ ./testudf.exe
    first call with variables
    LNK-ARG-0: +0
    LNK-ARG-1: +P0
    LNK-ARG-2: +0
    second call without variables
    LNK-ARG-0: +0

    Attempt to reference unallocated memory (Signal SIGSEGV)
    Abnormal termination - File contents may be incorrect
    LNK-ARG-1:

    I tried arithmetic expressions also, but I get errors. My cygwin terminal was freezen sometimes...

    László

     
  • László Erdős

    László Erdős - 2014-12-05

    I changed back to BINARY-INT, and I tried this:

    MOVE J-SETSIZE(WS-BUTTON, 0, 1) TO WS-RET

    Result:

    $ ./testudf.exe
    first call with variables
    LNK-ARG-0: +0000000001
    LNK-ARG-1: +0000000080
    LNK-ARG-2: +0000000020
    second call without variables
    LNK-ARG-0: +0000000001
    LNK-ARG-1: +0000000048
    LNK-ARG-2: +0000000049
    length of WS-BUTTON: 4
    FUNCTION LENGTH: 000000004

    So, 48 and 49 are the ASCII values of 0 and 1.

    There are also two problems with UDF.
    - wrong interpretation of numeric values.
    - can not process an arithmetic expressions.

    László

     
  • Bill Woodger

    Bill Woodger - 2014-12-05

    OK, good work, explains the first results. You're getting character 80 and 20 then interpreted as little-endian binary. The hex of those is x'3032' and x'3038', or 02 and 08, which are 80 and 20 going from left-to-right. I saw those yesterday, but not used to thinking in ASCII :-)

    It looks like there should be an implicit BY VALUE for the literal in the function call, which is presumably there in an intrinsic function.

    An intrinsic function can have an arithmetic expression, so I assume a UDF should as well, with the result again being passed implicitly BY VALUE.

    The other parameters for a UDF call should also be checked for being passed BY VALUE. Since the source fields are defined as BINARY-INT they may well be making it look like it "works". Perhaps change those in the caller to non-binary, and we then know if the rest of the UDF call is OK or not.

     
  • Robert W.Mills

    Robert W.Mills - 2014-12-11

    I agree with Bill in that you should try a different data-type.

    You say you need a signed number so why not try an 's9(4) comp' or if that's not long enough then an 's9(9) comp'. I have a UDF that returns a number indicating the start point in a string and it allways gives the correct value.

     
  • László Erdős

    László Erdős - 2014-12-11

    OK, I tried with 's9(9) comp':

    $ ./testudf.exe
    first call with variables
    LNK-ARG-0: +000000001
    LNK-ARG-1: +000000080
    LNK-ARG-2: +000000020
    second call without variables
    LNK-ARG-0: +000000001
    LNK-ARG-1: +942669824
    LNK-ARG-2: +842006528
    length of WS-BUTTON: 4
    FUNCTION LENGTH: 000000004

    It works with variables independent from types.

    But with numeric literal:

    "MOVE J-SETSIZE(WS-BUTTON, 80, 20) TO WS-RET"

    I get this:

    second call without variables
    LNK-ARG-0: +000000001
    LNK-ARG-1: +942669824
    LNK-ARG-2: +842006528

    With an arithmetic expression in UDF:

    "MOVE J-SETSIZE(WS-BUTTON, 80 + X, 20 + Y) TO WS-RET"

    I get a runtime error:

    Attempt to reference unallocated memory (Signal SIGSEGV)
    Abnormal termination - File contents may be incorrect

    László

     
  • Brian Tiffin

    Brian Tiffin - 2015-01-20

    This bug needs fixing. There is a constraint, due to spec, UDF arguments are always call by reference, modifiable values. Always, by spec. Kinda sad, but understandable. Given that constraint, literal values become a thing. Duck typing of numeric literals in COBOL is, umm, not really conducive to foreign function interfacing, being usage display.

    Still it needs to be fixed, a fix that will likely correct the same problem with CALL by value stack frame setups. CALL has explicit (though also malfunctioning for literals) SIZE IS clauses for each argument.

    There is a shadow stack of cobol field structure accessible for CALL and UDF arguments from C. I'm leaning to believe that this is part of the problem. All parameters end up cast to size of pointer. Using missing bit fill, and usage display rules not synched with what GnuCOBOL or foreign library data needs.

    This is a repeat of a previous post, but I'm not sure it can be solely a compile time fix. The casting may have to be run-time casting, a conditional that may slow down CALL, more than s negligible amount. Burdened with selecting the proper casting during call frame setup, at runtime; a marshaling layer penalty for each and every CALL, or FUNCTION-ID.

    Guessing.

    Cheers,
    Brian

     

    Last edit: Brian Tiffin 2015-01-20
  • Anonymous

    Anonymous - 2016-09-09

    Perhaps GNU Cobol should support by-value function arguments as a language extension? FFI absolutely requires some solution (unless one wants to write C marshalling code – some languages do this).

     
    • Simon Sobisch

      Simon Sobisch - 2016-09-09

      We'll check this after 2.0 release and after the merge of rw-branch and debugger branch (likely needs some months) when the CALL interface is revised.

       
    • Brian Tiffin

      Brian Tiffin - 2016-09-11

      When we do, I'd actually volunteer to try and implement a by struct enhancement along with more robust and sane by value handling. The JSON based jq library begs out for call by struct. And jq is awesome.

      Have good, make well,
      Brian

       
  • Brian Tiffin

    Brian Tiffin - 2016-09-11

    Oh, and the spec was updated. User defined functions now support BY REFERENCE, BY VALUE, BY CONTENT phrases on arguments.

    I'm not sure of the timing, exactly, but it might have something to do with Roger pointing out that user defined and intrinsic functions only allowing BY REFERENCE in the Drafts, broke the commutative properties of ADD. But that would assume someone on the committee actually heard us complaining a few years back, so probably not a true suspicion. I don't have old drafts around to see when/if the change actually occurred or if it was just a misread oversight from years gone by.

    8.4.2.2, 8.4.2.2.3, 8.4.2.2.4.

    Just to point out the issue (which is no longer an issue when BY VALUE is allowed/used)

    ::cobolfree
    ADD f(a, b) g(a, b) GIVING answer
    

    could end up with a different result than

    ::cobolfree
    ADD g(a, b) f(a, b) GIVING answer
    

    depending on whether f() or g() changed a or b. A state of affairs that would likely send some chills down some necks.

    Oh, and one more thing. The problem turned out to be two fold. One, it has nothing to do with the reflective shadow values, it has/had to do with datatype promotion at the C level. The dynamic call nature of GnuCOBOL did not cast all arguments in C, so all integer types were promoted to int and all real types were promoted to double by C standard. Extra level of casting is now in place for CALL, but still needs work in functions. So 16 bits stay as 16 bits, etc.

    The other issue that still needs addressing is that function literals are passed to C as string literals. String literals have no explicit numeric type (1 2 4 or 8 bytes?), so the trick of moving through working storage for function literals is what fixes that part. Well that, and references to string literals aren't really numbers as one would assume them to be. We'll need to define some rules, and allow for casting literals with SIZE IS phrases when this eventually gets addressed.

    From some discussion with Zeev when he was having issues with PCRE integration in GnuCOBOL, IBM assumes a 32bit integer for numeric literals, but this is hard documentation to come by, and Zeev found out by doing experiments.

    Cheers,
    Brian

     

    Last edit: Brian Tiffin 2016-09-11
    • Bill Woodger

      Bill Woodger - 2016-09-11

      How do you find it with programs which look like they are using subscripting, but really they are using User-Defined Functions?

      BY REFERENCE for a function? Someone didn't think that through.

       
      • Brian Tiffin

        Brian Tiffin - 2016-09-11

        Well, it's fixed by spec now, or at least options exist. BY REFERENCE is still the default unless explictly told otherwise. I kinda wish I had old drafts lying around, as I might be fibbing about when/if this changed, and it might just be Roger (and I) had misread things. I complained on LinkedIn, in a group that Donald Nelson reads from time to time, and no one mentioned that we had the wrong impression, but that doesn't mean we didn't. ;-)

        Cheers,
        Brian

         
    • Simon Sobisch

      Simon Sobisch - 2016-09-11

      The last 20xx draft has the "optional not by value" part in already but these are marked as change while UDF aren't marked as change. I'll recheck the 2002 standard for UDF and assume this is "only BY REFERENCE".

       
  • Marty Heyman

    Marty Heyman - 2020-06-28

    Isn't the intention of his user FUNCTION really the same as CALL with a RETURNING clause?

     
    • Simon Sobisch

      Simon Sobisch - 2020-06-28

      No, that was only a minimal example. One of the strengths of UDF is that you can use the return value directly in any expression (like conditions or computations), for CALL ... RETURNING you always need to define a variable to hold the result, and when you want to compare the results of multiple items then it gets dirty quite fast - especially when you compare with EVALUATE ... WHEN FUNC-A ... WHEN FUNC-B (a, b) ... which also highlights one other benefit - you can use these in any place.

       
    • Ralph Linkletter

      So with my EBCDIC conversion I found a method to evaluate comparative expressions "in place"
      (Simon helped out)
      IF FUNCTION FXASCII (P1X (1:2)) > '00'
      Where: P1X is a character data element populated with EBCDIC values.
      Prior to having written the UDF function "FXASCII" it was required that I must:
      Convert P1X to ASCII
      Execute the compartive expression
      Restore P1X to EBCDIC

      Using a UDF simplified the requirement to convert EBCDIC data values by evaluating the expression "in place".

      All that glitters is not gold :-)
      There is an edict from some governing body that requires allocating huge amounts of data (something like 99,000 bytes or 1 GB) per invocation of a UDF that utilizes "Any Length" / Returning syntax.

      As implemented in 3.1 Dev memory is allocated as if an arithmetic progression is in effect.
      Hence 'Any Length' or perhaps any parameter indicating a return value diminishes the value of the UDF since the GetMain occurs with every execution.

      Given 100 or so repetitve executions of the UDF, memory grew to over 1 GB.
      I tried a "CANCEL" after each invocation - made no difference - memory allocation continued to increase.

      Another concern is portability of UDF(s) to other COBOL compilers.
      I do not think there is a standard implementation.
      Most UDF discussions contemplate SQL.

      Ralph

       

      Last edit: Ralph Linkletter 2020-06-28

Anonymous
Anonymous

Add attachments
Cancel