Menu

ExprEngine: Test the capabilities

2019-09-23
2019-09-25
  • Daniel Marjamäki

    I wonder if anybody is interested to test out the ExprEngine.

    As a first step ... I want it to handle all types of data and expressions in C code.... pointers, arrays, structs, unions, aliases, dynamic memory, ....

    For now, I do not try to handle control keywords well. for/while/switch/if/break/continue/goto. Please wait with testing that.

    what missing handling can you find? I would think there is still a lot to fix. Please wait with C++ stuff.. we can fix that in the next step.

    Here is an example that you can start with:

    void f(int x)
    {
        ;;
    }
    

    Cppcheck output:

    $ ./cppcheck --verify 1.c
    Checking 1.c ...
    2:1: { x=($1,-2147483648:2147483647)}
    3:5: { x=($1,-2147483648:2147483647)}
    

    In this output ... the "program state" is written at line 2 and 3.

     

    Last edit: Daniel Marjamäki 2019-09-23
  • Daniel Marjamäki

    My goal is to have symbolic analysis in ExprEngine ... you can for instance try stuff like:

    void f(int x)
    {
        int buf[10];
        buf[x] = 0;
        int y = buf[x];  // this is 0
        int z = buf[x+1];  // this is uninitialized
        ;;
    }
    

    Elements in buf should be either uninitialized or 0.
    y should be 0.
    z should be uninitialized.

    The current output says:

    6:21: { x=($1,-2147483648:2147483647) buf=($2,[?,?,?,?,?,?,?,?,?,?]) y=(null) z=(null)}
    

    It says that all elements in buf are uninitialized. and y and z did not get any value. So this does not work fully.

     
  • versat

    versat - 2019-09-23
    void f(unsigned char x)
    {
        x = 258;
    }
    

    output:

    $ ./cppcheck --verify expreng.c
    Checking expreng.c ...
    2:1: { x=($1,0:255)}
    3:12: { x=258}
    

    I would expect x to be 2 at 3 since it is an unsigned char.

     

    Last edit: versat 2019-09-23
  • versat

    versat - 2019-09-23
    void f(void * p)
    {
        p = NULL;
        p += 1;
    }
    

    results in:

    $ ./cppcheck --verify expreng.c
    Checking expreng.c ...
    expreng.c:4:7: error: Pointer addition with NULL pointer. [nullPointerArithmetic]
        p += 1;
          ^
    expreng.c:3:9: note: Assignment 'p=NULL', assigned value is 0
        p = NULL;
            ^
    expreng.c:4:5: note: Compound assignment '+=', assigned value is 1
        p += 1;
        ^
    expreng.c:4:7: note: Null pointer addition
        p += 1;
          ^
    Segmentation fault (core dumped)
    

    I would not expect segmentation fault to happen.

     

    Last edit: versat 2019-09-23
  • versat

    versat - 2019-09-23
    void f(unsigned long long x)
    {
        char a += x;
    }
    

    Output:

    $ ./cppcheck --verify expreng.c
    Checking expreng.c ...
    Segmentation fault (core dumped)
    

    I would not expect a segfault here too.

     
  • versat

    versat - 2019-09-23

    Since you explicitly wrote about testing dynamic memory i tried this:

    #include <stdlib.h>
    
    void f()
    {
        char * p = calloc(1, 1);
        if (p != NULL) {
            char c = *p + 1;
        }
        free(p);
    }
    

    Output:

    $ gcc -fsyntax-only expreng.c && ./cppcheck --verify expreng.c
    Checking expreng.c ...
    5:12: { p=(null)}
    5:12: p=TODO_NO_VALUE
    5:28: { p=(null)}
    5:28: p=TODO_NO_VALUE
    7:14: { p=(null)}
    7:14: { p=(null) c=(null)}
    7:14: c=TODO_NO_VALUE
    7:16: { p=(null)}
    7:16: { p=(null) c=(null)}
    7:16: c=TODO_NO_VALUE
    7:24: { p=(null)}
    7:24: { p=(null) c=(null)}
    7:24: c=TODO_NO_VALUE
    8:5: { p=(null) c=(null)}
    9:12: { p=(null) c=(null)}
    9:12: { p=(null)}
    

    I would expect c to be 0+1 or something similar.

     
  • Daniel Marjamäki

    thanks versat! I'll write down your examples and ensure we handle them asap!

     
  • Daniel Marjamäki

    .. that was exactly the kind of input I wanted to have. I hope cppcheck will do much better on the next iteration :-)

     
  • Daniel Marjamäki

    Your second example has a syntax error

       char a += x;
    

    Of course I need to handle that later.. and not crash! But for now, please run gcc -fsyntax-only on the file or something and ensure it is valid C code.

     

    Last edit: Daniel Marjamäki 2019-09-23
    • versat

      versat - 2019-09-23

      Oops, yes, that was not by intention. I added syntax check with the last example, too late.

       
  • versat

    versat - 2019-09-24

    With latest head ( https://github.com/danmar/cppcheck/commit/524c9f593699daa3c830ff553ba597c5ce099ce9 )
    C code:

    void f()
    {
        char str[] = "hello";
        str[0] = 'B';
    }
    

    Output:

    $ gcc -fsyntax-only expreng.c && ./cppcheck --verify expreng.c
    Checking expreng.c ...
    Segmentation fault (core dumped)
    

    Code is syntactically correct so i would not expect a segfault here.

     

    Last edit: versat 2019-09-24
    • versat

      versat - 2019-09-25

      This is fixed now and works like expected:

      $ gcc -fsyntax-only expreng.c && ./cppcheck --verify expreng.c
      Checking expreng.c ...
      3:25: { str=($1,[104,101,108,108,111,0])}
      4:17: { str=($1,[66,101,108,108,111,0])}
      

      If i slightly change the code the result changes but should be the same:

      void f()
      {
          char str[] = "hello";
          str[0+0] = 'B';
      }
      

      Output is:

      $ gcc -fsyntax-only expreng.c && ./cppcheck --verify expreng.c
      Checking expreng.c ...
      3:25: { str=($1,[104,101,108,108,111,0])}
      4:19: { str=($1,[104,101,108,108,111,0])}
      

      Should such calculation work or is planned to work later?

       
  • Daniel Marjamäki

    I added a temporary hack for calloc. Pretend that it does not return NULL for now. The handling of buffers/arrays need to be fixed a lot! So it is interesting to focus a little extra at calloc etc usage.

     
  • versat

    versat - 2019-09-25
    #include <stdlib.h>
    
    void f()
    {
        char str[9];
        const size_t sz = sizeof(str);
        str[sz - 1] = 0;
    }
    

    Output:

    $ gcc -fsyntax-only expreng.c && ./cppcheck --verify expreng.c
    Checking expreng.c ...
    5:16: { str=($1,[?,?,?,?,?,?,?,?,?])}
    6:34: { str=($1,[-128:127,-128:127,-128:127,-128:127,-128:127,-128:127,-128:127,-128:127,-128:127]) sz=($3,0:ffffffffffffffff)}
    7:20: { str=($1,[-128:127,-128:127,-128:127,-128:127,-128:127,-128:127,-128:127,-128:127,-128:127]) sz=($3,0:ffffffffffffffff)}
    

    Seems sizeof() is not implemented yet.
    The variable sz should have value 9 and the last element of the array str should be known to have the value 0.

     
    • Daniel Marjamäki

      hmm yes sizeof() is not handled in ExprEngine yet. It should.

      I am restructuring the array code to work better for dynamic arrays. The output will change...

      Imagine such code:

      void f(int x, int y)
      {
          char arr[x];
          arr[y] = 0;
      }
      

      The output should become something like:

        { x=$1 y=$2 arr=($3, size=$1, [:]=?, [$2]=0) }
      

      => the id for arr value is $3. The size is $1. Then comes assignments ... first all elements are uninitialized and then element $2 is zeroed.

      That tells you that arr[0] will be either ? or 0. The value of arr[$2] is 0. The value of arr[$4..] is ? or 0. The value of arr[expr<$2>] will be ?.

       
  • versat

    versat - 2019-09-25

    For this code i would expect that it is known that the array only contains 0 after the memset:

    #include <string.h>
    
    void f()
    {
        char str[4];
        memset(str, 0, 4);
    }
    

    But the values are still unknown:

    $ gcc -fsyntax-only expreng.c && ./cppcheck --verify expreng.c
    Checking expreng.c ...
    5:16: { str=($1,[?,?,?,?])}
    6:22: { str=($1,[-128:127,-128:127,-128:127,-128:127])}
    

    I am not sure if memset() should work and if it should be tested, but i tried it since it is very common.

     
    • Daniel Marjamäki

      I am not sure if memset() should work and if it should be tested, but i tried it since it is very common.

      The ExprEngine will need better information to have full analysis. I am thinking about adding some additional ExprEngine configuration.

      Maybe it's best that you don't use standard functions now to start with..

       
  • versat

    versat - 2019-09-25
    struct s
    {
        int a;
    };
    
    void f()
    {
        struct s s1;
        s1.a = 2;
        int x = s1.a + 1;
    }
    

    Output:

    $ gcc -fsyntax-only expreng.c && ./cppcheck --verify expreng.c
    Checking expreng.c ...
    10:9: { x=(null)}
    10:9: x=TODO_NO_VALUE
    10:21: { x=(null)}
    10:21: x=TODO_NO_VALUE
    

    In this simple example i guess it is safe to assume that s1.a is 2 and x is 3.
    There is no output for variables in the struct, are structs not implemented yet?

     
  • versat

    versat - 2019-09-25
    #include <stdlib.h>
    
    void f()
    {
        char * buf = calloc(2, 2);
        const size_t sz = sizeof(buf);
        char a = buf[0] / buf[2];
    }
    

    Output:

    $ gcc -fsyntax-only expreng.c && ./cppcheck --verify expreng.c
    Checking expreng.c ...
    expreng.c:7:21: error: Division by zero [verificationDivByZero]
        char a = buf[0] / buf[2];
                        ^
    5:30: { buf=($2,[0,0,0,0])}
    6:34: { buf=($2,[-128:127,-128:127,-128:127,-128:127]) sz=($4,0:ffffffffffffffff)}
    7:12: { buf=($2,[-128:127,-128:127,-128:127,-128:127]) sz=($4,0:ffffffffffffffff)}
    7:12: { buf=($2,[-128:127,-128:127,-128:127,-128:127]) sz=($4,0:ffffffffffffffff) a=($3)/($3)}
    7:29: { buf=($2,[-128:127,-128:127,-128:127,-128:127]) sz=($4,0:ffffffffffffffff)}
    7:29: { buf=($2,[-128:127,-128:127,-128:127,-128:127]) sz=($4,0:ffffffffffffffff) a=($3)/($3)}
    

    Interestingly Cppcheck already detects the division by zero, the expression engine shows unkwnon values.
    I guess that because sizeof() is not implemented in the expression engine it is assumed that it could change the content of the array.
    I would expect the value of sz and the array content to be known in this simple example.

     
  • Daniel Marjamäki

    Interestingly Cppcheck already detects the division by zero, the expression engine shows unkwnon values.

    That is by intention.. In general, the value -128:127 means that the value can be 0.

    please note that if you write buf[0] + buf[0] then the range will be -256:254. The value can be 0 but not 1. I intend to implement an algorithm for this that checks if a specific value in the range is mathematically possible.

     
  • Daniel Marjamäki

    I restructured the ArrayValue... now I need to figure out how to split the execution path in the middle of a statement somehow. If the code is :

    buf[5] = 14;
    int x = buf[y];
    

    then we need to split the programstate. one state where y is 5 and x will be 14. and one state where y is not 5 and x is something else..

    but anyway, the arrayvalue seems to be printed well now..

     

Log in to post a comment.

Want the latest updates on software, tech news, and AI?
Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.