Menu

Misra 10.1 false positive on stringstream shift operators

Ronny Soak
2020-08-17
2020-09-03
1 2 > >> (Page 1 of 2)
  • Ronny Soak

    Ronny Soak - 2020-08-17

    Hi,

    I'm putting this here before creating a ticket because this issue seems
    so common to me but despite searching

    existing tickets and discussions, I can find no mention about it. Surely
    I'm just not searching good enough.

    misra-c rule check 10.1 as currently implemented here

    https://github.com/danmar/cppcheck/blob/main/addons/misra.py#L1337

    gives false positives whenever the << (or >>) operator is not used as on
    arithmetic types but on

    stream types (and possibly when overloaded for custom classes).

    Currently the code just checks for the mere existence of a token named
    "<<"  and whenever the operands around it are unsigned (or a literal
    number).

    Obviously, putting a signed number to a stringstream is not a misra 10.1
    violation.

    Isn't the use of '<<' on strings/stringstreams/cout so common that this
    would have been noticed quite a while ago already?

    Am I using the check wrong?

    In my code this is currently fixed by checking if one of the operands
    next to it are string or char. This covers more cases but isn't perfect.

    There is a property called 'isArithmeticalOp' for the Tokens, but
    looking at
    https://github.com/danmar/cppcheck/blob/main/lib/token.cpp#L95 it seems
    it's not actually checking the base type and possible overloading of
    operands but just goes by a hard-coded list of operators.

    Anybody having a way of handling this better?

    thx,

    chaos

     
  • Georgiy Komarov

    Georgiy Komarov - 2020-08-17

    Hi,

    gives false positives whenever the << (or >>) operator is not used as on
    arithmetic types but on stream types (and possibly when overloaded for custom classes).

    Thanks for the report.

    misra.py addon is used for checking MISRA C 2012 compatibility for C90 and C99. C++ support is not provided.

    MISRA C++ (currently integrated with AUTOSAR) is a separate standard that is very different from MISRA C. It is not supported in Cppcheck.

     
  • Ronny Soak

    Ronny Soak - 2020-08-17

    I was aware that MISRA C only covers C-issues and not C++-issues, but I wasn't aware that this means that the input code needs to be free of any C++ language features.
    It's also not mentioned (to my own surprise) in neither the MISRA-C 2012 nor the MISRA-C++ 2008 standard.

    Seems I need to selectively either deactivate or modify the rules o fit my mixed C/C++ code.

     
  • Georgiy Komarov

    Georgiy Komarov - 2020-08-18

    I was aware that MISRA C only covers C-issues and not C++-issues, but I wasn't aware that this means that the input code needs to be free of any C++ language features.

    Yes, misra.py expects the C code as input. Moreover, some checks depends on version of C standard. This is because the MISRA rules check the use of standard library functions and identifiers, which differ in C90 and C99. Here is an example in the addon source: https://github.com/danmar/cppcheck/blob/b01de028667025f6f504a562692706f837d44b3f/addons/misra.py#L332-L341.

    Seems I need to selectively either deactivate or modify the rules o fit my mixed C/C++ code.

    This may work for some rules.

    Do you have any suggestions on how we could improve this situation?

     

    Last edit: Georgiy Komarov 2020-08-18
  • Ronny Soak

    Ronny Soak - 2020-08-23

    Do you have any suggestions on how we could improve this situation?

    I'm of course looking into implementing MISRA-C++ (or AUTOSAR C++) rules instead of MISRA-C rules. But that's a lot of effort and I can't really say if it woul handle mixed-language source code any better. I'm again supprised that it isn't an actual rule in either of those standards that forbids the use of language features from the other.

    My modifications are very specific at the moment. Eg. MISRA-C 2012 rule 14_8 forbids pointer arithmetic, so I had to add a check if the left-hand operand is a container, which would allow the += operation nevertheless as it's not arithmetic, but overloaded as 'append'.

    For 10_1 (shift operators with signed variables) I had to add an exception for stringstreams.
    Again, overloaded operators trip the current implementation of the rules. The needs to be a more general solution for these.

    Also, with no way of specifying classes (and enums and overloaded operators) in the .cfg files yet, I don't know if C++ rules can effectively been checked. I'm currently having better results with using<define name="Wrapper<uint>" value="uint"> in the .cfg to replace complicated types with rough equivalents of c types to get better checking. E.g. smth like class Wrapper<uint> would not receive a 'valueType-sign="unsigned"' attribute in the .dump files, while a uint of course does, which enables better checks.

     
  • Daniel Marjamäki

    I think it would make sense if misra.py would either disable rule 10.1 for C++ code entirely.. or if rule 10.1 would handle stream operators. Probably the first option is better because in the long run I envision that we want this to be an addon that target C files.

    I'm of course looking into implementing MISRA-C++ (or AUTOSAR C++) rules instead of MISRA-C rules. But that's a lot of effort and I can't really say if it woul handle mixed-language source code any better.

    Ah sounds good! I do not think you need to finish all the work yourself at once. If you create a new addon for misra-c++ and just have 2-3 rules in that then it's a start.

    My modifications are very specific at the moment.

    It sounds unfortunate to have C++ knowledge in the misra.py file. A clean bailout seems better.

     

    Last edit: Daniel Marjamäki 2020-08-23
  • Georgiy Komarov

    Georgiy Komarov - 2020-08-23

    if rule 10.1 would handle stream operators

    I don't think that's a good idea. There are probably many more places in misra.py that are incompatible with C++. If we declare support for C++, it will greatly complicate the code and test suite.

    It might be better to write in the documentation that misra.py only supports the C language? Or create warnings when user is trying to use misra.py with C++ dump in GUI and CLI interfaces?

    Again, MISRA C and MISRA/AUTOSAR C++ are different standards. I agree that it will be better to create new addon with C++ support. We can move common parts of misra.py and new misra-cpp.py to a separate Python module to reuse source code when it is possible.

     

    Last edit: Georgiy Komarov 2020-08-23
    • Daniel Marjamäki

      I don't think that's a good idea. There are probably many more places in misra.py that are incompatible with C++. If we declare support for C++, it will greatly complicate the code and test suite.

      Yes it depends. I have very little knowledge about MISRA-C++. But if we need to implement a similar rule for << in the C++ addon and handle stream operators etc then maybe that exact same rule can be reused for C code also.

      About the testing.. I think it makes sense to have separate misra-c2012 tests and c++-something tests. To ensure we do not introduce misra-c2012 regressions when working on c++.

      In the end I want that we have a misra-c2012 addon.. and if that does not work well for c++ code then that is no problem.. we should not complicate (damage?) that code to make it c++ friendly.

       

      Last edit: Daniel Marjamäki 2020-08-23
      • Daniel Marjamäki

        Again, MISRA C and MISRA/Autosar C++ are different standards. I agree that it will be better to create new addon with C++ support. We can move common parts of misra.py and new misra-cpp.py to a separate Python module to reuse source code when it is possible.

        Yes 👍

        I do not know what / how much is different. But that sounds good to me.

         

        Last edit: Daniel Marjamäki 2020-08-23
  • Georgiy Komarov

    Georgiy Komarov - 2020-08-23

    I'm not very familiar with MISRA C++ either. We can start by implementing a few simple rules, ideally, those that are already defined in the misra.py and they will work with C++ code. I am sure that later there will be users who will want to increase the number of supported rules and improve the new addon.

    @chaos-99 Can you tell us what rules from MISRA C you managed to use for the C++ code?
    And what standard do you use: MISRA C++ 2008 (supports up to C++03) or modern AUTOSAR C++ (C++14)? The MISRA+AUTOSAR standard with C++17 support is still not published yet.

     
    • Daniel Marjamäki

      it sounds interesting to start with an autosar c++ addon..
      as far as I understand the rules are public so we dont have to worry about legal.
      and I guess we can reuse rules for misra addons later.

       
      👍
      1
  • Ronny Soak

    Ronny Soak - 2020-08-23

    The 10_1 rule in Misra-C 2012 I modified like this:

    if skip_until_semicolon == True:
                if token.str == ';':
                    skip_until_semicolon = False
                continue
    // if next operand is literal string or char, it's most likely a stringstream operation
    if token.str in ('<<') and (token.next.isString or token.next.isChar):
                skip_until_semicolon = True
                continue
    // [...]
    if token.str in ('<<', '>>'):
                    if not isUnsignedType(e1) and not (token.astOperand1.isNumber and token.astOperand1.isInt and (int(token.astOperand1.str,0) >= 0)):
                        # for std::stream's the valueType is None, so we use that to avoid false errors.
                        if (t1.valueType != None):
                            reportError(token, 'Misra_2012_10_1', 'error')
    

    It's not perfect and certainly has both false positives and false negatives, but it has better coverage than the original.

    Of the 81 checks implemented in the offical addon misra.py we use 62 rules on our code. Not all of the missing once are not working, some we just didn't like ;)

    Some are slightly modified like the one above. As mentiond 18_4 was modified to exclude containers.

    The AUTOSAR rules can be found in a draft version here: https://www.autosar.org/fileadmin/user_upload/standards/adaptive/17-03/AUTOSAR_RS_CPP14Guidelines.pdf
    But I'm not sure what the legal status is. But the document only adds new rules and references MISRA C++ 2008 for older ones, so it's needed anyway. I have a licensed copy though. I'm seemingly not able to find an 'official' version on the AUTOSAR website.

    I would try to implement this as a python addon for sure, but currently there is not enough info in the dump files to analyse e.g. class hierarchy. Currently the type (=class) of a variable can only be found by the scope, which isn't even linked by unique id, but has to be searched for by name. Inheritance is currently not shown at all. I haven't seen structures to represent templates either. So this would also need some additions to the cppcheck source to get representations of these structures into the dump. Or else the addon would need to do cppcheks job again and analyse this from the raw tokens.

     
  • Daniel Marjamäki

    I would try to implement this as a python addon for sure, but currently there is not enough info in the dump files to analyse e.g. class hierarchy.

    yes it's likely we need to add more info.

    is there are a particular rule that is not possible to implement right now..

     
  • Georgiy Komarov

    Georgiy Komarov - 2020-08-24

    I think we can start with implementing something trivial. For example, rules M5-18-1 (the comma operator shall not be used) and M6-4-3 (a switch statement shall be well-formed) are exactly the same as R12.3 and R16.1 and could be added "as is".

    The only thing I don't understand yet is the legal status of AUTOSAR standard. The document from the message above marked as "AUTOSAR CONFIDENTIAL".

    Can we publish the texts of their rules in the public domain?

     
  • Ronny Soak

    Ronny Soak - 2020-08-24

    I've now started to list all rules, define if they are autosar-specific, references to MISRA-C++-2008 or variations of existing MISRA-C-2012 rules.

    I've tried to set up a test environment that runs the tests against test cases inspired by the examples in the standard.

    So here are the downpoints:
    - there is an insane number of rules, no idea how far I will get
    - so far I only found few rules that match those from MISRA-C (or at least those that have an implementation in misra.py)
    - I'm lacking a good reference for which of the rules might already be covered by base-cppcheck.
    - my setup is slightly different from the cppcheck-repo. My misra.py is heavily modified in structure. I test with a custom project, not with the makefile test target of cppcheck. So bringing this in a form that could become a pull request might take some extra-work.

    Should I give feedback in another form? Different thread here?

     
  • Daniel Marjamäki

    For information I have sent an email to AUTOSAR and asked about the license.

     
    👍
    1
  • Daniel Marjamäki

    Can we publish the texts of their rules in the public domain?

    As I read the disclaimer on page 2.. we can publish the texts if we only do it for informational purposes. I.e. it seems we could publish it on a static webpage. But all other purposes require a license.

    I really hope we can sign a license that allows us to distribute the rule texts directly in the addon!

     

    Last edit: Daniel Marjamäki 2020-08-24
  • Ronny Soak

    Ronny Soak - 2020-08-24

    Biggest problem so far: how to retrieve the type of the return value of member functions or overloaded operators (e.g. "[]") of classes. Either classes that are actually in the code or classes from libraries that need to be supplied via .cfg.

     
  • Daniel Marjamäki

    Biggest problem so far: how to retrieve the type of the return value of member functions or overloaded operators (e.g. "[]") of classes. Either classes that are actually in the code or classes from libraries that need to be supplied via .cfg.

    If a function is called you should look at the "(" token.. Example code:

    int dostuff();
    
    void foo() {
        int x = dostuff();
    }
    

    dump output:

        <token id="0x55b55578ef00" file="1.cpp" linenr="5" column="20" str="(" scope="0x55b55587a690" link="0x55b55578ecc0" astParent="0x55b55578ce80" astOperand1="0x55b55578e840" valueType-type="int" valueType-sign="signed"/>
    

    The "[" tokens should have the correct type for string/vector/etc containers.

    If you see a case where the proper type info is not set please feel free to insert a short code here..

     
  • Daniel Marjamäki

    one more example code:

    void foo(std::vector<int> v) {
        int x = v[10];
    }
    

    dump output for the "[" token:

        <token id="0x5629fe902e40" file="1.cpp" linenr="3" column="14" str="[" scope="0x5629fe9ef690" link="0x5629fe9024c0" astParent="0x5629fe901e50" astOperand1="0x5629fe903e40" astOperand2="0x5629fe9030a0" valueType-type="int" valueType-sign="signed"/>
    
     
  • Ronny Soak

    Ronny Soak - 2020-08-24

    I've tested with this piece of code:

    class Test {
        public:
        Test();
        bool test1();
        bool operator [] (int i);
    };
    
    #include <ostream.h>
    void testClass() {
        Test t = Test();
        if (t.test1())
          ;
        if (t[0])
          ;
        ostream st;
        if (st.good())
          ;
    }
    

    What I get as structured representation of the class in the dump file is this:

    <scope id="0x1be96e0" type="Class" className="Test" bodyStart="0x1c14700" bodyEnd="0x1be6f10" nestedIn="0x1c10150">
       <functionList>
         <function id="0x1db0320" tokenDef="0x1bd25d0" name="Test" type="Constructor"/>
         <function id="0x1d97920" tokenDef="0x1b459f0" name="test1" type="Function"/>
         <function id="0x1ba8af0" tokenDef="0x1b96d40" name="operator[]" type="Function">
           <arg nr="1" variable="0x1bf5bc0"/>
         </function>
       </functionList>
    </scope>
    

    Notice that the return type of the functions is not mentioned here. The id if the token with the return type is actually not mentioned anywhere else in the dump. Best bet to retrieve it is to look at the last token before the name. That's not very stable or structured.

    This is the dump at the use of the functions:

    operator [] example:

    <token id="0x1c526e0" file="misra-cpp-test.cpp" linenr="45" column="6" str="t" scope="0x1bc0170" type="name" varId="14" variable="0x1bf6030" astParent="0x1b3dad0" valueType-type="record" valueType-typeScope="0x1be96e0"/>
    <token id="0x1b3dad0" file="misra-cpp-test.cpp" linenr="45" column="7" str="[" scope="0x1bc0170" link="0x1b63eb0" astParent="0x1c54dd0" astOperand1="0x1c526e0" astOperand2="0x1c52490"/>
    <token id="0x1c52490" file="misra-cpp-test.cpp" linenr="45" column="8" str="0" scope="0x1bc0170" type="number" isInt="True" values="0x1db5950" astParent="0x1b3dad0" valueType-type="int" valueType-sign="signed"/>
    <token id="0x1b63eb0" file="misra-cpp-test.cpp" linenr="45" column="9" str="]" scope="0x1bc0170" link="0x1b3dad0"/>   
    

    member function example:

    <token id="0x1bd7e70" file="misra-cpp-test.cpp" linenr="43" column="6" str="t" scope="0x1bc0170" type="name" varId="14" variable="0x1bf6030" astParent="0x1bd9a60" valueType-type="record" valueType-typeScope="0x1be96e0"/>
    <token id="0x1bd9a60" file="misra-cpp-test.cpp" linenr="43" column="7" str="." scope="0x1bc0170" astParent="0x1bd97b0" astOperand1="0x1bd7e70" astOperand2="0x1db08f0"/>
    <token id="0x1db08f0" file="misra-cpp-test.cpp" linenr="43" column="8" str="test1" scope="0x1bc0170" type="name" function="0x1d97920" astParent="0x1bd9a60"/>
    <token id="0x1bd97b0" file="misra-cpp-test.cpp" linenr="43" column="13" str="(" scope="0x1bc0170" link="0x1bd6190" astParent="0x1bcea60" astOperand1="0x1bd9a60" valueType-type="bool"/>
    <token id="0x1bd6190" file="misra-cpp-test.cpp" linenr="43" column="14" str=")" scope="0x1bc0170" link="0x1bd97b0"/>
    

    std:ostream example:

    <token id="0x1c5d940" file="misra-cpp-test.cpp" linenr="48" column="9" str="st" scope="0x1bc0170" type="name" varId="15" variable="0x1bf5df0" astParent="0x1db1710"/>
    <token id="0x1db1710" file="misra-cpp-test.cpp" linenr="48" column="11" str="." scope="0x1bc0170" astParent="0x1b46110" astOperand1="0x1c5d940" astOperand2="0x1bc8940"/>
    <token id="0x1bc8940" file="misra-cpp-test.cpp" linenr="48" column="12" str="good" scope="0x1bc0170" type="name" astParent="0x1db1710"/>
    <token id="0x1b46110" file="misra-cpp-test.cpp" linenr="48" column="16" str="(" scope="0x1bc0170" link="0x1b45fe0" astParent="0x1c5db30" astOperand1="0x1db1710"/>
    <token id="0x1b45fe0" file="misra-cpp-test.cpp" linenr="48" column="17" str=")" scope="0x1bc0170" link="0x1b46110"/>
    

    None of these tokens have type information. And this is not yet using template classes and the return types are boring old POD types.

    Am I doing something wrong in my declarations?
    How would I declare these in the .cfg if this was in a header file not available to cppcheck
    (like in the std::ostream example)?

     

    Last edit: Ronny Soak 2020-08-24
  • Daniel Marjamäki

    The Function class in Cppcheck does not have return type information. That info was just not needed before it's enough to know the type information for function calls. what checker do you want to write...

    I am afraid operator overloads for your own classes are not handled really well yet. Maybe they should be converted into function calls. So t[0] is converted into t . operator[] ( 0 ) we can the use all the normal logic for function calls in all analysis and checkers..

    Please rewrite declaration ostream st as std::ostream then st.good() get the proper type.

     
  • Ronny Soak

    Ronny Soak - 2020-08-25

    I've corrected the declaration to std::ostream and indeed, the return type is now in the dump:

    <token id="0x17129f0" file="misra-cpp-test.cpp" linenr="48" column="9" str="st" scope="0x16a96e0" type="name" varId="15" variable="0x165e480" astParent="0x171db50"/>
    <token id="0x171db50" file="misra-cpp-test.cpp" linenr="48" column="11" str="." scope="0x16a96e0" astParent="0x1871790" astOperand1="0x17129f0" astOperand2="0x171d940"/>
    <token id="0x171d940" file="misra-cpp-test.cpp" linenr="48" column="12" str="good" scope="0x16a96e0" type="name" astParent="0x171db50"/>
    <token id="0x1871790" file="Tests/misra-cpp-test.cpp" linenr="48" column="16" str="(" scope="0x16a96e0" link="0x16889a0" astParent="0x16fed30" astOperand1="0x171db50" valueType-type="bool"/>
    

    I also just now saw that the member function test1() actually DID have a return type in the call. I just missed that yesterday.

    I've also tested with a std::vector and there access with [] works. I assume this is because of the line <access indexOperator="array-like"> in the std.cfg.
    Can I use this for custom classes too?
    Is there a way to specify other operator overloads in the .cfg file?

    The example was rule Autosar A5_0_2 ~= MISRA-C++ 5.0.13 ~= MISRA-C 2012 14.4
    "The condition of an if-statement and the condition of an iteration statement
    shall have type bool."
    But this is just an example. The return type of overloaded operators is probably meaningful elsewhere too.

     

    Last edit: Ronny Soak 2020-08-25
  • Daniel Marjamäki

    Can I use this for custom classes too?

    I think we should fix the parser to handle custom operator overloads better. It is best that the checkers see that it's a function call.

    You might be able to get better type information if you use --clang. I believe that will convert the operator overloads into function calls. Feel free to use --clang also and see if there are some issues with that... I have not used that a lot yet so I expect there are some stuff that should be fixed. Basic templates and stuff should work but I am sure there are stuff I have missed.

    Is there a way to specify other operator overloads in the .cfg file?

    I do not think so currently but that is the plan.

     
  • Ronny Soak

    Ronny Soak - 2020-08-27

    I've so far solved the += operator overloading problem by specifying the type in the .cfg file like

    <container id="FakeString" startPattern="FakeString">
        <type string='std-like'/>
      </container>
    

    With this, the type gets the valueType-type="container" attribute in the dump files which is used in the rule check to disallow += on pointer types (but allows on strings and containers).

    For recognizing the [] operator, I've used

    <define name="MyClass" value="FakeTypeOuter &lt;bool&gt;"/>
    
    <container id="FakeTypeInner" endPattern="&gt; !!::" opLessAllowed="false" itEndPattern="&gt; ::" hasInitializerListConstructor="true">
        <type templateParameter="0"/>
        <access indexOperator="array-like">
          <function name="get_data" yields="at_index"/>
        </access>
      </container>
    
      <container id="FakeTypeOuter" startPattern="FakeTypeOuter &lt;" inherits="FakeTypeInner"> 
      </container>  
    

    Which works well on a number of rules to get the type of the pod returned by [] into the dump.

    I haven't found a workaround for overloaded () yet.

    Still checking any differnences with --clang ...

     
1 2 > >> (Page 1 of 2)

Log in to post a comment.

Want the latest updates on software, tech news, and AI?
Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.