cppcheck 2.5 multithread: last 3% slow

penguish
2021-07-31
2021-09-14
1 2 > >> (Page 1 of 2)
  • penguish

    penguish - 2021-07-31

    Hi, I'm running cppcheck 2.5 on a ~4 million line codebase (on CentOS 7.9), using the following command:
    nohup ./cppcheck-2.5/build/bin/cppcheck -j32 --std=c++17 --enable=warning,performance,portability --inline-suppr -D__CPPCHECK__ -DATOMIC_POINTER_LOCK_FREE=2 --suppress-xml=suppressions.xml -itest/ --library=Athena.xml --xml-version=2 ./athena 2>results.xml &
    The machine has 16 cores and 30 GB of RAM. The analysis proceeds really quickly up until the last few percent: 97% completion is reached after an hour. After that it slows down dramatically: the interval between updates to the output grows to 10 minutes, then 30 minutes, and the last 1% can take more than 5 hours. Feature or bug? The output file looks fine (if incomplete) at the 99% completed level.
    My --library file has 34 entries, and the suppressions file has about 12 entries.

     

    Last edit: penguish 2021-07-31
  • john borland

    john borland - 2021-07-31

    One helpful command is perf top. It can tell you what your CPUs are actually doing. If I can't figure it out from that, I try a flame graph: https://www.brendangregg.com/flamegraphs.html

     
  • Daniel Marjamäki

    It often happens that the last few percent of the analysis take most of the time. That is what you see when Cppcheck gets stuck on one file: one thread is stuck on that file while your other 31 threads continue with, and finish, all the other files, so in the end there is only one file left to analyse.

    The percentage is calculated from file sizes. If the files you want to analyse are 1 MB in total and Cppcheck has finished the files that make up 990 kB of that, it will report that the analysis is 99% done.
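    A rough sketch of such a size-based progress figure, in Python. This is only an illustration, not Cppcheck's code: the function name `progress_percent`, the integer rounding, and the byte counts are assumptions made for the example.

```python
def progress_percent(done_bytes: int, total_bytes: int) -> int:
    """Progress as a whole-number percentage of bytes analysed (illustrative only)."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return done_bytes * 100 // total_bytes


# 1 MB of sources in total, files adding up to 990 kB already analysed.
total = 1_000_000
done = 990_000
print(progress_percent(done, total))  # 99
```

    The figure only counts bytes, so it says nothing about how long the remaining 10 kB will take to analyse.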

    For your information you do not get whole program analysis with those options. To get whole program analysis use the option --cppcheck-build-dir. With a cppcheck build dir it is also likely that the >5 hours analysis time can be reduced to seconds if the "hang" happens in some code that is not changed.
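    To picture why a build dir can cut repeat runs so much, here is a toy Python model of a result cache keyed by a hash of each file's content. It assumes nothing about the real build dir format or about what Cppcheck actually hashes; `content_hash`, `check_with_cache`, the `analyse` callback, and the dict standing in for the build dir are all hypothetical.

```python
import hashlib


def content_hash(text: str) -> str:
    # SHA-256 is a stand-in; the real hash function is not modelled here.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def check_with_cache(sources: dict, cache: dict, analyse) -> list:
    """Analyse only the sources whose hash differs from the cached one.

    Returns the names of the files that were actually re-analysed.
    """
    rechecked = []
    for name, text in sources.items():
        digest = content_hash(text)
        cached = cache.get(name)
        if cached is not None and cached[0] == digest:
            continue  # unchanged: the stored result cached[1] is reused
        cache[name] = (digest, analyse(text))
        rechecked.append(name)
    return rechecked


cache = {}
sources = {"a.cpp": "int a;", "b.cpp": "int b;"}
first = check_with_cache(sources, cache, analyse=len)   # cold cache
sources["b.cpp"] = "int b = 1;"                         # one file changes
second = check_with_cache(sources, cache, analyse=len)  # warm cache
print(first, second)  # ['a.cpp', 'b.cpp'] ['b.cpp']
```

    If the slow file is among the unchanged ones, the second run never has to analyse it again.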

     

    Last edit: Daniel Marjamäki 2021-08-01
  • Daniel Marjamäki

    If you find out which file is slow, you can check that file on its own and use the unofficial option --showtime=summary to get details about what Cppcheck is doing when analysing it.

     
  • CHR

    CHR - 2021-08-01

    There also were some performance regressions in v2.5 that are fixed in head. Maybe you can try that?
    Sidenote: is --cppcheck-build-dir supposed to work with multiple threads? When I last tried it (2.4), I only saw a reduction in run time when a single thread was used.

     
    • Daniel Marjamäki

      Yes, --cppcheck-build-dir is supposed to work with multiple threads as well.

       
      • CHR

        CHR - 2021-08-01

        Ok, I'll try that again. Run times used to look like this:
        4 threads: 3h
        4 threads + --cppcheck-build-dir: 3h (always)
        1 thread + --cppcheck-build-dir: between 0.5h and 12h, depending on code changes

         
        • Daniel Marjamäki

          OK, please do. That sounds wrong!

           
        • CHR

          CHR - 2021-08-04

          The issue remains the same in 2.5: Apparently, all files are checked again, even for a commit that changes just a single .cpp file, when multiple threads are used. The build dir contains many .a1 and .s2 files though. I'm checking a whole source directory, not individual files.
          Anything I can do to debug this?

           
          • Daniel Marjamäki

            I can partially reproduce. I checked cppcheck source code:

            mkdir 1
            cppcheck -j4 --cppcheck-build-dir=1 -D__CPPCHECK__ lib
            cppcheck -j4 --cppcheck-build-dir=1 -D__CPPCHECK__ lib
            

            It should not recheck any files, but on my computer it does recheck some of the files. I will look into this.

             
          • Daniel Marjamäki

            > Anything I can do to debug this?

            Thanks! Check that the .a1 files are valid XML files.
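            One way to run that check over a whole build dir is a small Python script like the sketch below. It only tests well-formedness with the standard library parser and does not validate against any DTD. The function name `find_malformed` is made up, and searching subdirectories with `rglob` is an assumption about where the .a1 files live.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def find_malformed(build_dir: str) -> list:
    """Return (path, parser message) for every .a1 file that is not well-formed XML."""
    problems = []
    for path in sorted(Path(build_dir).rglob("*.a1")):
        try:
            ET.parse(str(path))
        except ET.ParseError as err:
            problems.append((str(path), str(err)))
    return problems
```

            Calling `find_malformed("path/to/build-dir")` returns an empty list when every .a1 file parses; otherwise each entry carries the parser's line and column for the first error in that file.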

             
            • CHR

              CHR - 2021-08-05

              I just ran xmllint on some .a1 files. It complains about a missing DTD, but that's probably not an issue. Otherwise I'm seeing parse errors due to unescaped <, e.g. in <path file="path/file.h" line="1935" col="13" info="Assuming that condition 'Index<0' is not redundant"/>.
              I'll try your latest commit next.
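              For reference, a short Python demonstration of why a raw < inside an attribute value breaks the parse and how escaping fixes it. It uses only the standard library and is not related to how Cppcheck writes these files; the helper `is_well_formed` is made up for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

info = "Assuming that condition 'Index<0' is not redundant"

raw = f'<path info="{info}"/>'               # '<' written as-is into the attribute
escaped = f"<path info={quoteattr(info)}/>"  # quoteattr escapes '<' and adds the quotes


def is_well_formed(xml_text: str) -> bool:
    """True if the text parses as XML, False on a parse error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(raw))      # False
print(is_well_formed(escaped))  # True
print(ET.fromstring(escaped).get("info") == info)  # True: the text round-trips
```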

               
              • CHR

                CHR - 2021-08-05

                Another example: <FileInfo check="Class"><class name="ArrayType < Element >" file="path/file.h" line="102" col="20" hash="9413124673900663442"/>

                 
                • Daniel Marjamäki

                  Please verify that you still get that after https://github.com/danmar/cppcheck/commit/ad478914f7ccaa7be522e56f6aa4d8d7527d8f3a

                  This is one of the bugs that I tried to fix.

                   
                  • CHR

                    CHR - 2021-08-05

                    The first few files that I have looked at are now valid. Let's see if the next run picks them up.
                    Sidenote: Sometimes I see this string in the .a1 files: "Assuming condition is Assuming condition is false". That's not correct, is it?

                     
                  • CHR

                    CHR - 2021-08-05

                    My announcement was premature: when analysing more files, I saw examples like this involving function templates:
                    <function-call call-id="path/file.h:609:8" call-funcname="GetValue < std :: string >" call-argnr="1" file="path/file.h" line="913" col="22" my-id="path/file.h:911:8" my-argnr="2"/>

                     
                  • CHR

                    CHR - 2021-08-06

                    After your latest commit, the XML in the .a1 files is now valid. However, cppcheck still checks most files again, even if cached results exist. Maybe some debug output could help?

                     
                    • Daniel Marjamäki

                      Sounds good. Can you see if something changes in the .a1 file after a recheck? In particular, does the hash at the top change? Do you check a single configuration or multiple configurations? I don't remember whether we can handle multiple configurations.

                       

                      Last edit: Daniel Marjamäki 2021-08-07
                      • CHR

                        CHR - 2021-08-09

                        I have looked at a few examples of .a1 files before and after a recheck, and the hash value is the only difference. We specify some values with -D, so that's a single config, right?
                        Weird thing is, if I check a single file locally, the .a1 file is generated and used in subsequent calls even with -j4. But it does not work on our build server, where the whole directory is checked at once.

                         
    • CHR

      CHR - 2021-08-10

      There were at least two problems, one related to our build environment.

      • On our build server, Jenkins calls a PowerShell script, which then calls cppcheck. The script did essentially this:
      cd path/to/source/dir
      cppcheck [parameters] path/to/source/dir
      

      With this configuration, no existing .a1 files were picked up, although everything worked fine when calling cppcheck manually. The solution was to change the script to

      cd path/to/source/dir
      cppcheck [parameters] .
      
      • Now some, but not all .a1 files were used. I have added debug code to dump the strings that the hashes are computed from to a file. When comparing these files from consecutive runs, the content is the same, but the order of certain lines has changed. I'll provide an example shortly.
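      The comparison of two such dumps can be sketched in Python as follows. This is not the debug code mentioned above, just an illustration of telling "same lines, different order" apart from a real content change; the function name `compare_dumps` and the sample strings are made up.

```python
from collections import Counter


def compare_dumps(first: str, second: str) -> str:
    """Classify two hash-input dumps as 'identical', 'reordered' or 'different'."""
    a, b = first.splitlines(), second.splitlines()
    if a == b:
        return "identical"
    if Counter(a) == Counter(b):
        return "reordered"  # same lines with the same counts, different order
    return "different"


run1 = "header\nsuppression A\nsuppression B\ncode"
run2 = "header\nsuppression B\nsuppression A\ncode"
print(compare_dumps(run1, run2))  # reordered
```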
       
      • CHR

        CHR - 2021-08-11

        Here's an example of what the input for the hash function might look like in consecutive runs on the same file.
        First run:

        2.6 devwspp [defines]  <suppressions>
            <suppression errorId="functionConst" fileName="header1.h" lineNumber="2709" />
            <suppression errorId="danglingTemporaryLifetime" fileName="header2.h" lineNumber="455" />
            <suppression errorId="functionConst" fileName="header2.h" lineNumber="1870" />
            <suppression errorId="functionConst" fileName="header2.h" lineNumber="1900" />
            [other suppressions]
          </suppressions>
        
        [tokenized code]
        

        Second run:

        2.6 devwspp [defines]  <suppressions>
            <suppression errorId="functionConst" fileName="header1.h" lineNumber="2709" />
            [other suppressions]
            <suppression errorId="danglingTemporaryLifetime" fileName="header2.h" lineNumber="455" />
            <suppression errorId="functionConst" fileName="header2.h" lineNumber="1870" />
            <suppression errorId="functionConst" fileName="header2.h" lineNumber="1900" />
          </suppressions>
        
        [tokenized code]
        
         

        Last edit: CHR 2021-08-11
        • Daniel Marjamäki

          That is a great find. So the order of the suppressions might change. Very strange. The order should not matter, so sorting them should be allowed.
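          A minimal Python sketch of why sorting helps: hashing the same suppression lines in a different order gives a different digest, while sorting first makes the digest independent of order. SHA-256 and the helper `digest` are stand-ins chosen for the example, not Cppcheck's hash function; the suppression lines are taken from the dump above.

```python
import hashlib


def digest(lines, sort=False) -> str:
    """Hash a list of lines, optionally sorting them first."""
    if sort:
        lines = sorted(lines)
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()


run1 = [
    '<suppression errorId="functionConst" fileName="header1.h" lineNumber="2709" />',
    '<suppression errorId="danglingTemporaryLifetime" fileName="header2.h" lineNumber="455" />',
    '<suppression errorId="functionConst" fileName="header2.h" lineNumber="1870" />',
]
run2 = [run1[0], run1[2], run1[1]]  # same suppressions, different order

print(digest(run1) == digest(run2))                        # False
print(digest(run1, sort=True) == digest(run2, sort=True))  # True
```

          Sorting only removes the ordering effect; it cannot help if a suppression is present in one run and absent in the next.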

           
        • CHR

          CHR - 2021-08-11

          I'll try sorting the suppressions, let's see if that helps.

           
        • CHR

          CHR - 2021-08-12

          Now that the suppressions are sorted, I'm seeing cases where suppressions are missing or appearing from one run to the next. How can this be? Some threading issue, probably?

           
          • CHR

            CHR - 2021-08-12

            Looking at threadexecutor.cpp, the code for THREADING_MODEL_FORK and THREADING_MODEL_WIN is quite different. Hard to say what might go wrong there...

             