cppcheck in CI - how to reuse older .a1 files from previous scans?

Forum: General Discussion

Creator: Thanassis Tsiodras

Created: 2022-10-13

Updated: 2022-10-13

Thanassis Tsiodras - 2022-10-13

I have configured cppcheck to continually process a source codebase with our continuous integration machinery. It takes a lot of time to process our codebase, though - and since most of the source files (C/C++) don't change per commit, I wanted to reuse the previous scans as much as possible.

What I tried in our CI build script:

cd src

mkdir cppcheck-cache

cp -a $HOME/cppcheck-cache/* cppcheck-cache/

bear -- make all # builds the compile_commands.json

cppcheck ... --project=$PWD/compile_commands.json --cppcheck-build-dir=$PWD/cppcheck-cache/

rsync -avu cppcheck-cache/ $HOME/cppcheck-cache/

This basically keeps a "master copy" of the ".a1" files that I see cppcheck creating in the per-build cache folder, and updates it after every run.

Sadly, this doesn't work - I don't see any acceleration between subsequent builds.

Note that our CI performs the builds from the same user (so $HOME/cppcheck-cache is indeed always the same folder) but checks out the source code in different paths every time - hence my need to copy them over from the "master copy" every time. That also solves race conditions that would arise if I just pointed cppcheck's build-dir to $HOME/cppcheck-cache - since multiple concurrent build jobs would then clash, writing in the same folder.

Any hints/advice most welcome.

Last edit: Thanassis Tsiodras 2022-10-13

CHR - 2022-10-13

We generate a hash, and reanalysis is performed based on whether it has changed: https://github.com/danmar/cppcheck/blob/b8b6be48d981378b868d1327e20eacef68399eff/lib/cppcheck.cpp#L707
It doesn't seem like file paths are included in that, but I might be wrong.
So you observe that the .a1 files are recreated every time?
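
To illustrate the general idea of hash-gated reanalysis, here is a minimal sketch. It is not cppcheck's actual code: the function name `needsReanalysis`, its parameters, and the use of `std::hash` are assumptions made purely for illustration.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical helper (not part of cppcheck).
// hashInput  : the text the tool would feed into its hash for one source file
// storedHash : the hash value saved by the previous run
// Returns true when the stored result cannot be reused.
bool needsReanalysis(const std::string& hashInput, std::size_t storedHash) {
    return std::hash<std::string>{}(hashInput) != storedHash;
}
```

With a scheme like this, anything that ends up in the hash input and differs between runs, even a directory name, forces a full reanalysis of that file.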

Thanassis Tsiodras - 2022-10-13

So you observe that the .a1 files are recreated every time?

Correct - they do.
In the meantime, I also tried copying across the .analyzerinfo files that are also generated.
I even made sure their timestamps are later than those of the .c files - i.e. after the copy, my CI script touches them:

find src -type f -iname '*zerinfo' -exec touch '{}' ';'

Still nothing - everything is regenerated every time.

To be clear: if I redo the build myself from inside the same folder that the CI used, the second run goes fast (no reprocessing). The issue is reproducing that behavior in a new build done in a different folder. That's what my CI process does - it checks out the code (under /builds/RANDOMNUMBER/src/.... ) and launches my CI script. Within my script, I somehow need to reproduce everything cppcheck needs in order to recognize that almost everything is identical to the previous run. Results so far: copying the cppcheck-build-dir contents and the .analyzerinfo files is not enough.

Last edit: Thanassis Tsiodras 2022-10-13

CHR - 2022-10-13

Maybe you could dump the to-be-hashed strings from the same file but different runs to compare them (in a modified cppcheck build)?
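
Once two such dumps exist, locating where they diverge can be done with a small helper like the one below. This is only a sketch: `firstDifference` is a hypothetical name, and it assumes both dumps have already been read into strings.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the index of the first character at which
// the two dumps differ, or std::string::npos if they are identical.
std::size_t firstDifference(const std::string& a, const std::string& b) {
    const std::size_t common = std::min(a.size(), b.size());
    for (std::size_t i = 0; i < common; ++i) {
        if (a[i] != b[i]) {
            return i;
        }
    }
    // One string is a prefix of the other, or both are equal.
    return a.size() == b.size() ? std::string::npos : common;
}
```

Printing a few dozen characters around the returned index from each dump is usually enough to see which piece of the hash input changes between runs.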

Thanassis Tsiodras - 2022-10-13

You mean the toolinfo.str(), right?

Thanassis Tsiodras - 2022-10-13

You mean the toolinfo.str(), right? (at cppcheck.cpp:717)

Last edit: Thanassis Tsiodras 2022-10-13

CHR - 2022-10-13

I suggest saving the string here: https://github.com/danmar/cppcheck/blob/cff1cd9cda71868293764d9d240241cd0ee4c1ff/lib/preprocessor.cpp#L990

Thanassis Tsiodras - 2022-10-13

I added

std::cout << "Hashing this:" << hashData << std::endl;

...and a corresponding include of iostream.

And sadly, in the first few lines of the output, I saw it:

<suppression errorId="nullPointerRedundantCheck" fileName="FULLPATHTOSRC"...

We are indeed using suppressions, but our suppressions.xml doesn't contain full paths.
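It seems cppcheck expands them to full paths after parsing the file, and then includes those full paths in the data that gets hashed. Since our CI checks out the code under a different path every time, the hash never matches.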

Damn.

Thanassis Tsiodras - 2022-10-13

I'll just patch it directly on my side, doing a string replacement in the hashData (since I know the baseline paths - "/builds/RANDOMPREFIX_TO_REMOVE"). But it would be nice if cppcheck didn't use the complete paths when creating the suppressions, and instead used the "path from baseline top folder". That would work for anyone who tries to do the same in the future.
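
A minimal sketch of that kind of string replacement, assuming hashData is (or can be converted to) a std::string. The function name `stripCheckoutPrefix` and its parameters are hypothetical and not part of cppcheck; the checkout prefix is something the CI script would have to supply.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: removes every occurrence of the per-build checkout
// prefix from the text that is about to be hashed, so that two checkouts
// in different folders produce the same hash input.
std::string stripCheckoutPrefix(std::string hashData, const std::string& checkoutPrefix) {
    if (checkoutPrefix.empty()) {
        return hashData;  // nothing to strip, and find("") would loop forever
    }
    std::size_t pos = 0;
    while ((pos = hashData.find(checkoutPrefix, pos)) != std::string::npos) {
        hashData.erase(pos, checkoutPrefix.size());
    }
    return hashData;
}
```

The call would go right before the hash is computed, for example `hashData = stripCheckoutPrefix(hashData, prefix);`, where `prefix` holds the build-specific part of the path. Note that this changes what the hash covers, so it is only safe when the prefix really is the sole difference between builds.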
