I have configured cppcheck to continually process a source codebase with our continuous integration machinery. It takes a lot of time to process our codebase, though - and since most of the source files (C/C++) don't change per commit, I wanted to reuse the previous scans as much as possible.
What I tried in our CI build script:
cd src
mkdir -p cppcheck-cache
cp -a "$HOME"/cppcheck-cache/* cppcheck-cache/
bear -- make all  # builds the compile_commands.json
This basically updates a "master copy" of the ".a1" files - that I see cppcheck creating in the per-build cache folder.
Sadly, this doesn't work - I don't see any acceleration between subsequent builds.
Note that our CI performs the builds from the same user (so $HOME/cppcheck-cache is indeed always the same folder) but checks out the source code in different paths every time - hence my need to copy them over from the "master copy" every time. That also solves race conditions that would arise if I just pointed cppcheck's build-dir to $HOME/cppcheck-cache - since multiple concurrent build jobs would then clash, writing in the same folder.
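For readers trying the same setup, here is a minimal sketch of the copy-in/copy-out mechanics described above. The helper names (seed_cache, publish_cache) and the MASTER variable are hypothetical, not part of the original script. Copying with `dir/.` instead of `dir/*` is just a way to avoid an error when the master copy is still empty.

```sh
#!/bin/sh
# Hypothetical helpers, not taken from the original CI script.
# MASTER is the shared "master copy"; the default is the path used in the post.
MASTER="${MASTER:-$HOME/cppcheck-cache}"

# Copy the master cache into the per-build directory given as $1.
seed_cache() {
    mkdir -p "$1" "$MASTER"
    cp -a "$MASTER"/. "$1"/
}

# Copy the per-build directory given as $1 back into the master cache.
publish_cache() {
    mkdir -p "$MASTER"
    cp -a "$1"/. "$MASTER"/
}
```

In the CI script this would presumably wrap the analysis step: `seed_cache cppcheck-cache`, then `bear -- make all`, then something like `cppcheck --cppcheck-build-dir=cppcheck-cache --project=compile_commands.json` (the exact cppcheck invocation is not shown in the post, so that line is an assumption), and finally `publish_cache cppcheck-cache`. Each job only writes to its own directory while cppcheck runs; the final copy-back can still overlap between concurrent jobs.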
Any hints/advice most welcome.
Last edit: Thanassis Tsiodras 2022-10-13
So you observe that the .a1 files are recreated every time?
Correct - they are.
In the meantime, I also tried copying across the .analyzerinfo files that are also generated.
I even made sure their timestamps are later than those of the .c files, i.e. after the copy, my CI script touches them:
find src -type f -iname '*zerinfo' -exec touch '{}' ';'
Still nothing - everything is regenerated every time.
To be clear: if I do the rebuild myself from inside the same folder where the CI did it, the second run goes fast (no reprocessing). The issue is to reproduce that behaviour in a new build done in a different folder. That's what my CI process does - it checks out the code (under /builds/RANDOMNUMBER/src/.... ) and launches my CI script. Within my script, I somehow need to reproduce everything necessary for cppcheck to understand that almost everything is identical to the previous run. Results so far: copying the cppcheck-build-dir contents and the .analyzerinfo files is not enough.
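One way to confirm whether the .a1 files really get rewritten, rather than judging by wall-clock time, is to drop a marker file just before the cppcheck run and list the cache files that are newer afterwards. This is only a sketch; the function name list_rewritten and the marker file are hypothetical.

```sh
#!/bin/sh
# Hypothetical helper: print the .a1 files under directory $2 whose
# modification time is later than that of the marker file $1.
list_rewritten() {
    find "$2" -type f -name '*.a1' -newer "$1"
}
```

Usage would be along the lines of `touch marker`, then the cppcheck run, then `list_rewritten marker cppcheck-cache | wc -l`. The marker has to be created after the cache is copied in; since `cp -a` preserves timestamps, the copied files stay older than the marker unless cppcheck rewrites them.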
Last edit: Thanassis Tsiodras 2022-10-13
We generate a hash based on which reanalysis is performed: https://github.com/danmar/cppcheck/blob/b8b6be48d981378b868d1327e20eacef68399eff/lib/cppcheck.cpp#L707
It doesn't seem like file paths are included in that, but I might be wrong.
Maybe you could dump the to-be-hashed strings from the same file but different runs to compare them (in a modified cppcheck build)?
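Assuming a modified cppcheck build writes the to-be-hashed string for one source file into a text file per run, the comparison itself could be sketched as below. The function name compare_dumps and the dump file names are hypothetical; how the dump gets written is not covered here.

```sh
#!/bin/sh
# Hypothetical helper: compare two dump files ($1 and $2) and print either
# "identical" or the first differing lines (at most $3 lines, default 20).
compare_dumps() {
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        diff "$1" "$2" | head -n "${3:-20}"
    fi
}
```

For example, `compare_dumps run1/foo.c.hashdata run2/foo.c.hashdata` would show which part of the hashed input changes between two checkouts of the same source.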
You mean the toolinfo.str(), right? (at cppcheck.cpp:717)
Last edit: Thanassis Tsiodras 2022-10-13
I suggest saving the string here: https://github.com/danmar/cppcheck/blob/cff1cd9cda71868293764d9d240241cd0ee4c1ff/lib/preprocessor.cpp#L990
I added
...and a corresponding include of iostream.
And sadly, in the first few lines of the output, I saw it:
We are indeed using suppressions, but our suppressions.xml doesn't contain full paths. It seems cppcheck creates full paths after parsing it, and then uses the full paths to do the hashing.
Damn.
I'll just patch it directly on my side, doing a string replacement in the hashData (since I know the baseline paths - "/builds/RANDOMPREFIX_TO_REMOVE"). But it would be nice if cppcheck didn't use the complete paths when creating the suppressions, and instead used the "path from baseline top folder". This would work for anyone who tries to do the same in the future.
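To illustrate why stripping the checkout prefix makes the hashes line up again, here is a small shell sketch of the idea. It is not the actual patch (that would live in cppcheck's C++ code), the sample strings are made up, and `cksum` merely stands in for whatever hash cppcheck computes. The prefix is used as a sed pattern, so the sketch assumes it contains no regex metacharacters and no `|` character.

```sh
#!/bin/sh
# Hypothetical helper: remove every occurrence of the checkout prefix $1
# from the text read on stdin.
strip_prefix() {
    sed "s|$1||g"
}

# Hypothetical helper: checksum of text $2 after removing prefix $1.
# cksum is only a stand-in for the real hash function.
hash_normalized() {
    printf '%s\n' "$2" | strip_prefix "$1" | cksum
}
```

With this, `hash_normalized /builds/111 'suppress:/builds/111/src/a.c'` and `hash_normalized /builds/222 'suppress:/builds/222/src/a.c'` give the same checksum, whereas the raw strings do not.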