From: Philippe W. <phi...@sk...> - 2024-08-28 21:19:10
|
On Mon, 2024-08-26 at 16:29 +0200, Mark Wielaard wrote: > Hi, > > On Mon, 2024-08-19 at 21:43 +0200, D. J. Bernstein wrote: > > There have been many successful "timing attacks" that break > > cryptographic software by working backwards from timings to secrets. One > > common use of valgrind's memcheck tool in cryptography is to catch data > > flow from secrets (marked with VALGRIND_MAKE_MEM_UNDEFINED) to branch > > instructions and array indices. Some references: > > > > https://neuromancer.sk/article/26 > > https://www.usenix.org/system/files/sec24fall-prepub-760-fourne.pdf > > https://bench.cr.yp.to/tips.html#timecop > > That is a clever trick. Some of the references mention it needs a > patched valgrind to introduce poison/unposion functions. But this looks > like just convenience. As you say, you could just directly introduce > VALGRIND_MAKE_MEM_UNDEFINED markers in the code. Are these patches > still needed/used? > > > However, there are other variable-time instructions. Our new paper > > "KyberSlash: Exploiting secret-dependent division timings in Kyber > > implementations" includes demonstrations of secret-key recovery from the > > reference software for the Kyber cryptosystem in two different > > environments, exploiting the fact that compilers sometimes use > > variable-time division instructions for divisions in that software: > > > > https://kyberslash.cr.yp.to/papers.html > > > > The paper describes a patch to valgrind to optionally catch division > > instructions on undefined data. The point of this message is to propose > > this patch for inclusion in valgrind. The patch is attached. > > > > The patch was written by Tee Kiah Chia. A few API tweaks and tests in > > valgrind's test framework were added by D. J. Bernstein. The patch > > applies cleanly to valgrind's current git repository. As per > > valgrind/README, we have licensed the patch as follows: > > > > SPDX-License-Identifer: GPL-2.0-or-later > > Thanks. The patch is easy to follow. And looks generic enough to > extend. Just have to think about how general usable this is. But your > references show people have been using valgrind memcheck for > timing/conditional jumps detection for a long time (I must admit, I > wasn't aware). > > > The patch is designed to be off by default. The user can start scanning > > for divisions using --variable-latency-errors=yes on the command line, > > VALGRIND_CLO_CHANGE("--variable-latency-errors=yes") from the program > > under test, or, easiest to use, a new environment variable > > VALGRIND_BESTEFFORT_VARIABLE_LATENCY_ERRORS=yes. > > The environment variable trick is nice. I wonder if we can generalize > that. IMO, the above is redundant with (and more specialised than) the existing VALGRIND_OPTS environment variable. In other words, the above can be achieved with: export VALGRIND_OPTS="--variable-latency-errors=yes --bidule=machin --truc=basar" (also remember we can prefix a -- option with the tool. Tools ignore then the non matching options). E.g. --memcheck:leak-check=full will be ignored by callgrind (on the command line and/or in VALGRIND_OPTS). > > > Internally, the patch is designed to allow easy future extensions to > > catch timing variations in instructions other than divisions. The patch > > catches square roots as an example. A natural long-term goal is to > > synchronize the allowed instructions with lists from CPU designers: > > > > https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/resources/data-operand-independent-timing-instructions.html > > https://developer.arm.com/documentation/ddi0595/2021-06/AArch64-Registers/DIT--Data-Independent-Timing > > > > This would also naturally resolve inconsistency #8 (vector shifts) > > documented in memcheck/mc_translate.c. Full synchronization will, > > however, be a large project. Division is an immediate problem for many > > cryptographic implementations (as shown by the scans reported in > > the KyberSlash paper), so there is immediate value in a patch that looks > > for divisions. > > Urgh. Looks like there is a huge list of instructions that might have > timing issues depending on their input. > > One issue I can see is that the current implementation works on the > Valgrind VEX IR, so after translation from the native architecture > instructions. I can imagine that translation from different > architectures into VEX might result in similar Iops, which might have > different timing characteristics. So maybe we need some way to encode > that IR expressions. Although I don't immediately know how to do that > efficiently. I guess that be able to give timing characteristics/cycle estimations per instruction in a generalised way could be useful for callgrind/cachegrind. > > Cheers, > > Mark > > > _______________________________________________ > Valgrind-developers mailing list > Val...@li... > https://lists.sourceforge.net/lists/listinfo/valgrind-developers |