Re: [Valgrind-developers] Proposed valgrind division patch

SourceForge Headquarters 225 Broadway Suite 1600 San Diego, CA 92101 +1 (858) 422-6466

Hi,

On Mon, 2024-08-19 at 21:43 +0200, D. J. Bernstein wrote:
> There have been many successful "timing attacks" that break
> cryptographic software by working backwards from timings to secrets. One
> common use of valgrind's memcheck tool in cryptography is to catch data
> flow from secrets (marked with VALGRIND_MAKE_MEM_UNDEFINED) to branch
> instructions and array indices. Some references: 
> 
>    https://neuromancer.sk/article/26
>    https://www.usenix.org/system/files/sec24fall-prepub-760-fourne.pdf
>    https://bench.cr.yp.to/tips.html#timecop

That is a clever trick. Some of the references mention it needs a
patched valgrind to introduce poison/unposion functions. But this looks
like just convenience. As you say, you could just directly introduce
VALGRIND_MAKE_MEM_UNDEFINED markers in the code. Are these patches
still needed/used?

> However, there are other variable-time instructions. Our new paper
> "KyberSlash: Exploiting secret-dependent division timings in Kyber
> implementations" includes demonstrations of secret-key recovery from the
> reference software for the Kyber cryptosystem in two different
> environments, exploiting the fact that compilers sometimes use
> variable-time division instructions for divisions in that software:
> 
>    https://kyberslash.cr.yp.to/papers.html
> 
> The paper describes a patch to valgrind to optionally catch division
> instructions on undefined data. The point of this message is to propose
> this patch for inclusion in valgrind. The patch is attached.
> 
> The patch was written by Tee Kiah Chia. A few API tweaks and tests in
> valgrind's test framework were added by D. J. Bernstein. The patch
> applies cleanly to valgrind's current git repository. As per
> valgrind/README, we have licensed the patch as follows:
> 
>    SPDX-License-Identifer: GPL-2.0-or-later

Thanks. The patch is easy to follow. And looks generic enough to
extend. Just have to think about how general usable this is. But your
references show people have been using valgrind memcheck for
timing/conditional jumps detection for a long time (I must admit, I
wasn't aware).

> The patch is designed to be off by default. The user can start scanning
> for divisions using --variable-latency-errors=yes on the command line,
> VALGRIND_CLO_CHANGE("--variable-latency-errors=yes") from the program
> under test, or, easiest to use, a new environment variable
> VALGRIND_BESTEFFORT_VARIABLE_LATENCY_ERRORS=yes.

The environment variable trick is nice. I wonder if we can generalize
that.

> Internally, the patch is designed to allow easy future extensions to
> catch timing variations in instructions other than divisions. The patch
> catches square roots as an example. A natural long-term goal is to
> synchronize the allowed instructions with lists from CPU designers:
> 
>    https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/resources/data-operand-independent-timing-instructions.html
>    https://developer.arm.com/documentation/ddi0595/2021-06/AArch64-Registers/DIT--Data-Independent-Timing
> 
> This would also naturally resolve inconsistency #8 (vector shifts)
> documented in memcheck/mc_translate.c. Full synchronization will,
> however, be a large project. Division is an immediate problem for many
> cryptographic implementations (as shown by the scans reported in
> the KyberSlash paper), so there is immediate value in a patch that looks
> for divisions.

Urgh. Looks like there is a huge list of instructions that might have
timing issues depending on their input.

One issue I can see is that the current implementation works on the
Valgrind VEX IR, so after translation from the native architecture
instructions. I can imagine that translation from different
architectures into VEX might result in similar Iops, which might have
different timing characteristics. So maybe we need some way to encode
that IR expressions. Although I don't immediately know how to do that
efficiently.

Cheers,

Mark