From: Wu, F. <fe...@in...> - 2023-05-26 14:09:08
On 5/26/2023 9:59 PM, Fei Wu wrote:
> I'm from Intel RISC-V team and working on a RISC-V International
> development partner project to add RISC-V vector (RVV) support on
> Valgrind, the target tool is memcheck. My work bases on commit
> 71272b252977 of Petr's riscv64-linux branch, many thanks to Petr for his
> great work first.
> https://github.com/petrpavlu/valgrind-riscv64
>
> This RFC is a starting point of RVV support on Valgrind. It's far from
> complete, which will take huge time, but I do think it's more effective
> to have some real code for discussion, so this series adds the RVV
> support to run memcpy/strcmp/strcpy/strlen/strncpy in:
> https://github.com/riscv-non-isa/rvv-intrinsic-doc/tree/master/examples

In case the intrinsic version is built with extra RVV instructions which
are not supported yet, here is an assembly version. All C code is from
the above link with a small tweak, and the asm code is copied from:
https://github.com/riscv/riscv-v-spec/tree/master/example

diff --git a/rvv-examples/Makefile b/rvv-examples/Makefile
new file mode 100644
index 000000000..dfae4ac31
--- /dev/null
+++ b/rvv-examples/Makefile
@@ -0,0 +1,23 @@
+CC := clang
+CFLAGS := -g -march=rv64gcv -mllvm -riscv-v-vector-bits-min=128 -O2
+ASFLAGS := -g -march=rv64gcv -mllvm -riscv-v-vector-bits-min=128 -O2
+
+BINARY = rvv_strcmp rvv_memcpy rvv_strcpy rvv_strlen rvv_strncpy
+
+.PHONY: all clean test
+
+all: $(BINARY)
+
+clean:
+	rm -f $(BINARY)
+
+test: $(BINARY)
+	for t in $(BINARY); do \
+		valgrind ./$$t; \
+	done
+
+rvv_strcmp: rvv_strcmp.c strcmp.s
+rvv_memcpy: rvv_memcpy.c memcpy.s
+rvv_strcpy: rvv_strcpy.c strcpy.s
+rvv_strlen: rvv_strlen.c strlen.s
+rvv_strncpy: rvv_strncpy.c strncpy.s
diff --git a/rvv-examples/common.h b/rvv-examples/common.h
new file mode 100644
index 000000000..cec96ed2b
--- /dev/null
+++ b/rvv-examples/common.h
@@ -0,0 +1,112 @@
+// common.h
+// common utilities for the test code under examples/
+
+#include <math.h>
+#include <stdbool.h>
+#include <stddef.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+
+extern void *memcpy_vec(void *dst, void *src, size_t n);
+extern int strcmp_vec(const char *src1, const char *src2);
+extern char *strcpy_vec(char *dst, const char *src);
+extern size_t strlen_vec(char *src);
+extern char *strncpy_vec(char *dst, char *src, size_t count);
+
+void gen_rand_1d(double *a, int n) {
+  for (int i = 0; i < n; ++i)
+    a[i] = (double)rand() / (double)RAND_MAX + (double)(rand() % 1000);
+}
+
+void gen_string(char *s, int n) {
+  // char value range: -128 ~ 127
+  for (int i = 0; i < n - 1; ++i)
+    s[i] = (char)(rand() % 127) + 1;
+  s[n - 1] = '\0';
+}
+
+void gen_rand_2d(double **ar, int n, int m) {
+  for (int i = 0; i < n; ++i)
+    for (int j = 0; j < m; ++j)
+      ar[i][j] = (double)rand() / (double)RAND_MAX + (double)(rand() % 1000);
+}
+
+void print_string(const char *a, const char *name) {
+  printf("const char *%s = \"", name);
+  int i = 0;
+  while (a[i] != 0)
+    putchar(a[i++]);
+  printf("\"\n");
+  puts("");
+}
+
+void print_array_1d(double *a, int n, const char *type, const char *name) {
+  printf("%s %s[%d] = {\n", type, name, n);
+  for (int i = 0; i < n; ++i) {
+    printf("%06.2f%s", a[i], i != n - 1 ? "," : "};\n");
+    if (i % 10 == 9)
+      puts("");
+  }
+  puts("");
+}
+
+void print_array_2d(double **a, int n, int m, const char *type,
+                    const char *name) {
+  printf("%s %s[%d][%d] = {\n", type, name, n, m);
+  for (int i = 0; i < n; ++i) {
+    for (int j = 0; j < m; ++j) {
+      printf("%06.2f", a[i][j]);
+      if (j == m - 1)
+        puts(i == n - 1 ? "};" : ",");
+      else
+        putchar(',');
+    }
+  }
+  puts("");
+}
+
+bool double_eq(double golden, double actual, double relErr) {
+  return (fabs(actual - golden) < relErr);
+}
+
+bool compare_1d(double *golden, double *actual, int n) {
+  for (int i = 0; i < n; ++i)
+    if (!double_eq(golden[i], actual[i], 1e-6))
+      return false;
+  return true;
+}
+
+bool compare_string(const char *golden, const char *actual, int n) {
+  for (int i = 0; i < n; ++i)
+    if (golden[i] != actual[i])
+      return false;
+  return true;
+}
+
+bool compare_2d(double **golden, double **actual, int n, int m) {
+  for (int i = 0; i < n; ++i)
+    for (int j = 0; j < m; ++j)
+      if (!double_eq(golden[i][j], actual[i][j], 1e-6))
+        return false;
+  return true;
+}
+
+double **alloc_array_2d(int n, int m) {
+  double **ret;
+  ret = (double **)malloc(sizeof(double *) * n);
+  for (int i = 0; i < n; ++i)
+    ret[i] = (double *)malloc(sizeof(double) * m);
+  return ret;
+}
+
+void init_array_one_1d(double *ar, int n) {
+  for (int i = 0; i < n; ++i)
+    ar[i] = 1;
+}
+
+void init_array_one_2d(double **ar, int n, int m) {
+  for (int i = 0; i < n; ++i)
+    for (int j = 0; j < m; ++j)
+      ar[i][j] = 1;
+}
diff --git a/rvv-examples/memcpy.s b/rvv-examples/memcpy.s
new file mode 100644
index 000000000..1b50ab670
--- /dev/null
+++ b/rvv-examples/memcpy.s
@@ -0,0 +1,17 @@
+    .text
+    .balign 4
+    .global memcpy_vec
+    # void *memcpy_vec(void* dest, const void* src, size_t n)
+    # a0=dest, a1=src, a2=n
+    #
+memcpy_vec:
+    mv a3, a0                       # Copy destination
+loop:
+    vsetvli t0, a2, e8, m8, ta, ma  # Vectors of 8b
+    vle8.v v0, (a1)                 # Load bytes
+    add a1, a1, t0                  # Bump pointer
+    sub a2, a2, t0                  # Decrement count
+    vse8.v v0, (a3)                 # Store bytes
+    add a3, a3, t0                  # Bump pointer
+    bnez a2, loop                   # Any more?
+    ret                             # Return
diff --git a/rvv-examples/rvv_memcpy.c b/rvv-examples/rvv_memcpy.c
new file mode 100644
index 000000000..d78b9b604
--- /dev/null
+++ b/rvv-examples/rvv_memcpy.c
@@ -0,0 +1,21 @@
+#include "common.h"
+#include <riscv_vector.h>
+#include <string.h>
+
+int main() {
+  const int N = 127;
+  const uint32_t seed = 0xdeadbeef;
+  srand(seed);
+
+  // data gen
+  double A[N];
+  gen_rand_1d(A, N);
+
+  // compute
+  double golden[N], actual[N];
+  memcpy(golden, A, sizeof(A));
+  memcpy_vec(actual, A, sizeof(A));
+
+  // compare
+  puts(compare_1d(golden, actual, N) ? "pass" : "fail");
+}
diff --git a/rvv-examples/rvv_strcmp.c b/rvv-examples/rvv_strcmp.c
new file mode 100644
index 000000000..d10cac133
--- /dev/null
+++ b/rvv-examples/rvv_strcmp.c
@@ -0,0 +1,25 @@
+#include "common.h"
+#include <riscv_vector.h>
+#include <string.h>
+
+int main() {
+  const int N = 1023;
+  const uint32_t seed = 0xdeadbeef;
+  srand(seed);
+
+  // data gen
+  char s0[N], s1[N];
+  gen_string(s0, N);
+  gen_string(s1, N);
+
+  // compute
+  int golden, actual;
+  golden = strcmp(s0, s1);
+  actual = strcmp_vec(s0, s1);
+
+  golden = (golden == 0) ? 0 : (golden > 0) ? 1 : -1;
+  actual = (actual == 0) ? 0 : (actual > 0) ? 1 : -1;
+
+  // compare
+  puts(golden == actual ? "pass" : "fail");
+}
diff --git a/rvv-examples/rvv_strcpy.c b/rvv-examples/rvv_strcpy.c
new file mode 100644
index 000000000..7e5af8673
--- /dev/null
+++ b/rvv-examples/rvv_strcpy.c
@@ -0,0 +1,22 @@
+#include "common.h"
+#include <assert.h>
+#include <riscv_vector.h>
+#include <string.h>
+
+int main() {
+  const int N = 2000;
+  const uint32_t seed = 0xdeadbeef;
+  srand(seed);
+
+  // data gen
+  char s0[N];
+  gen_string(s0, N);
+
+  // compute
+  char golden[N], actual[N];
+  strcpy(golden, s0);
+  strcpy_vec(actual, s0);
+
+  // compare
+  puts(strcmp(golden, actual) == 0 ? "pass" : "fail");
+}
diff --git a/rvv-examples/rvv_strlen.c b/rvv-examples/rvv_strlen.c
new file mode 100644
index 000000000..e1142f883
--- /dev/null
+++ b/rvv-examples/rvv_strlen.c
@@ -0,0 +1,22 @@
+#include "common.h"
+#include <riscv_vector.h>
+#include <string.h>
+
+int main() {
+  const uint32_t seed = 0xdeadbeef;
+  srand(seed);
+
+  int N = rand() % 2000;
+
+  // data gen
+  char s0[N];
+  gen_string(s0, N);
+
+  // compute
+  size_t golden, actual;
+  golden = strlen(s0);
+  actual = strlen_vec(s0);
+
+  // compare
+  puts(golden == actual ? "pass" : "fail");
+}
diff --git a/rvv-examples/rvv_strncpy.c b/rvv-examples/rvv_strncpy.c
new file mode 100644
index 000000000..f1d14ac52
--- /dev/null
+++ b/rvv-examples/rvv_strncpy.c
@@ -0,0 +1,25 @@
+#include "common.h"
+#include <riscv_vector.h>
+#include <string.h>
+
+int main() {
+  const int N = 1320;
+  const uint32_t seed = 0xdeadbeef;
+  srand(seed);
+
+  // data gen
+  char s0[N];
+  gen_string(s0, N);
+  char s1[] = "the quick brown fox jumps over the lazy dog";
+  size_t count = strlen(s1) + rand() % 500;
+
+  // compute
+  char golden[N], actual[N];
+  strcpy(golden, s0);
+  strcpy(actual, s0);
+  strncpy(golden, s1, count);
+  strncpy_vec(actual, s1, count);
+
+  // compare
+  puts(compare_string(golden, actual, N) ? "pass" : "fail");
+}
diff --git a/rvv-examples/strcmp.s b/rvv-examples/strcmp.s
new file mode 100644
index 000000000..85d32c96d
--- /dev/null
+++ b/rvv-examples/strcmp.s
@@ -0,0 +1,34 @@
+    .text
+    .balign 4
+    .global strcmp_vec
+    # int strcmp_vec(const char *src1, const char* src2)
+strcmp_vec:
+    ## Using LMUL=2, but same register names work for larger LMULs
+    li t1, 0                        # Initial pointer bump
+loop:
+    vsetvli t0, x0, e8, m2, ta, ma  # Max length vectors of bytes
+    add a0, a0, t1                  # Bump src1 pointer
+    vle8ff.v v8, (a0)               # Get src1 bytes
+    add a1, a1, t1                  # Bump src2 pointer
+    vle8ff.v v16, (a1)              # Get src2 bytes
+
+    vmseq.vi v0, v8, 0              # Flag zero bytes in src1
+    vmsne.vv v1, v8, v16            # Flag if src1 != src2
+    vmor.mm v0, v0, v1              # Combine exit conditions
+
+    vfirst.m a2, v0                 # ==0 or != ?
+    csrr t1, vl                     # Get number of bytes fetched
+
+    bltz a2, loop                   # Loop if all same and no zero byte
+
+    add a0, a0, a2                  # Get src1 element address
+    lbu a3, (a0)                    # Get src1 byte from memory
+
+    add a1, a1, a2                  # Get src2 element address
+    lbu a4, (a1)                    # Get src2 byte from memory
+
+    sub a0, a3, a4                  # Return value.
+
+    ret
+
diff --git a/rvv-examples/strcpy.s b/rvv-examples/strcpy.s
new file mode 100644
index 000000000..292df25ac
--- /dev/null
+++ b/rvv-examples/strcpy.s
@@ -0,0 +1,20 @@
+    .text
+    .balign 4
+    .global strcpy_vec
+    # char* strcpy_vec(char *dst, const char* src)
+strcpy_vec:
+    mv a2, a0                       # Copy dst
+    li t0, -1                       # Infinite AVL
+loop:
+    vsetvli x0, t0, e8, m8, ta, ma  # Max length vectors of bytes
+    vle8ff.v v8, (a1)               # Get src bytes
+    csrr t1, vl                     # Get number of bytes fetched
+    vmseq.vi v1, v8, 0              # Flag zero bytes
+    vfirst.m a3, v1                 # Zero found?
+    add a1, a1, t1                  # Bump pointer
+    vmsif.m v0, v1                  # Set mask up to and including zero byte.
+    vse8.v v8, (a2), v0.t           # Write out bytes
+    add a2, a2, t1                  # Bump pointer
+    bltz a3, loop                   # Zero byte not found, so loop
+
+    ret
diff --git a/rvv-examples/strlen.s b/rvv-examples/strlen.s
new file mode 100644
index 000000000..721c0257e
--- /dev/null
+++ b/rvv-examples/strlen.s
@@ -0,0 +1,22 @@
+    .text
+    .balign 4
+    .global strlen_vec
+# size_t strlen_vec(const char *str)
+# a0 holds *str
+
+strlen_vec:
+    mv a3, a0                       # Save start
+loop:
+    vsetvli a1, x0, e8, m8, ta, ma  # Vector of bytes of maximum length
+    vle8ff.v v8, (a3)               # Load bytes
+    csrr a1, vl                     # Get bytes read
+    vmseq.vi v0, v8, 0              # Set v0[i] where v8[i] = 0
+    vfirst.m a2, v0                 # Find first set bit
+    add a3, a3, a1                  # Bump pointer
+    bltz a2, loop                   # Not found?
+
+    add a0, a0, a1                  # Sum start + bump
+    add a3, a3, a2                  # Add index
+    sub a0, a3, a0                  # Subtract start address+bump
+
+    ret
diff --git a/rvv-examples/strncpy.s b/rvv-examples/strncpy.s
new file mode 100644
index 000000000..f7114c5ca
--- /dev/null
+++ b/rvv-examples/strncpy.s
@@ -0,0 +1,36 @@
+    .text
+    .balign 4
+    .global strncpy_vec
+    # char* strncpy_vec(char *dst, const char* src, size_t n)
+strncpy_vec:
+    mv a3, a0                       # Copy dst
+loop:
+    vsetvli x0, a2, e8, m8, ta, ma  # Vectors of bytes.
+    vle8ff.v v8, (a1)               # Get src bytes
+    vmseq.vi v1, v8, 0              # Flag zero bytes
+    csrr t1, vl                     # Get number of bytes fetched
+    vfirst.m a4, v1                 # Zero found?
+    vmsbf.m v0, v1                  # Set mask up to before zero byte.
+    vse8.v v8, (a3), v0.t           # Write out non-zero bytes
+    bgez a4, zero_tail              # Zero remaining bytes.
+    sub a2, a2, t1                  # Decrement count.
+    add a3, a3, t1                  # Bump dest pointer
+    add a1, a1, t1                  # Bump src pointer
+    bnez a2, loop                   # Anymore?
+
+    ret
+
+zero_tail:
+    sub a2, a2, a4                  # Subtract count on non-zero bytes.
+    add a3, a3, a4                  # Advance past non-zero bytes.
+    vsetvli t1, a2, e8, m8, ta, ma  # Vectors of bytes.
+    vmv.v.i v0, 0                   # Splat zero.
+
+zero_loop:
+    vse8.v v0, (a3)                 # Store zero.
+    sub a2, a2, t1                  # Decrement count.
+    add a3, a3, t1                  # Bump pointer
+    vsetvli t1, a2, e8, m8, ta, ma  # Vectors of bytes.
+    bnez a2, zero_loop              # Anymore?
+
+    ret

Thanks,
Fei.

> The whole idea is splitting the vector instructions into scalar
> instructions which have already been well supported on Petr's branch,
> the correctness of binary translation (tool=none) is simple to ensure,
> but the logic of tool=memcheck should not be broken, one of the keys is
> to deal with the instructions with mask:
>
> * for load/store with mask, LoadG/StoreG are enabled, the same semantics
> as other architectures
>
> * for other instructions such as vadd, if the vector mask agnostic (vma)
> is set to undisturbed, the masked original value is read first then
> write back, the V bit won't change even after write back, it's not
> necessary to have another guard type like LoadG/StoreG.
>
> Pros
> ----
> * by leveraging the existing scalar instructions support on Valgrind,
> usually adding a new instruction involves only the frontend in
> guest_riscv64_toIR, other parts are rare touched, so effort is much
> reduced to enable new instructions.
>
> * As the backend only sees the scalar IRs and generates scalar
> instructions, it's possible to run valgrind ./vec-test on non-RVV host.
>
> Cons
> ----
> * as this method splits RVV instruction at frontend, there is less
> chance to optimize at other stages, e.g. the vbits tracking.
>
> * with larger vlen such as 1K, at most 1 RVV instruction will split into
> 1K ops, besides the performance penalty, it causes pressure to other
> components such as tmp space too. Some of this can be relieved by
> grouping multiple elements together.
>
> There are some alternatives, but none seems perfect:
> * helper function. It's much easier to make tool=none work, but how good
> is it to handle the V+A tracking and other tools? Generally speaking, it
> should not be a general solution for too many instructions.
>
> * define and pass the RVV IR to backend, instead of splitting it too
> early. This introduces much effort, we should evaluate what level of
> profit can be attained.
>
> At last, if the performance is tolerable, is this the right way to go?
>
> Fei Wu (12):
>   riscv64: Starting Vector support, registers added
>   riscv64: Pass riscv guest_state for translation
>   riscv64: Add SyncupEnv & TooManyIR jump kinds
>   riscv64: Add LoadG/StoreG support
>   riscv64: Shift guest_state -2048 on calling helper
>   riscv64: Add cpu_state to TB
>   riscv64: Introduce dis_RV64V and add vsetvl
>   riscv64: Add load/store
>   riscv64: Add csrr vl
>   riscv64: add vfirst
>   riscv64: Add vmsgtu/vmseq/vmsne/vmsbf/vmsif/vmor/vmv/vid
>   riscv64: Add vadd
>
>  VEX/priv/guest_riscv64_toIR.c     | 974 +++++++++++++++++++++++++++++-
>  VEX/priv/host_riscv64_defs.c      | 133 ++++
>  VEX/priv/host_riscv64_defs.h      |  23 +
>  VEX/priv/host_riscv64_isel.c      |  89 ++-
>  VEX/priv/ir_defs.c                |   8 +
>  VEX/priv/ir_opt.c                 |   4 +-
>  VEX/pub/libvex.h                  |   4 +
>  VEX/pub/libvex_guest_riscv64.h    |  47 +-
>  VEX/pub/libvex_ir.h               |   9 +-
>  coregrind/m_scheduler/scheduler.c |  17 +-
>  coregrind/m_translate.c           |   5 +
>  coregrind/m_transtab.c            |  26 +-
>  coregrind/pub_core_transtab.h     |   5 +
>  memcheck/mc_machine.c             |  35 ++
>  memcheck/mc_translate.c           |   4 +
>  15 files changed, 1368 insertions(+), 15 deletions(-)
From: Fei Wu <fe...@in...> - 2023-05-26 13:57:59
Signed-off-by: Fei Wu <fe...@in...>
---
VEX/priv/guest_riscv64_toIR.c | 51 +++++++++++++++++++++++++++++++++++
1 file changed, 51 insertions(+)
diff --git a/VEX/priv/guest_riscv64_toIR.c b/VEX/priv/guest_riscv64_toIR.c
index aaa906f1b..13be0d01d 100644
--- a/VEX/priv/guest_riscv64_toIR.c
+++ b/VEX/priv/guest_riscv64_toIR.c
@@ -3569,6 +3569,17 @@ static IRExpr* widen_Sto64(IRExpr* e, IRType ty)
}
}
+static IRExpr* narrow_64to(IRExpr* e, UInt bits)
+{
+ switch (bits) {
+ case 8: return unop(Iop_64to8, e);
+ case 16: return unop(Iop_64to16, e);
+ case 32: return unop(Iop_64to32, e);
+ case 64: return e;
+ default: vassert(0);
+ }
+}
+
static Bool dis_vmsgtu_vx(/*MB_OUT*/ DisResult* dres,
/*OUT*/ IRSB* irsb,
UInt insn,
@@ -3686,6 +3697,44 @@ static Bool dis_vmseq_vi(/*MB_OUT*/ DisResult* dres,
return True;
}
+static Bool dis_vadd_vv(/*MB_OUT*/ DisResult* dres,
+ /*OUT*/ IRSB* irsb,
+ UInt insn,
+ Addr guest_pc_curr_instr,
+ VexGuestRISCV64State* guest)
+{
+ UInt vm = INSN(25, 25);
+ UInt vs2 = INSN(24, 20);
+ UInt vs1 = INSN(19, 15);
+ UInt vd = INSN(11, 7);
+
+ UInt sew = get_sew(guest);
+ UInt sew_b = sew / 8;
+ IRType ty = integerIRTypeOfSize(sew_b);
+
+ for (UInt i = 0; i < guest->guest_vl; ++i) {
+ UInt offset = i * sew_b;
+ IRExpr* res = narrow_64to(
+ binop(Iop_Add64,
+ widen_Sto64(getVReg(vs2, offset, ty), ty),
+ widen_Sto64(getVReg(vs1, offset, ty), ty)),
+ sew);
+ if (vm == 0) {
+ UInt mask_outer_offset = i / 64 * 8;
+ UInt mask_inner_offset = i % 64;
+ IRExpr* guard = binop(Iop_CmpNE64,
+ mkU64(0),
+ binop(Iop_And64,
+ getVReg(0 /* v0 */, mask_outer_offset, Ity_I64),
+ mkU64(1UL << mask_inner_offset)));
+ res = IRExpr_ITE(guard, res, getVReg(vd, offset, ty));
+ }
+ putVReg(irsb, vd, offset, res);
+ }
+
+ return True;
+}
+
static Bool dis_vmsne_vv(/*MB_OUT*/ DisResult* dres,
/*OUT*/ IRSB* irsb,
UInt insn,
@@ -3977,6 +4026,8 @@ static Bool dis_opivv(/*MB_OUT*/ DisResult* dres,
UInt funct6 = INSN(31, 26);
switch (funct6) {
+ case 0b000000:
+ return dis_vadd_vv(dres, irsb, insn, guest_pc_curr_instr, guest);
case 0b011001:
return dis_vmsne_vv(dres, irsb, insn, guest_pc_curr_instr, guest);
default:
--
2.25.1
From: Fei Wu <fe...@in...> - 2023-05-26 13:57:58
Signed-off-by: Fei Wu <fe...@in...>
---
VEX/priv/guest_riscv64_toIR.c | 454 ++++++++++++++++++++++++++++++++++
VEX/priv/host_riscv64_isel.c | 20 +-
2 files changed, 471 insertions(+), 3 deletions(-)
diff --git a/VEX/priv/guest_riscv64_toIR.c b/VEX/priv/guest_riscv64_toIR.c
index 3ef0aeb77..aaa906f1b 100644
--- a/VEX/priv/guest_riscv64_toIR.c
+++ b/VEX/priv/guest_riscv64_toIR.c
@@ -3534,6 +3534,320 @@ static Bool dis_vsetvl(/*MB_OUT*/ DisResult* dres,
return True;
}
+// return sew in bits
+static UInt get_sew(VexGuestRISCV64State* guest)
+{
+ UInt raw_sew = SLICE_UInt(guest->guest_vtype, 5, 3);
+ switch (raw_sew) {
+ case 0b000: return 8;
+ case 0b001: return 16;
+ case 0b010: return 32;
+ case 0b011: return 64;
+ default: vassert(0);
+ }
+}
+
+static IRExpr* widen_Uto64(IRExpr* e, IRType ty)
+{
+ switch (ty) {
+ case Ity_I8: return unop(Iop_8Uto64, e);
+ case Ity_I16: return unop(Iop_16Uto64, e);
+ case Ity_I32: return unop(Iop_32Uto64, e);
+ case Ity_I64: return e;
+ default: vassert(0);
+ }
+}
+
+static IRExpr* widen_Sto64(IRExpr* e, IRType ty)
+{
+ switch (ty) {
+ case Ity_I8: return unop(Iop_8Sto64, e);
+ case Ity_I16: return unop(Iop_16Sto64, e);
+ case Ity_I32: return unop(Iop_32Sto64, e);
+ case Ity_I64: return e;
+ default: vassert(0);
+ }
+}
+
+static Bool dis_vmsgtu_vx(/*MB_OUT*/ DisResult* dres,
+ /*OUT*/ IRSB* irsb,
+ UInt insn,
+ Addr guest_pc_curr_instr,
+ VexGuestRISCV64State* guest)
+{
+ UInt vm = INSN(25, 25);
+ UInt vs2 = INSN(24, 20);
+ UInt rs1 = INSN(19, 15);
+ UInt vd = INSN(11, 7);
+
+ UInt vma = SLICE_UInt(guest->guest_vtype, 7, 7);
+ UInt sew_b = get_sew(guest) / 8;
+ IRType ty = integerIRTypeOfSize(sew_b);
+
+ UInt offset = 0;
+ for (UInt o = 0; o < guest->guest_vl; o += 64) {
+ // generate res w/o mask
+ UInt remain = guest->guest_vl - o;
+ UInt step = (remain > 64) ? 64 : remain;
+ IRExpr* res = mkU64(0);
+ for (UInt i = 0; i < step; ++i) {
+ IRExpr* bit = binop(Iop_Shl64,
+ unop(Iop_1Uto64,
+ binop(Iop_CmpLT64U,
+ getIReg64(rs1),
+ widen_Uto64(getVReg(vs2, offset, ty), ty))),
+ mkU8(i));
+
+ res = binop(Iop_Or64, res, bit);
+ offset += sew_b;
+ }
+
+ // modify res according to mask
+ UInt mask_offset = o / 8;
+ if (vm == 0) {
+ IRExpr* v0_step = getVReg(0, mask_offset, Ity_I64);
+
+ IRExpr* inactive;
+ if (vma == 0) { // undisturbed, read it first
+ IRExpr* vd_step = getVReg(vd, mask_offset, Ity_I64);
+ inactive = binop(Iop_And64, unop(Iop_Not64, v0_step), vd_step);
+ } else { // agnostic, set to 1
+ inactive = binop(Iop_And64, unop(Iop_Not64, v0_step), mkU64(-1UL));
+ }
+ IRExpr* active = binop(Iop_And64, v0_step, res);
+ res = binop(Iop_Or64, active, inactive);
+ }
+ putVReg(irsb, vd, mask_offset, res);
+ }
+
+ putPC(irsb, mkU64(guest_pc_curr_instr + 4));
+ dres->whatNext = Dis_StopHere;
+ dres->jk_StopHere = Ijk_TooManyIR;
+
+ return True;
+}
+
+static Bool dis_vmseq_vi(/*MB_OUT*/ DisResult* dres,
+ /*OUT*/ IRSB* irsb,
+ UInt insn,
+ Addr guest_pc_curr_instr,
+ VexGuestRISCV64State* guest)
+{
+ UInt vm = INSN(25, 25);
+ UInt vs2 = INSN(24, 20);
+ ULong imm = sext_slice_ulong(insn, 19, 15);
+ UInt vd = INSN(11, 7);
+
+ UInt vma = SLICE_UInt(guest->guest_vtype, 7, 7);
+ UInt sew_b = get_sew(guest) / 8;
+ IRType ty = integerIRTypeOfSize(sew_b);
+
+ UInt offset = 0;
+ for (UInt o = 0; o < guest->guest_vl; o += 64) {
+ // generate res w/o mask
+ UInt remain = guest->guest_vl - o;
+ UInt step = (remain > 64) ? 64 : remain;
+ IRExpr* res = mkU64(0);
+ for (UInt i = 0; i < step; ++i) {
+ IRExpr* bit = binop(Iop_Shl64,
+ unop(Iop_1Uto64,
+ binop(Iop_CmpEQ64,
+ mkU64(imm),
+ widen_Sto64(getVReg(vs2, offset, ty), ty))),
+ mkU8(i));
+
+ res = binop(Iop_Or64, res, bit);
+ offset += sew_b;
+ }
+
+ // modify res according to mask
+ UInt mask_offset = o / 8;
+ if (vm == 0) {
+ IRExpr* v0_step = getVReg(0, mask_offset, Ity_I64);
+
+ IRExpr* inactive;
+ if (vma == 0) { // undisturbed, read it first
+ IRExpr* vd_step = getVReg(vd, mask_offset, Ity_I64);
+ inactive = binop(Iop_And64, unop(Iop_Not64, v0_step), vd_step);
+ } else { // agnostic, set to 1
+ inactive = binop(Iop_And64, unop(Iop_Not64, v0_step), mkU64(-1UL));
+ }
+ IRExpr* active = binop(Iop_And64, v0_step, res);
+ res = binop(Iop_Or64, active, inactive);
+ }
+
+ putVReg(irsb, vd, mask_offset, res);
+ }
+
+ putPC(irsb, mkU64(guest_pc_curr_instr + 4));
+ dres->whatNext = Dis_StopHere;
+ dres->jk_StopHere = Ijk_TooManyIR;
+
+ return True;
+}
+
+static Bool dis_vmsne_vv(/*MB_OUT*/ DisResult* dres,
+ /*OUT*/ IRSB* irsb,
+ UInt insn,
+ Addr guest_pc_curr_instr,
+ VexGuestRISCV64State* guest)
+{
+ UInt vm = INSN(25, 25);
+ UInt vs2 = INSN(24, 20);
+ UInt vs1 = INSN(19, 15);
+ UInt vd = INSN(11, 7);
+
+ UInt vma = SLICE_UInt(guest->guest_vtype, 7, 7);
+ UInt sew_b = get_sew(guest) / 8;
+ IRType ty = integerIRTypeOfSize(sew_b);
+
+ UInt offset = 0;
+ for (UInt o = 0; o < guest->guest_vl; o += 64) {
+ // generate res w/o mask
+ UInt remain = guest->guest_vl - o;
+ UInt step = (remain > 64) ? 64 : remain;
+ IRExpr* res = mkU64(0);
+ for (UInt i = 0; i < step; ++i) {
+ IRExpr* bit = binop(Iop_Shl64,
+ unop(Iop_1Uto64,
+ binop(Iop_CmpNE64,
+ widen_Sto64(getVReg(vs1, offset, ty), ty),
+ widen_Sto64(getVReg(vs2, offset, ty), ty))),
+ mkU8(i));
+
+ res = binop(Iop_Or64, res, bit);
+ offset += sew_b;
+ }
+
+ // modify res according to mask
+ UInt mask_offset = o / 8;
+ if (vm == 0) {
+ IRExpr* v0_step = getVReg(0, mask_offset, Ity_I64);
+
+ IRExpr* inactive;
+ if (vma == 0) { // undisturbed, read it first
+ IRExpr* vd_step = getVReg(vd, mask_offset, Ity_I64);
+ inactive = binop(Iop_And64, unop(Iop_Not64, v0_step), vd_step);
+ } else { // agnostic, set to 1
+ inactive = binop(Iop_And64, unop(Iop_Not64, v0_step), mkU64(-1UL));
+ }
+ IRExpr* active = binop(Iop_And64, v0_step, res);
+ res = binop(Iop_Or64, active, inactive);
+ }
+
+ putVReg(irsb, vd, mask_offset, res);
+ }
+
+ putPC(irsb, mkU64(guest_pc_curr_instr + 4));
+ dres->whatNext = Dis_StopHere;
+ dres->jk_StopHere = Ijk_TooManyIR;
+
+ return True;
+}
+
+static Bool dis_vmsbf_m(/*MB_OUT*/ DisResult* dres,
+ /*OUT*/ IRSB* irsb,
+ UInt insn,
+ Addr guest_pc_curr_instr,
+ VexGuestRISCV64State* guest)
+{
+ UInt vm = INSN(25, 25);
+ UInt vs2 = INSN(24, 20);
+ UInt vd = INSN(11, 7);
+
+ vassert(vm == 1); // mask not supported yet
+
+ IRExpr* not_found = mkU64(-1UL);
+ IRExpr* prev = not_found;
+ for (UInt i = 0; i < guest->guest_vl; i += 64) {
+ UInt mask_offset = i / 8;
+ // x = n - (n & n - 1) with only the rightmost set bit
+ // y = (x - 1) for vmsbf
+ IRExpr* n = getVReg(vs2, mask_offset, Ity_I64);
+ IRExpr* x = binop(Iop_Sub64,
+ n,
+ binop(Iop_And64,
+ n,
+ binop(Iop_Sub64,
+ n,
+ mkU64(1))));
+ IRExpr* y = binop(Iop_Sub64,
+ x,
+ mkU64(1));
+
+ IRExpr* cond = binop(Iop_CmpEQ64, prev, not_found);
+ IRExpr* res = IRExpr_ITE(cond, y, mkU64(0));
+
+ putVReg(irsb, vd, mask_offset, res);
+ prev = res;
+ }
+
+ return True;
+}
+
+static Bool dis_vmsif_m(/*MB_OUT*/ DisResult* dres,
+ /*OUT*/ IRSB* irsb,
+ UInt insn,
+ Addr guest_pc_curr_instr,
+ VexGuestRISCV64State* guest)
+{
+ UInt vm = INSN(25, 25);
+ UInt vs2 = INSN(24, 20);
+ UInt vd = INSN(11, 7);
+
+ vassert(vm == 1); // mask not supported yet
+
+ IRExpr* not_found = mkU64(-1UL);
+ IRExpr* prev = not_found;
+ for (UInt i = 0; i < guest->guest_vl; i += 64) {
+ UInt mask_offset = i / 8;
+ // x = n - (n & n - 1) with only the rightmost set bit
+ // y = x + (x - 1) for vmsif
+ IRExpr* n = getVReg(vs2, mask_offset, Ity_I64);
+ IRExpr* x = binop(Iop_Sub64,
+ n,
+ binop(Iop_And64,
+ n,
+ binop(Iop_Sub64,
+ n,
+ mkU64(1))));
+ IRExpr* y = binop(Iop_Add64,
+ x,
+ binop(Iop_Sub64,
+ x,
+ mkU64(1)));
+
+ IRExpr* cond = binop(Iop_CmpEQ64, prev, not_found);
+ IRExpr* res = IRExpr_ITE(cond, y, mkU64(0));
+
+ putVReg(irsb, vd, mask_offset, res);
+ prev = res;
+ }
+
+ return True;
+}
+
+static Bool dis_vmor_mm(/*MB_OUT*/ DisResult* dres,
+ /*OUT*/ IRSB* irsb,
+ UInt insn,
+ Addr guest_pc_curr_instr,
+ VexGuestRISCV64State* guest)
+{
+ UInt vs2 = INSN(24, 20);
+ UInt vs1 = INSN(19, 15);
+ UInt vd = INSN(11, 7);
+
+ for (UInt i = 0; i < guest->guest_vl; i += 64) {
+ UInt mask_offset = i / 8;
+ IRExpr* mask = binop(Iop_Or64,
+ getVReg(vs1, mask_offset, Ity_I64),
+ getVReg(vs2, mask_offset, Ity_I64));
+ putVReg(irsb, vd, mask_offset, mask);
+ }
+
+ return True;
+}
+
static ULong riscv_vfirst(VexGuestRISCV64State* guest, UInt vs2, UInt vm)
{
ULong index = -1UL;
@@ -3600,6 +3914,77 @@ static Bool dis_vfirst_m(/*MB_OUT*/ DisResult* dres,
return True;
}
+static Bool dis_vid_v(/*MB_OUT*/ DisResult* dres,
+ /*OUT*/ IRSB* irsb,
+ UInt insn,
+ Addr guest_pc_curr_instr,
+ VexGuestRISCV64State* guest)
+{
+ UInt vs2 = INSN(24, 20);
+ UInt vd = INSN(11, 7);
+
+ vassert(vs2 == 0);
+
+ UInt sew_b = get_sew(guest) / 8;
+ IRType ty = integerIRTypeOfSize(sew_b);
+
+ UInt offset = 0;
+ for (UInt i = 0; i < guest->guest_vl; ++i) {
+ putVReg(irsb, vd, offset, mkU(ty, i));
+ offset += sew_b;
+ }
+
+ return True;
+}
+
+static Bool dis_vmv_vi(/*MB_OUT*/ DisResult* dres,
+ /*OUT*/ IRSB* irsb,
+ UInt insn,
+ Addr guest_pc_curr_instr,
+ VexGuestRISCV64State* guest)
+{
+ UInt vs2 = INSN(24, 20);
+ ULong imm = sext_slice_ulong(insn, 19, 15);
+ UInt vd = INSN(11, 7);
+
+ vassert(vs2 == 0);
+
+ UInt sew_b = get_sew(guest) / 8;
+ IRExpr* e_imm;
+ switch (sew_b) {
+ case 1: e_imm = mkU8((UChar)imm); break;
+ case 2: e_imm = mkU16((UShort)imm); break;
+ case 4: e_imm = mkU32((UInt)imm); break;
+ case 8: e_imm = mkU64(imm); break;
+ default: vassert(0);
+ }
+
+ UInt offset = 0;
+ for (UInt i = 0; i < guest->guest_vl; ++i) {
+ putVReg(irsb, vd, offset, e_imm);
+ offset += sew_b;
+ }
+
+ return True;
+}
+
+static Bool dis_opivv(/*MB_OUT*/ DisResult* dres,
+ /*OUT*/ IRSB* irsb,
+ UInt insn,
+ Addr guest_pc_curr_instr,
+ VexGuestRISCV64State* guest)
+{
+ UInt funct6 = INSN(31, 26);
+
+ switch (funct6) {
+ case 0b011001:
+ return dis_vmsne_vv(dres, irsb, insn, guest_pc_curr_instr, guest);
+ default:
+ return False;
+ }
+ return False;
+}
+
static Bool dis_opmvv(/*MB_OUT*/ DisResult* dres,
/*OUT*/ IRSB* irsb,
UInt insn,
@@ -3609,6 +3994,8 @@ static Bool dis_opmvv(/*MB_OUT*/ DisResult* dres,
UInt funct6 = INSN(31, 26);
switch (funct6) {
+ case 0b011010:
+ return dis_vmor_mm(dres, irsb, insn, guest_pc_curr_instr, guest);
case 0b010000:
switch (INSN(19, 15)) {
case 0b10001:
@@ -3617,12 +4004,71 @@ static Bool dis_opmvv(/*MB_OUT*/ DisResult* dres,
return False;
}
return False;
+ case 0b010100:
+ switch (INSN(19, 15)) {
+ case 0b00001:
+ return dis_vmsbf_m(dres, irsb, insn, guest_pc_curr_instr, guest);
+ case 0b00011:
+ return dis_vmsif_m(dres, irsb, insn, guest_pc_curr_instr, guest);
+ case 0b10001:
+ return dis_vid_v(dres, irsb, insn, guest_pc_curr_instr, guest);
+ default:
+ return False;
+ }
+ return False;
default:
return False;
}
return False;
}
+static Bool dis_opivi(/*MB_OUT*/ DisResult* dres,
+ /*OUT*/ IRSB* irsb,
+ UInt insn,
+ Addr guest_pc_curr_instr,
+ VexGuestRISCV64State* guest)
+{
+ UInt funct6 = INSN(31, 26);
+ UInt vm = INSN(25, 25);
+
+ switch (funct6) {
+ case 0b011000:
+ return dis_vmseq_vi(dres, irsb, insn, guest_pc_curr_instr, guest);
+ case 0b010111:
+ if (vm == 1) {
+ return dis_vmv_vi(dres, irsb, insn, guest_pc_curr_instr, guest);
+ }
+ return False;
+ default:
+ return False;
+ }
+}
+
+static Bool dis_opivx(/*MB_OUT*/ DisResult* dres,
+ /*OUT*/ IRSB* irsb,
+ UInt insn,
+ Addr guest_pc_curr_instr,
+ VexGuestRISCV64State* guest)
+{
+ UInt funct6 = INSN(31, 26);
+
+ switch (funct6) {
+ case 0b011110:
+ return dis_vmsgtu_vx(dres, irsb, insn, guest_pc_curr_instr, guest);
+ default:
+ return False;
+ }
+}
+
+static Bool dis_opmvx(/*MB_OUT*/ DisResult* dres,
+ /*OUT*/ IRSB* irsb,
+ UInt insn,
+ Addr guest_pc_curr_instr,
+ VexGuestRISCV64State* guest)
+{
+ return False;
+}
+
static UInt decode_eew(UInt raw_eew)
{
switch (raw_eew) {
@@ -3728,8 +4174,16 @@ static Bool dis_RV64V(/*MB_OUT*/ DisResult* dres,
switch (INSN(6, 0)) {
case 0b1010111:
switch (INSN(14, 12)) {
+ case 0b000: // OPIVV
+ return dis_opivv(dres, irsb, insn, guest_pc_curr_instr, guest);
case 0b010: // OPMVV
return dis_opmvv(dres, irsb, insn, guest_pc_curr_instr, guest);
+ case 0b011: // OPIVI
+ return dis_opivi(dres, irsb, insn, guest_pc_curr_instr, guest);
+ case 0b100: // OPIVX
+ return dis_opivx(dres, irsb, insn, guest_pc_curr_instr, guest);
+ case 0b110: // OPMVX
+ return dis_opmvx(dres, irsb, insn, guest_pc_curr_instr, guest);
case 0b111: // vsetvl
return dis_vsetvl(dres, irsb, insn, guest_pc_curr_instr);
default:
diff --git a/VEX/priv/host_riscv64_isel.c b/VEX/priv/host_riscv64_isel.c
index 127200d8e..06e08ca8d 100644
--- a/VEX/priv/host_riscv64_isel.c
+++ b/VEX/priv/host_riscv64_isel.c
@@ -634,6 +634,8 @@ static HReg iselIntExpr_R_wrk(ISelEnv* env, IRExpr* e)
case Iop_Xor32:
case Iop_Or64:
case Iop_Or32:
+ case Iop_Or16:
+ case Iop_Or8:
case Iop_Or1:
case Iop_And64:
case Iop_And32:
@@ -670,6 +672,8 @@ static HReg iselIntExpr_R_wrk(ISelEnv* env, IRExpr* e)
break;
case Iop_Or64:
case Iop_Or32:
+ case Iop_Or16:
+ case Iop_Or8:
case Iop_Or1:
op = RISCV64op_OR;
break;
@@ -982,11 +986,12 @@ static HReg iselIntExpr_R_wrk(ISelEnv* env, IRExpr* e)
addInstr(env, RISCV64Instr_ALUImm(RISCV64op_SLTIU, dst, src, 1));
return dst;
}
+ case Iop_1Uto32:
case Iop_8Uto32:
case Iop_8Uto64:
case Iop_16Uto64:
case Iop_32Uto64: {
- UInt shift =
+ UInt shift = (e->Iex.Unop.op == Iop_1Uto32) ? 63 :
64 - 8 * sizeofIRType(typeOfIRExpr(env->type_env, e->Iex.Unop.arg));
HReg tmp = newVRegI(env);
HReg src = iselIntExpr_R(env, e->Iex.Unop.arg);
@@ -995,6 +1000,8 @@ static HReg iselIntExpr_R_wrk(ISelEnv* env, IRExpr* e)
addInstr(env, RISCV64Instr_ALUImm(RISCV64op_SRLI, dst, tmp, shift));
return dst;
}
+ case Iop_1Sto8:
+ case Iop_1Sto16:
case Iop_1Sto32:
case Iop_1Sto64: {
HReg tmp = newVRegI(env);
@@ -1010,12 +1017,14 @@ static HReg iselIntExpr_R_wrk(ISelEnv* env, IRExpr* e)
case Iop_32Sto64:
/* These are no-ops. */
return iselIntExpr_R(env, e->Iex.Unop.arg);
+ case Iop_32to1:
case Iop_32to8:
case Iop_32to16:
case Iop_64to8:
case Iop_64to16:
case Iop_64to32: {
- UInt shift = 64 - 8 * sizeofIRType(ty);
+ UInt shift = (e->Iex.Unop.op == Iop_32to1) ? 63 :
+ 64 - 8 * sizeofIRType(ty);
HReg tmp = newVRegI(env);
HReg src = iselIntExpr_R(env, e->Iex.Unop.arg);
addInstr(env, RISCV64Instr_ALUImm(RISCV64op_SLLI, tmp, src, shift));
@@ -1047,6 +1056,7 @@ static HReg iselIntExpr_R_wrk(ISelEnv* env, IRExpr* e)
return dst;
}
case Iop_CmpNEZ8:
+ case Iop_CmpNEZ16:
case Iop_CmpNEZ32:
case Iop_CmpNEZ64: {
HReg dst = newVRegI(env);
@@ -1166,6 +1176,10 @@ static HReg iselIntExpr_R_wrk(ISelEnv* env, IRExpr* e)
vassert(ty == Ity_I8);
u = vex_sx_to_64(e->Iex.Const.con->Ico.U8, 8);
break;
+ case Ico_U1:
+ vassert(ty == Ity_I1);
+ u = vex_sx_to_64(e->Iex.Const.con->Ico.U1, 1);
+ break;
default:
goto irreducible;
}
@@ -1176,7 +1190,7 @@ static HReg iselIntExpr_R_wrk(ISelEnv* env, IRExpr* e)
/* ---------------------- MULTIPLEX ---------------------- */
case Iex_ITE: {
/* ITE(ccexpr, iftrue, iffalse) */
- if (ty == Ity_I64 || ty == Ity_I32) {
+ if (ty == Ity_I64 || ty == Ity_I32 || ty == Ity_I16 || ty == Ity_I8) {
HReg dst = newVRegI(env);
HReg iftrue = iselIntExpr_R(env, e->Iex.ITE.iftrue);
HReg iffalse = iselIntExpr_R(env, e->Iex.ITE.iffalse);
--
2.25.1
From: Fei Wu <fe...@in...> - 2023-05-26 13:57:55
Signed-off-by: Fei Wu <fe...@in...>
---
VEX/priv/guest_riscv64_toIR.c | 91 +++++++++++++++++++++++++++++++++++
1 file changed, 91 insertions(+)
diff --git a/VEX/priv/guest_riscv64_toIR.c b/VEX/priv/guest_riscv64_toIR.c
index ccad384d4..3ef0aeb77 100644
--- a/VEX/priv/guest_riscv64_toIR.c
+++ b/VEX/priv/guest_riscv64_toIR.c
@@ -3534,6 +3534,95 @@ static Bool dis_vsetvl(/*MB_OUT*/ DisResult* dres,
return True;
}
+static ULong riscv_vfirst(VexGuestRISCV64State* guest, UInt vs2, UInt vm)
+{
+ ULong index = -1UL;
+ ULong* p0 = (ULong *)((char *)guest + OFFB_V0);
+ ULong* p = (ULong *)((char *)guest + OFFB_V0 + vs2 * sizeof(guest->guest_v0));
+
+ for (UInt o = 0; o < guest->guest_vl && index == -1UL; o += 64) {
+ UInt remain = guest->guest_vl - o;
+ UInt step = (remain > 64) ? 64 : remain;
+
+ ULong v = *p++;
+ ULong v0 = (vm == 1) ? -1UL : *p0++;
+ v &= v0;
+ for (ULong i = 0; i < step; ++i) {
+ if (v & (1UL << i)) {
+ index = i + o;
+ break;
+ }
+ }
+ }
+
+ return index;
+}
+
+// From Hacker's Delight
+static UInt round_down_to_pow2(UInt x)
+{
+ x = x | (x >> 1);
+ x = x | (x >> 2);
+ x = x | (x >> 4);
+ x = x | (x >> 8);
+ x = x | (x >> 16);
+ return x - (x >> 1);
+}
+
+static Bool dis_vfirst_m(/*MB_OUT*/ DisResult* dres,
+ /*OUT*/ IRSB* irsb,
+ UInt insn,
+ Addr guest_pc_curr_instr,
+ VexGuestRISCV64State* guest)
+{
+ UInt vm = INSN(25, 25);
+ UInt vs2 = INSN(24, 20);
+ UInt rd = INSN(11, 7);
+
+ // The backend lacks a ctz (count trailing zeros) style instruction, so use
+ // a helper function instead
+ IRTemp index = newTemp(irsb, Ity_I64);
+ IRDirty *d = unsafeIRDirty_1_N(index,
+ 0,
+ "riscv_vfirst",
+ &riscv_vfirst,
+ mkIRExprVec_3(IRExpr_GSPTR(), mkU32(vs2), mkU32(vm)));
+ d->nFxState = 1;
+ vex_bzero(&d->fxState, sizeof(d->fxState));
+ d->fxState[0].fx = Ifx_Read;
+ d->fxState[0].offset = offsetVReg(vs2);
+ // do_shadow_Dirty doesn't accept non-power-of-2 sizes yet
+ d->fxState[0].size = round_down_to_pow2(guest->guest_vl / 8);
+
+ stmt(irsb, IRStmt_Dirty(d));
+ putIReg64(irsb, rd, mkexpr(index));
+
+ return True;
+}
+
+static Bool dis_opmvv(/*MB_OUT*/ DisResult* dres,
+ /*OUT*/ IRSB* irsb,
+ UInt insn,
+ Addr guest_pc_curr_instr,
+ VexGuestRISCV64State* guest)
+{
+ UInt funct6 = INSN(31, 26);
+
+ switch (funct6) {
+ case 0b010000:
+ switch (INSN(19, 15)) {
+ case 0b10001:
+ return dis_vfirst_m(dres, irsb, insn, guest_pc_curr_instr, guest);
+ default:
+ return False;
+ }
+ return False;
+ default:
+ return False;
+ }
+ return False;
+}
+
static UInt decode_eew(UInt raw_eew)
{
switch (raw_eew) {
@@ -3639,6 +3728,8 @@ static Bool dis_RV64V(/*MB_OUT*/ DisResult* dres,
switch (INSN(6, 0)) {
case 0b1010111:
switch (INSN(14, 12)) {
+ case 0b010: // OPMVV
+ return dis_opmvv(dres, irsb, insn, guest_pc_curr_instr, guest);
case 0b111: // vsetvl
return dis_vsetvl(dres, irsb, insn, guest_pc_curr_instr);
default:
--
2.25.1
From: Fei Wu <fe...@in...> - 2023-05-26 13:57:53
Signed-off-by: Fei Wu <fe...@in...>
---
VEX/priv/guest_riscv64_toIR.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/VEX/priv/guest_riscv64_toIR.c b/VEX/priv/guest_riscv64_toIR.c
index 30644e171..ccad384d4 100644
--- a/VEX/priv/guest_riscv64_toIR.c
+++ b/VEX/priv/guest_riscv64_toIR.c
@@ -826,6 +826,8 @@ static const HChar* nameCSR(UInt csr)
return "frm";
case 0x003:
return "fcsr";
+ case 0xc20:
+ return "vl";
default:
vpanic("nameCSR(riscv64)");
}
@@ -3376,7 +3378,7 @@ static Bool dis_RV64Zicsr(/*MB_OUT*/ DisResult* dres,
UInt rd = INSN(11, 7);
UInt rs1 = INSN(19, 15);
UInt csr = INSN(31, 20);
- if (csr != 0x001 && csr != 0x002 && csr != 0x003) {
+ if (csr != 0x001 && csr != 0x002 && csr != 0x003 && csr != 0xc20) {
/* Invalid CSRRS, fall through. */
} else {
switch (csr) {
@@ -3419,6 +3421,15 @@ static Bool dis_RV64Zicsr(/*MB_OUT*/ DisResult* dres,
binop(Iop_And32, getIReg32(rs1), mkU32(0xff))));
break;
}
+ case 0xc20: {
+ /* vl */
+ IRTemp vl = newTemp(irsb, Ity_I64);
+ assign(irsb, vl, IRExpr_Get(OFFB_VL, Ity_I64));
+ if (rd != 0)
+ putIReg64(irsb, rd, mkexpr(vl));
+ vassert(rs1 == 0);
+ break;
+ }
default:
vassert(0);
}
--
2.25.1
From: Fei Wu <fe...@in...> - 2023-05-26 13:57:51
Signed-off-by: Fei Wu <fe...@in...>
---
VEX/priv/guest_riscv64_toIR.c | 110 ++++++++++++++++++++++++++++++++++
1 file changed, 110 insertions(+)
diff --git a/VEX/priv/guest_riscv64_toIR.c b/VEX/priv/guest_riscv64_toIR.c
index 6407692f9..30644e171 100644
--- a/VEX/priv/guest_riscv64_toIR.c
+++ b/VEX/priv/guest_riscv64_toIR.c
@@ -123,6 +123,8 @@ static IRExpr* mkU64(ULong i) { return IRExpr_Const(IRConst_U64(i)); }
/* Create an expression to produce a 32-bit constant. */
static IRExpr* mkU32(UInt i) { return IRExpr_Const(IRConst_U32(i)); }
+static IRExpr* mkU16(UInt i) { return IRExpr_Const(IRConst_U16((UShort)i)); }
+
/* Create an expression to produce an 8-bit constant. */
static IRExpr* mkU8(UInt i)
{
@@ -130,6 +132,17 @@ static IRExpr* mkU8(UInt i)
return IRExpr_Const(IRConst_U8((UChar)i));
}
+static IRExpr* mkU(IRType ty, ULong i)
+{
+ switch (ty) {
+ case Ity_I8: return mkU8((UChar)i);
+ case Ity_I16: return mkU16((UShort)i);
+ case Ity_I32: return mkU32((UInt)i);
+ case Ity_I64: return mkU64(i);
+ default: vassert(0);
+ }
+}
+
/* Create an expression to read a temporary. */
static IRExpr* mkexpr(IRTemp tmp) { return IRExpr_RdTmp(tmp); }
@@ -3510,6 +3523,98 @@ static Bool dis_vsetvl(/*MB_OUT*/ DisResult* dres,
return True;
}
+static UInt decode_eew(UInt raw_eew)
+{
+ switch (raw_eew) {
+ case 0b000: return 8;
+ case 0b101: return 16;
+ case 0b110: return 32;
+ case 0b111: return 64;
+ default: vassert(0);
+ }
+}
+
+static Bool dis_ldst(/*MB_OUT*/ DisResult* dres,
+ /*OUT*/ IRSB* irsb,
+ UInt insn,
+ Addr guest_pc_curr_instr,
+ VexGuestRISCV64State* guest)
+{
+ UInt vd = INSN(11, 7);
+ UInt width = INSN(14, 12);
+ UInt rs1 = INSN(19, 15);
+ UInt umop = INSN(24, 20);
+ UInt vm = INSN(25, 25);
+ UInt mew_mop = INSN(28, 26);
+
+ // TODO: only a subset of the vector ld/st instructions is handled
+ if (!(mew_mop == 0b000 &&
+ (umop == 0b00000 || umop == 0b10000))) { // ignore fault-only-first
+ return False;
+ }
+
+ Bool is_load = INSN(6, 0) == 0b0000111;
+
+ DIP("%s - vl: %llu, insn: %x, vtype: %llx, vreg: %s\n",
+ is_load ? "vload" : "vstore",
+ guest->guest_vl, insn, guest->guest_vtype, nameVReg(vd));
+
+ UInt eew_b = decode_eew(width) / 8;
+ IRType ty = integerIRTypeOfSize(eew_b);
+ UInt offset = 0;
+ if (vm == 1) { // vm == 1: masking disabled (unmasked)
+ // It would be possible to use a larger ty than the element size
+ for (UInt i = 0; i < guest->guest_vl; ++i) {
+ IRExpr* addr = binop(Iop_Add64, getIReg64(rs1), mkU64(offset));
+
+ if (is_load) {
+ putVReg(irsb, vd, offset, loadLE(ty, addr));
+ } else {
+ storeLE(irsb, addr, getVReg(vd, offset, ty));
+ }
+
+ offset += eew_b;
+ }
+ } else { // vm == 0: masked
+ for (UInt i = 0; i < guest->guest_vl; ++i) {
+ IRExpr* addr = binop(Iop_Add64, getIReg64(rs1), mkU64(offset));
+ UInt mask_offset = i / 64 * 8;
+ IRExpr* guard = binop(Iop_CmpNE64,
+ mkU64(0),
+ binop(Iop_And64,
+ getVReg(0 /* v0 */, mask_offset, Ity_I64),
+ mkU64(1UL << (i % 64))));
+
+ if (is_load) {
+ IRLoadGOp no_cvt = ILGop_INVALID;
+ switch (ty) {
+ case Ity_I8: no_cvt = ILGop_Ident8; break;
+ case Ity_I16: no_cvt = ILGop_Ident16; break;
+ case Ity_I32: no_cvt = ILGop_Ident32; break;
+ case Ity_I64: no_cvt = ILGop_Ident64; break;
+ default: vassert(0);
+ }
+
+ UInt vma = SLICE_UInt(guest->guest_vtype, 7, 7);
+ IRExpr* alt = (vma == 0) ? getVReg(vd, offset, ty) : mkU(ty, -1UL);
+ IRTemp res = newTemp(irsb, ty);
+ stmt(irsb, IRStmt_LoadG(Iend_LE, no_cvt, res, addr, alt, guard));
+ putVReg(irsb, vd, offset, mkexpr(res));
+ } else {
+ stmt(irsb, IRStmt_StoreG(Iend_LE, addr, getVReg(vd, offset, ty), guard));
+ }
+
+ offset += eew_b;
+ }
+ }
+
+ putPC(irsb, mkU64(guest_pc_curr_instr + 4));
+ dres->whatNext = Dis_StopHere;
+ dres->jk_StopHere = Ijk_TooManyIR;
+
+ return True;
+}
+
static Bool dis_RV64V(/*MB_OUT*/ DisResult* dres,
/*OUT*/ IRSB* irsb,
UInt insn,
@@ -3517,6 +3622,8 @@ static Bool dis_RV64V(/*MB_OUT*/ DisResult* dres,
const VexAbiInfo* abiinfo)
{
+ VexGuestRISCV64State* guest = abiinfo->riscv64_guest_state;
+
// spec - 10. Vector Arithmetic Instruction Formats
switch (INSN(6, 0)) {
case 0b1010111:
@@ -3526,6 +3633,9 @@ static Bool dis_RV64V(/*MB_OUT*/ DisResult* dres,
default:
return False;
}
+ case 0b0000111: // load
+ case 0b0100111: // store
+ return dis_ldst(dres, irsb, insn, guest_pc_curr_instr, guest);
default:
return False;
}
--
2.25.1
From: Fei Wu <fe...@in...> - 2023-05-26 13:57:49
There is no conditional move instruction on RISC-V so far, so implement
LoadG/StoreG with branches in the host isel.
Signed-off-by: Fei Wu <fe...@in...>
---
VEX/priv/host_riscv64_defs.c | 133 +++++++++++++++++++++++++++++++++++
VEX/priv/host_riscv64_defs.h | 23 ++++++
VEX/priv/host_riscv64_isel.c | 58 +++++++++++++++
VEX/priv/ir_defs.c | 6 ++
VEX/priv/ir_opt.c | 4 +-
VEX/pub/libvex_ir.h | 2 +
memcheck/mc_translate.c | 4 ++
7 files changed, 229 insertions(+), 1 deletion(-)
diff --git a/VEX/priv/host_riscv64_defs.c b/VEX/priv/host_riscv64_defs.c
index f6137b55b..457b2fde4 100644
--- a/VEX/priv/host_riscv64_defs.c
+++ b/VEX/priv/host_riscv64_defs.c
@@ -440,6 +440,20 @@ RISCV64Instr_Load(RISCV64LoadOp op, HReg dst, HReg base, Int soff12)
return i;
}
+RISCV64Instr*
+RISCV64Instr_LoadG(RISCV64LoadOp op, HReg dst, HReg base, Int soff12, HReg guard, HReg alt)
+{
+ RISCV64Instr* i = LibVEX_Alloc_inline(sizeof(RISCV64Instr));
+ i->tag = RISCV64in_LoadG;
+ i->RISCV64in.LoadG.op = op;
+ i->RISCV64in.LoadG.dst = dst;
+ i->RISCV64in.LoadG.base = base;
+ i->RISCV64in.LoadG.soff12 = soff12;
+ i->RISCV64in.LoadG.guard = guard;
+ i->RISCV64in.LoadG.alt = alt;
+ return i;
+}
+
RISCV64Instr*
RISCV64Instr_Store(RISCV64StoreOp op, HReg src, HReg base, Int soff12)
{
@@ -452,6 +466,19 @@ RISCV64Instr_Store(RISCV64StoreOp op, HReg src, HReg base, Int soff12)
return i;
}
+RISCV64Instr*
+RISCV64Instr_StoreG(RISCV64StoreOp op, HReg src, HReg base, Int soff12, HReg guard)
+{
+ RISCV64Instr* i = LibVEX_Alloc_inline(sizeof(RISCV64Instr));
+ i->tag = RISCV64in_StoreG;
+ i->RISCV64in.StoreG.op = op;
+ i->RISCV64in.StoreG.src = src;
+ i->RISCV64in.StoreG.base = base;
+ i->RISCV64in.StoreG.soff12 = soff12;
+ i->RISCV64in.StoreG.guard = guard;
+ return i;
+}
+
RISCV64Instr* RISCV64Instr_LoadR(RISCV64LoadROp op, HReg dst, HReg addr)
{
RISCV64Instr* i = LibVEX_Alloc_inline(sizeof(RISCV64Instr));
@@ -703,6 +730,16 @@ void ppRISCV64Instr(const RISCV64Instr* i, Bool mode64)
ppHRegRISCV64(i->RISCV64in.Load.base);
vex_printf(")");
return;
+ case RISCV64in_LoadG:
+ vex_printf("%-7s ", showRISCV64LoadOp(i->RISCV64in.LoadG.op));
+ ppHRegRISCV64(i->RISCV64in.LoadG.dst);
+ vex_printf(", %d(", i->RISCV64in.LoadG.soff12);
+ ppHRegRISCV64(i->RISCV64in.LoadG.base);
+ vex_printf("), ");
+ ppHRegRISCV64(i->RISCV64in.LoadG.guard);
+ vex_printf(", ");
+ ppHRegRISCV64(i->RISCV64in.LoadG.alt);
+ return;
case RISCV64in_Store:
vex_printf("%-7s ", showRISCV64StoreOp(i->RISCV64in.Store.op));
ppHRegRISCV64(i->RISCV64in.Store.src);
@@ -710,6 +747,14 @@ void ppRISCV64Instr(const RISCV64Instr* i, Bool mode64)
ppHRegRISCV64(i->RISCV64in.Store.base);
vex_printf(")");
return;
+ case RISCV64in_StoreG:
+ vex_printf("%-7s ", showRISCV64StoreOp(i->RISCV64in.StoreG.op));
+ ppHRegRISCV64(i->RISCV64in.StoreG.src);
+ vex_printf(", %d(", i->RISCV64in.StoreG.soff12);
+ ppHRegRISCV64(i->RISCV64in.StoreG.base);
+ vex_printf("), ");
+ ppHRegRISCV64(i->RISCV64in.StoreG.guard);
+ return;
case RISCV64in_LoadR:
vex_printf("%-7s ", showRISCV64LoadROp(i->RISCV64in.LoadR.op));
ppHRegRISCV64(i->RISCV64in.LoadR.dst);
@@ -997,10 +1042,21 @@ void getRegUsage_RISCV64Instr(HRegUsage* u, const RISCV64Instr* i, Bool mode64)
addHRegUse(u, HRmWrite, i->RISCV64in.Load.dst);
addHRegUse(u, HRmRead, i->RISCV64in.Load.base);
return;
+ case RISCV64in_LoadG:
+ addHRegUse(u, HRmWrite, i->RISCV64in.LoadG.dst);
+ addHRegUse(u, HRmRead, i->RISCV64in.LoadG.base);
+ addHRegUse(u, HRmRead, i->RISCV64in.LoadG.guard);
+ addHRegUse(u, HRmRead, i->RISCV64in.LoadG.alt);
+ return;
case RISCV64in_Store:
addHRegUse(u, HRmRead, i->RISCV64in.Store.src);
addHRegUse(u, HRmRead, i->RISCV64in.Store.base);
return;
+ case RISCV64in_StoreG:
+ addHRegUse(u, HRmRead, i->RISCV64in.StoreG.src);
+ addHRegUse(u, HRmRead, i->RISCV64in.StoreG.base);
+ addHRegUse(u, HRmRead, i->RISCV64in.StoreG.guard);
+ return;
case RISCV64in_LoadR:
addHRegUse(u, HRmWrite, i->RISCV64in.LoadR.dst);
addHRegUse(u, HRmRead, i->RISCV64in.LoadR.addr);
@@ -1218,10 +1274,21 @@ void mapRegs_RISCV64Instr(HRegRemap* m, RISCV64Instr* i, Bool mode64)
mapReg(m, &i->RISCV64in.Load.dst);
mapReg(m, &i->RISCV64in.Load.base);
return;
+ case RISCV64in_LoadG:
+ mapReg(m, &i->RISCV64in.LoadG.dst);
+ mapReg(m, &i->RISCV64in.LoadG.base);
+ mapReg(m, &i->RISCV64in.LoadG.guard);
+ mapReg(m, &i->RISCV64in.LoadG.alt);
+ return;
case RISCV64in_Store:
mapReg(m, &i->RISCV64in.Store.src);
mapReg(m, &i->RISCV64in.Store.base);
return;
+ case RISCV64in_StoreG:
+ mapReg(m, &i->RISCV64in.StoreG.src);
+ mapReg(m, &i->RISCV64in.StoreG.base);
+ mapReg(m, &i->RISCV64in.StoreG.guard);
+ return;
case RISCV64in_LoadR:
mapReg(m, &i->RISCV64in.LoadR.dst);
mapReg(m, &i->RISCV64in.LoadR.addr);
@@ -1914,6 +1981,43 @@ Int emit_RISCV64Instr(/*MB_MOD*/ Bool* is_profInc,
}
break;
}
+ case RISCV64in_LoadG: {
+ /* beq guard, zero, 1f
+ * l<size> dst, soff12(base)
+ * c.j 2f
+ * 1: c.mv dst, alt
+ * 2:
+ */
+ UInt guard = iregEnc(i->RISCV64in.LoadG.guard);
+ p = emit_B(p, 0b1100011, (10 >> 1) & 0xfff, 0b000, guard, 0 /*x0/zero*/);
+
+ UInt dst = iregEnc(i->RISCV64in.LoadG.dst);
+ UInt base = iregEnc(i->RISCV64in.LoadG.base);
+ Int soff12 = i->RISCV64in.LoadG.soff12;
+ vassert(soff12 >= -2048 && soff12 < 2048);
+ UInt imm11_0 = soff12 & 0xfff;
+ switch (i->RISCV64in.LoadG.op) {
+ case RISCV64op_LD:
+ p = emit_I(p, 0b0000011, dst, 0b011, base, imm11_0);
+ break;
+ case RISCV64op_LW:
+ p = emit_I(p, 0b0000011, dst, 0b010, base, imm11_0);
+ break;
+ case RISCV64op_LH:
+ p = emit_I(p, 0b0000011, dst, 0b001, base, imm11_0);
+ break;
+ case RISCV64op_LB:
+ p = emit_I(p, 0b0000011, dst, 0b000, base, imm11_0);
+ break;
+ }
+
+ p = emit_CJ(p, 0b01, (4 >> 1) & 0x7ff, 0b101);
+
+ UInt alt = iregEnc(i->RISCV64in.LoadG.alt);
+ p = emit_CR(p, 0b10, alt, dst, 0b1000);
+
+ break;
+ }
case RISCV64in_Store: {
/* s<size> src, soff12(base) */
UInt src = iregEnc(i->RISCV64in.Store.src);
@@ -1937,6 +2041,35 @@ Int emit_RISCV64Instr(/*MB_MOD*/ Bool* is_profInc,
}
goto done;
}
+ case RISCV64in_StoreG: {
+ /* beq guard, zero, 1f
+ * s<size> src, soff12(base)
+ * 1:
+ */
+ UInt guard = iregEnc(i->RISCV64in.StoreG.guard);
+ p = emit_B(p, 0b1100011, (8 >> 1) & 0xfff, 0b000, guard, 0 /*x0/zero*/);
+
+ UInt src = iregEnc(i->RISCV64in.StoreG.src);
+ UInt base = iregEnc(i->RISCV64in.StoreG.base);
+ Int soff12 = i->RISCV64in.StoreG.soff12;
+ vassert(soff12 >= -2048 && soff12 < 2048);
+ UInt imm11_0 = soff12 & 0xfff;
+ switch (i->RISCV64in.StoreG.op) {
+ case RISCV64op_SD:
+ p = emit_S(p, 0b0100011, imm11_0, 0b011, base, src);
+ goto done;
+ case RISCV64op_SW:
+ p = emit_S(p, 0b0100011, imm11_0, 0b010, base, src);
+ goto done;
+ case RISCV64op_SH:
+ p = emit_S(p, 0b0100011, imm11_0, 0b001, base, src);
+ goto done;
+ case RISCV64op_SB:
+ p = emit_S(p, 0b0100011, imm11_0, 0b000, base, src);
+ goto done;
+ }
+ goto done;
+ }
case RISCV64in_LoadR: {
/* lr.<size> dst, (addr) */
UInt dst = iregEnc(i->RISCV64in.LoadR.dst);
diff --git a/VEX/priv/host_riscv64_defs.h b/VEX/priv/host_riscv64_defs.h
index 1990fe3f5..45fadeb6c 100644
--- a/VEX/priv/host_riscv64_defs.h
+++ b/VEX/priv/host_riscv64_defs.h
@@ -324,7 +324,9 @@ typedef enum {
RISCV64in_ALUImm, /* Computational binary instruction, with
an immediate as the second input. */
RISCV64in_Load, /* Load from memory (sign-extended). */
+ RISCV64in_LoadG, /* Load from memory (sign-extended) with guard. */
RISCV64in_Store, /* Store to memory. */
+ RISCV64in_StoreG, /* Store to memory with guard. */
RISCV64in_LoadR, /* Load-reserved from memory (sign-extended). */
RISCV64in_StoreC, /* Store-conditional to memory. */
RISCV64in_CSRRW, /* Atomic swap of values in a CSR and an integer
@@ -382,6 +384,15 @@ typedef struct {
HReg base;
Int soff12; /* -2048 .. +2047 */
} Load;
+ /* Load from memory (sign-extended) with guard. */
+ struct {
+ RISCV64LoadOp op;
+ HReg dst;
+ HReg base;
+ Int soff12; /* -2048 .. +2047 */
+ HReg guard;
+ HReg alt;
+ } LoadG;
/* Store to memory. */
struct {
RISCV64StoreOp op;
@@ -389,6 +400,14 @@ typedef struct {
HReg base;
Int soff12; /* -2048 .. +2047 */
} Store;
+ /* Store to memory with guard. */
+ struct {
+ RISCV64StoreOp op;
+ HReg src;
+ HReg base;
+ Int soff12; /* -2048 .. +2047 */
+ HReg guard;
+ } StoreG;
/* Load-reserved from memory (sign-extended). */
struct {
RISCV64LoadROp op;
@@ -536,7 +555,11 @@ RISCV64Instr_ALUImm(RISCV64ALUImmOp op, HReg dst, HReg src, Int imm12);
RISCV64Instr*
RISCV64Instr_Load(RISCV64LoadOp op, HReg dst, HReg base, Int soff12);
RISCV64Instr*
+RISCV64Instr_LoadG(RISCV64LoadOp op, HReg dst, HReg base, Int soff12, HReg guard, HReg alt);
+RISCV64Instr*
RISCV64Instr_Store(RISCV64StoreOp op, HReg src, HReg base, Int soff12);
+RISCV64Instr*
+RISCV64Instr_StoreG(RISCV64StoreOp op, HReg src, HReg base, Int soff12, HReg guard);
RISCV64Instr* RISCV64Instr_LoadR(RISCV64LoadROp op, HReg dst, HReg addr);
RISCV64Instr*
RISCV64Instr_StoreC(RISCV64StoreCOp op, HReg res, HReg src, HReg addr);
diff --git a/VEX/priv/host_riscv64_isel.c b/VEX/priv/host_riscv64_isel.c
index 87213fb86..355f559bd 100644
--- a/VEX/priv/host_riscv64_isel.c
+++ b/VEX/priv/host_riscv64_isel.c
@@ -1587,6 +1587,37 @@ static void iselStmt(ISelEnv* env, IRStmt* stmt)
}
switch (stmt->tag) {
+ /* ----------------------- LoadG ------------------------ */
+ case Ist_LoadG: {
+ IRLoadG* lg = stmt->Ist.LoadG.details;
+ if (lg->end != Iend_LE)
+ goto stmt_fail;
+
+ IRType tyd = typeOfIRExpr(env->type_env, lg->alt);
+ if (tyd == Ity_I64 || tyd == Ity_I32 || tyd == Ity_I16 || tyd == Ity_I8) {
+ HReg dst = lookupIRTemp(env, lg->dst);
+ HReg addr = iselIntExpr_R(env, lg->addr);
+ HReg guard = iselIntExpr_R(env, lg->guard);
+ HReg alt = iselIntExpr_R(env, lg->alt);
+
+ vassert(lg->cvt == ILGop_Ident8 || lg->cvt == ILGop_Ident16 ||
+ lg->cvt == ILGop_Ident32 || lg->cvt == ILGop_Ident64);
+
+ if (tyd == Ity_I64)
+ addInstr(env, RISCV64Instr_LoadG(RISCV64op_LD, dst, addr, 0, guard, alt));
+ else if (tyd == Ity_I32)
+ addInstr(env, RISCV64Instr_LoadG(RISCV64op_LW, dst, addr, 0, guard, alt));
+ else if (tyd == Ity_I16)
+ addInstr(env, RISCV64Instr_LoadG(RISCV64op_LH, dst, addr, 0, guard, alt));
+ else if (tyd == Ity_I8)
+ addInstr(env, RISCV64Instr_LoadG(RISCV64op_LB, dst, addr, 0, guard, alt));
+ else
+ vassert(0);
+ return;
+ }
+ return;
+ }
+
/* ------------------------ STORE ------------------------ */
/* Little-endian write to memory. */
case Ist_Store: {
@@ -1623,6 +1654,33 @@ static void iselStmt(ISelEnv* env, IRStmt* stmt)
break;
}
+ /* ----------------------- StoreG ------------------------ */
+ case Ist_StoreG: {
+ IRStoreG* sg = stmt->Ist.StoreG.details;
+ if (sg->end != Iend_LE)
+ goto stmt_fail;
+
+ IRType tyd = typeOfIRExpr(env->type_env, sg->data);
+ if (tyd == Ity_I64 || tyd == Ity_I32 || tyd == Ity_I16 || tyd == Ity_I8) {
+ HReg src = iselIntExpr_R(env, sg->data);
+ HReg addr = iselIntExpr_R(env, sg->addr);
+ HReg guard = iselIntExpr_R(env, sg->guard);
+
+ if (tyd == Ity_I64)
+ addInstr(env, RISCV64Instr_StoreG(RISCV64op_SD, src, addr, 0, guard));
+ else if (tyd == Ity_I32)
+ addInstr(env, RISCV64Instr_StoreG(RISCV64op_SW, src, addr, 0, guard));
+ else if (tyd == Ity_I16)
+ addInstr(env, RISCV64Instr_StoreG(RISCV64op_SH, src, addr, 0, guard));
+ else if (tyd == Ity_I8)
+ addInstr(env, RISCV64Instr_StoreG(RISCV64op_SB, src, addr, 0, guard));
+ else
+ vassert(0);
+ return;
+ }
+ return;
+ }
+
/* ------------------------- PUT ------------------------- */
/* Write guest state, fixed offset. */
case Ist_Put: {
diff --git a/VEX/priv/ir_defs.c b/VEX/priv/ir_defs.c
index 875816c78..697e34313 100644
--- a/VEX/priv/ir_defs.c
+++ b/VEX/priv/ir_defs.c
@@ -2032,6 +2032,8 @@ void ppIRLoadGOp ( IRLoadGOp cvt )
case ILGop_IdentV128: vex_printf("IdentV128"); break;
case ILGop_Ident64: vex_printf("Ident64"); break;
case ILGop_Ident32: vex_printf("Ident32"); break;
+ case ILGop_Ident16: vex_printf("Ident16"); break;
+ case ILGop_Ident8: vex_printf("Ident8"); break;
case ILGop_16Uto32: vex_printf("16Uto32"); break;
case ILGop_16Sto32: vex_printf("16Sto32"); break;
case ILGop_8Uto32: vex_printf("8Uto32"); break;
@@ -4261,6 +4263,10 @@ void typeOfIRLoadGOp ( IRLoadGOp cvt,
*t_res = Ity_I64; *t_arg = Ity_I64; break;
case ILGop_Ident32:
*t_res = Ity_I32; *t_arg = Ity_I32; break;
+ case ILGop_Ident16:
+ *t_res = Ity_I16; *t_arg = Ity_I16; break;
+ case ILGop_Ident8:
+ *t_res = Ity_I8; *t_arg = Ity_I8; break;
case ILGop_16Uto32: case ILGop_16Sto32:
*t_res = Ity_I32; *t_arg = Ity_I16; break;
case ILGop_8Uto32: case ILGop_8Sto32:
diff --git a/VEX/priv/ir_opt.c b/VEX/priv/ir_opt.c
index 93dd6188e..e790acb5b 100644
--- a/VEX/priv/ir_opt.c
+++ b/VEX/priv/ir_opt.c
@@ -2996,7 +2996,9 @@ static IRSB* cprop_BB_WRK ( IRSB* in, Bool mustRetainNoOps, Bool doFolding )
switch (lg->cvt) {
case ILGop_IdentV128:
case ILGop_Ident64:
- case ILGop_Ident32: break;
+ case ILGop_Ident32:
+ case ILGop_Ident16:
+ case ILGop_Ident8: break;
case ILGop_8Uto32: cvtOp = Iop_8Uto32; break;
case ILGop_8Sto32: cvtOp = Iop_8Sto32; break;
case ILGop_16Uto32: cvtOp = Iop_16Uto32; break;
diff --git a/VEX/pub/libvex_ir.h b/VEX/pub/libvex_ir.h
index b4b1e9d6e..c7b97c11d 100644
--- a/VEX/pub/libvex_ir.h
+++ b/VEX/pub/libvex_ir.h
@@ -2829,6 +2829,8 @@ typedef
ILGop_IdentV128, /* 128 bit vector, no conversion */
ILGop_Ident64, /* 64 bit, no conversion */
ILGop_Ident32, /* 32 bit, no conversion */
+ ILGop_Ident16, /* 16 bit, no conversion */
+ ILGop_Ident8, /* 8 bit, no conversion */
ILGop_16Uto32, /* 16 bit load, Z-widen to 32 */
ILGop_16Sto32, /* 16 bit load, S-widen to 32 */
ILGop_8Uto32, /* 8 bit load, Z-widen to 32 */
diff --git a/memcheck/mc_translate.c b/memcheck/mc_translate.c
index 72ccb3c8c..b6c63aa05 100644
--- a/memcheck/mc_translate.c
+++ b/memcheck/mc_translate.c
@@ -6987,6 +6987,8 @@ static void do_shadow_LoadG ( MCEnv* mce, IRLoadG* lg )
case ILGop_IdentV128: loadedTy = Ity_V128; vwiden = Iop_INVALID; break;
case ILGop_Ident64: loadedTy = Ity_I64; vwiden = Iop_INVALID; break;
case ILGop_Ident32: loadedTy = Ity_I32; vwiden = Iop_INVALID; break;
+ case ILGop_Ident16: loadedTy = Ity_I16; vwiden = Iop_INVALID; break;
+ case ILGop_Ident8: loadedTy = Ity_I8; vwiden = Iop_INVALID; break;
case ILGop_16Uto32: loadedTy = Ity_I16; vwiden = Iop_16Uto32; break;
case ILGop_16Sto32: loadedTy = Ity_I16; vwiden = Iop_16Sto32; break;
case ILGop_8Uto32: loadedTy = Ity_I8; vwiden = Iop_8Uto32; break;
@@ -7619,6 +7621,8 @@ static void do_origins_LoadG ( MCEnv* mce, IRLoadG* lg )
case ILGop_IdentV128: loadedTy = Ity_V128; break;
case ILGop_Ident64: loadedTy = Ity_I64; break;
case ILGop_Ident32: loadedTy = Ity_I32; break;
+ case ILGop_Ident16: loadedTy = Ity_I16; break;
+ case ILGop_Ident8: loadedTy = Ity_I8; break;
case ILGop_16Uto32: loadedTy = Ity_I16; break;
case ILGop_16Sto32: loadedTy = Ity_I16; break;
case ILGop_8Uto32: loadedTy = Ity_I8; break;
--
2.25.1
From: Fei Wu <fe...@in...> - 2023-05-26 13:57:49
Signed-off-by: Fei Wu <fe...@in...>
---
VEX/priv/guest_riscv64_toIR.c | 120 ++++++++++++++++++++++++++++++++++
1 file changed, 120 insertions(+)
diff --git a/VEX/priv/guest_riscv64_toIR.c b/VEX/priv/guest_riscv64_toIR.c
index 423260679..6407692f9 100644
--- a/VEX/priv/guest_riscv64_toIR.c
+++ b/VEX/priv/guest_riscv64_toIR.c
@@ -58,6 +58,8 @@
#include "main_globals.h"
#include "main_util.h"
+#include "coregrind/pub_core_transtab.h"
+
/*------------------------------------------------------------*/
/*--- Debugging output ---*/
/*------------------------------------------------------------*/
@@ -3415,6 +3417,122 @@ static Bool dis_RV64Zicsr(/*MB_OUT*/ DisResult* dres,
return False;
}
+static inline Long sext_slice_ulong(ULong value, UInt bmax, UInt bmin)
+{
+ return ((Long)value) << (63 - bmax) >> (63 - (bmax - bmin));
+}
+
+#define MAX_VL (-1UL)
+#define KEEP_VL (-2UL)
+
+static ULong helper_vsetvl(VexGuestRISCV64State* guest, ULong avl, ULong vtype)
+{
+ UInt sew = SLICE_UInt(vtype, 5, 3);
+ Int lmul = sext_slice_ulong(vtype, 3, 0);
+
+ ULong vlmax = VLEN >> (sew + 3 - lmul);
+ ULong vl = guest->guest_vl;
+ if (avl != KEEP_VL)
+ vl = (avl < vlmax) ? avl : vlmax;
+
+ guest->guest_vl = vl;
+ guest->guest_vtype = vtype;
+
+ invalidateFastCache();
+
+ DIP("vsetvl - vl: %llu, sew: 0x%x, lmul: %d, avl: %llu, vtype: %llx\n",
+ vl, sew, lmul, avl, vtype);
+
+ return vl;
+}
+
+static Bool dis_vsetvl(/*MB_OUT*/ DisResult* dres,
+ /*OUT*/ IRSB* irsb,
+ UInt insn,
+ Addr guest_pc_curr_instr)
+{
+ UInt rd = INSN(11, 7);
+ IRExpr* avl;
+ IRExpr* vtype;
+
+ if (INSN(31, 30) == 0b11) { // vsetivli
+ UInt uimm = INSN(19, 15);
+ Int zimm = INSN(29, 20);
+ avl = mkU64(uimm);
+ vtype = mkU64(zimm);
+ } else if (INSN(31, 31) == 0b0 || INSN(31, 25) == 0b1000000) {
+ UInt rs1 = INSN(19, 15);
+ if (rs1 != 0) {
+ avl = getIReg64(rs1);
+ } else if (rd == 0) {
+ avl = mkU64(KEEP_VL);
+ } else {
+ avl = mkU64(MAX_VL);
+ }
+
+ if (INSN(31, 31) == 0b0) { // vsetvli
+ Int zimm = INSN(30, 20);
+ vtype = mkU64(zimm);
+ } else { // vsetvl
+ UInt rs2 = INSN(24, 20);
+ vtype = getIReg64(rs2);
+ }
+ } else {
+ vassert(0);
+ }
+
+ IRTemp vl = newTemp(irsb, Ity_I64);
+ IRDirty *d = unsafeIRDirty_1_N(vl,
+ 0,
+ "helper_vsetvl",
+ &helper_vsetvl,
+ mkIRExprVec_3(IRExpr_GSPTR(), avl, vtype));
+
+ d->nFxState = 2;
+ vex_bzero(&d->fxState, sizeof(d->fxState));
+ d->fxState[0].fx = Ifx_Write;
+ d->fxState[0].offset = OFFB_VL;
+ d->fxState[0].size = sizeof(ULong);
+ d->fxState[1].fx = Ifx_Write;
+ d->fxState[1].offset = OFFB_VTYPE;
+ d->fxState[1].size = sizeof(ULong);
+
+ stmt(irsb, IRStmt_Dirty(d));
+
+ if (rd != 0) {
+ putIReg64(irsb, rd, mkexpr(vl));
+ }
+
+ putPC(irsb, mkU64(guest_pc_curr_instr + 4));
+ dres->whatNext = Dis_StopHere;
+ dres->jk_StopHere = Ijk_SyncupEnv;
+
+ return True;
+}
+
+static Bool dis_RV64V(/*MB_OUT*/ DisResult* dres,
+ /*OUT*/ IRSB* irsb,
+ UInt insn,
+ Addr guest_pc_curr_instr,
+ const VexAbiInfo* abiinfo)
+
+{
+ // spec - 10. Vector Arithmetic Instruction Formats
+ switch (INSN(6, 0)) {
+ case 0b1010111:
+ switch (INSN(14, 12)) {
+ case 0b111: // vsetvl
+ return dis_vsetvl(dres, irsb, insn, guest_pc_curr_instr);
+ default:
+ return False;
+ }
+ default:
+ return False;
+ }
+
+ return False;
+}
+
static Bool dis_RISCV64_standard(/*MB_OUT*/ DisResult* dres,
/*OUT*/ IRSB* irsb,
UInt insn,
@@ -3437,6 +3555,8 @@ static Bool dis_RISCV64_standard(/*MB_OUT*/ DisResult* dres,
ok = dis_RV64D(dres, irsb, insn);
if (!ok)
ok = dis_RV64Zicsr(dres, irsb, insn);
+ if (!ok)
+ ok = dis_RV64V(dres, irsb, insn, guest_pc_curr_instr, abiinfo);
if (ok)
return True;
--
2.25.1
From: Fei Wu <fe...@in...> - 2023-05-26 13:57:49
disp_run_translations in dispatch-riscv64-linux.S shifts guest_state by
2048, so the pointer needs to be adjusted back accordingly when calling a
helper.
Signed-off-by: Fei Wu <fe...@in...>
---
VEX/priv/host_riscv64_isel.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/VEX/priv/host_riscv64_isel.c b/VEX/priv/host_riscv64_isel.c
index 355f559bd..127200d8e 100644
--- a/VEX/priv/host_riscv64_isel.c
+++ b/VEX/priv/host_riscv64_isel.c
@@ -425,8 +425,10 @@ static Bool doHelperCall(/*OUT*/ UInt* stackAdjustAfterCall,
} else if (arg->tag == Iex_GSPTR) {
if (nextArgReg >= RISCV64_N_ARGREGS)
return False; /* Out of argregs. */
+ /* See dispatch-riscv64-linux.S for -2048 */
addInstr(env,
- RISCV64Instr_MV(argregs[nextArgReg], hregRISCV64_x8()));
+ RISCV64Instr_ALUImm(RISCV64op_ADDI, argregs[nextArgReg],
+ hregRISCV64_x8(), -2048));
nextArgReg++;
} else if (arg->tag == Iex_VECRET) {
/* Because of the go_fast logic above, we can't get here, since
@@ -461,7 +463,10 @@ static Bool doHelperCall(/*OUT*/ UInt* stackAdjustAfterCall,
} else if (arg->tag == Iex_GSPTR) {
if (nextArgReg >= RISCV64_N_ARGREGS)
return False; /* Out of argregs. */
- tmpregs[nextArgReg] = hregRISCV64_x8();
+
+ addInstr(env,
+ RISCV64Instr_ALUImm(RISCV64op_ADDI, tmpregs[nextArgReg],
+ hregRISCV64_x8(), -2048));
nextArgReg++;
} else if (arg->tag == Iex_VECRET) {
vassert(!hregIsInvalid(r_vecRetAddr));
--
2.25.1
|
|
From: Fei Wu <fe...@in...> - 2023-05-26 13:57:49
|
During translation to IR, guest state such as vl and vtype is referenced
directly. Add it to a cpu_state key so that the same guest code can be
differentiated under different CPU states.
Signed-off-by: Fei Wu <fe...@in...>
---
VEX/pub/libvex_guest_riscv64.h | 9 +++++++++
coregrind/m_scheduler/scheduler.c | 17 +++++++++++++----
coregrind/m_translate.c | 3 +++
coregrind/m_transtab.c | 26 ++++++++++++++++++++++++--
coregrind/pub_core_transtab.h | 5 +++++
5 files changed, 54 insertions(+), 6 deletions(-)
diff --git a/VEX/pub/libvex_guest_riscv64.h b/VEX/pub/libvex_guest_riscv64.h
index 50bec58bd..36149bbf2 100644
--- a/VEX/pub/libvex_guest_riscv64.h
+++ b/VEX/pub/libvex_guest_riscv64.h
@@ -177,6 +177,15 @@ typedef struct {
/* Initialise all guest riscv64 state. */
void LibVEX_GuestRISCV64_initialise(/*OUT*/ VexGuestRISCV64State* vex_state);
+static inline ULong get_cpu_state(VexGuestRISCV64State* guest)
+{
+#if defined(VGA_riscv64)
+ return guest->guest_vl | (guest->guest_vtype << 16);
+#else
+ return 0;
+#endif
+}
+
#endif /* ndef __LIBVEX_PUB_GUEST_RISCV64_H */
/*--------------------------------------------------------------------*/
diff --git a/coregrind/m_scheduler/scheduler.c b/coregrind/m_scheduler/scheduler.c
index 4e18c80fa..6d9a721c0 100644
--- a/coregrind/m_scheduler/scheduler.c
+++ b/coregrind/m_scheduler/scheduler.c
@@ -948,6 +948,8 @@ void run_thread_for_a_while ( /*OUT*/HWord* two_words,
do_pre_run_checks( tst );
/* end Paranoia */
+ ULong cpu_state = get_cpu_state(&tst->arch.vex);
+
/* Futz with the XIndir stats counters. */
vg_assert(VG_(stats__n_xIndirs_32) == 0);
vg_assert(VG_(stats__n_xIndir_hits1_32) == 0);
@@ -977,6 +979,7 @@ void run_thread_for_a_while ( /*OUT*/HWord* two_words,
to the scheduler. */
Bool found = VG_(search_transtab)(&res, NULL, NULL,
(Addr)tst->arch.vex.VG_INSTR_PTR,
+ cpu_state,
True/*upd cache*/
);
if (LIKELY(found)) {
@@ -1133,16 +1136,19 @@ static void handle_tt_miss ( ThreadId tid )
Bool found;
Addr ip = VG_(get_IP)(tid);
+ volatile ThreadState* tst = VG_(get_ThreadState)(tid);
+ ULong cpu_state = get_cpu_state(&tst->arch.vex);
+
/* Trivial event. Miss in the fast-cache. Do a full
lookup for it. */
found = VG_(search_transtab)( NULL, NULL, NULL,
- ip, True/*upd_fast_cache*/ );
+ ip, cpu_state, True/*upd_fast_cache*/ );
if (UNLIKELY(!found)) {
/* Not found; we need to request a translation. */
if (VG_(translate)( tid, ip, /*debug*/False, 0/*not verbose*/,
bbs_done, True/*allow redirection*/ )) {
found = VG_(search_transtab)( NULL, NULL, NULL,
- ip, True );
+ ip, cpu_state, True );
vg_assert2(found, "handle_tt_miss: missing tt_fast entry");
} else {
@@ -1163,14 +1169,17 @@ void handle_chain_me ( ThreadId tid, void* place_to_chain, Bool toFastEP )
SECno to_sNo = INV_SNO;
TTEno to_tteNo = INV_TTE;
+ volatile ThreadState* tst = VG_(get_ThreadState)(tid);
+ ULong cpu_state = get_cpu_state(&tst->arch.vex);
+
found = VG_(search_transtab)( NULL, &to_sNo, &to_tteNo,
- ip, False/*dont_upd_fast_cache*/ );
+ ip, cpu_state, False/*dont_upd_fast_cache*/ );
if (!found) {
/* Not found; we need to request a translation. */
if (VG_(translate)( tid, ip, /*debug*/False, 0/*not verbose*/,
bbs_done, True/*allow redirection*/ )) {
found = VG_(search_transtab)( NULL, &to_sNo, &to_tteNo,
- ip, False );
+ ip, cpu_state, False );
vg_assert2(found, "handle_chain_me: missing tt_fast entry");
} else {
// If VG_(translate)() fails, it's because it had to throw a
diff --git a/coregrind/m_translate.c b/coregrind/m_translate.c
index dc3c65814..cad9184b9 100644
--- a/coregrind/m_translate.c
+++ b/coregrind/m_translate.c
@@ -1510,6 +1510,7 @@ Bool VG_(translate) ( ThreadId tid,
VexTranslateArgs vta;
VexTranslateResult tres;
VgCallbackClosure closure;
+ ULong cpu_state = 0;
/* Make sure Vex is initialised right. */
@@ -1754,6 +1755,7 @@ Bool VG_(translate) ( ThreadId tid,
vex_abiinfo.guest__use_fallback_LLSC = True;
ThreadState *tst = VG_(get_ThreadState)(tid);
vex_abiinfo.riscv64_guest_state = &tst->arch.vex;
+ cpu_state = get_cpu_state(&tst->arch.vex);
# endif
/* Set up closure args. */
@@ -1868,6 +1870,7 @@ Bool VG_(translate) ( ThreadId tid,
// addr, which might have been changed by the redirection
VG_(add_to_transtab)( &vge,
nraddr,
+ cpu_state,
(Addr)(&tmpbuf[0]),
tmpbuf_used,
tres.n_sc_extents > 0,
diff --git a/coregrind/m_transtab.c b/coregrind/m_transtab.c
index 102108a35..06019efa1 100644
--- a/coregrind/m_transtab.c
+++ b/coregrind/m_transtab.c
@@ -192,6 +192,9 @@ typedef
may not be a lie, depending on whether or not we're doing
redirection. */
Addr entry;
+#ifdef VGA_riscv64
+ ULong cpu_state;
+#endif
/* Address range summary info: these are pointers back to
eclass[] entries in the containing Sector. Those entries in
@@ -1461,7 +1464,7 @@ static inline HTTno HASH_TT ( Addr key )
}
/* Invalidate the fast cache VG_(tt_fast). */
-static void invalidateFastCache ( void )
+void invalidateFastCache ( void )
{
for (UWord j = 0; j < VG_TT_FAST_SETS; j++) {
FastCacheSet* set = &VG_(tt_fast)[j];
@@ -1734,6 +1737,7 @@ static void initialiseSector ( SECno sno )
*/
void VG_(add_to_transtab)( const VexGuestExtents* vge,
Addr entry,
+ ULong cpu_state,
Addr code,
UInt code_len,
Bool is_self_checking,
@@ -1845,6 +1849,9 @@ void VG_(add_to_transtab)( const VexGuestExtents* vge,
(code_len == 0 ? 1 : (code_len / 4));
sectors[y].ttC[tteix].entry = entry;
+#ifdef VGA_riscv64
+ sectors[y].ttC[tteix].cpu_state = cpu_state;
+#endif
TTEntryH__from_VexGuestExtents( &sectors[y].ttH[tteix], vge );
sectors[y].ttH[tteix].status = InUse;
@@ -1905,6 +1912,14 @@ void VG_(add_to_transtab)( const VexGuestExtents* vge,
upd_eclasses_after_add( &sectors[y], tteix );
}
+static inline Bool cpu_state_match(TTEntryC* ttC, ULong cpu_state)
+{
+#ifdef VGA_riscv64
+ return ttC->cpu_state == cpu_state;
+#else
+ return True;
+#endif
+}
/* Search for the translation of the given guest address. If
requested, a successful search can also cause the fast-caches to be
@@ -1914,6 +1929,7 @@ Bool VG_(search_transtab) ( /*OUT*/Addr* res_hcode,
/*OUT*/SECno* res_sNo,
/*OUT*/TTEno* res_tteNo,
Addr guest_addr,
+ ULong cpu_state,
Bool upd_cache )
{
SECno i, sno;
@@ -1940,7 +1956,9 @@ Bool VG_(search_transtab) ( /*OUT*/Addr* res_hcode,
n_lookup_probes++;
tti = sectors[sno].htt[k];
if (tti < N_TTES_PER_SECTOR
- && sectors[sno].ttC[tti].entry == guest_addr) {
+ && sectors[sno].ttC[tti].entry == guest_addr
+ && cpu_state_match(&sectors[sno].ttC[tti], cpu_state)
+ ) {
/* found it */
if (upd_cache)
setFastCacheEntry(
@@ -2553,7 +2571,11 @@ void VG_(init_tt_tc) ( void )
have a lot of TTEntryCs so let's check that too. */
if (sizeof(HWord) == 8) {
vg_assert(sizeof(TTEntryH) <= 32);
+#ifdef VGA_riscv64
+ vg_assert(sizeof(TTEntryC) <= 120);
+#else
vg_assert(sizeof(TTEntryC) <= 112);
+#endif
}
else if (sizeof(HWord) == 4) {
vg_assert(sizeof(TTEntryH) <= 20);
diff --git a/coregrind/pub_core_transtab.h b/coregrind/pub_core_transtab.h
index cc70a2944..b352891cf 100644
--- a/coregrind/pub_core_transtab.h
+++ b/coregrind/pub_core_transtab.h
@@ -171,6 +171,7 @@ extern void VG_(init_tt_tc) ( void );
extern
void VG_(add_to_transtab)( const VexGuestExtents* vge,
Addr entry,
+ ULong cpu_state,
Addr code,
UInt code_len,
Bool is_self_checking,
@@ -194,6 +195,7 @@ extern Bool VG_(search_transtab) ( /*OUT*/Addr* res_hcode,
/*OUT*/SECno* res_sNo,
/*OUT*/TTEno* res_tteNo,
Addr guest_addr,
+ ULong cpu_state,
Bool upd_cache );
extern void VG_(discard_translations) ( Addr start, ULong range,
@@ -216,6 +218,9 @@ extern
Bool VG_(search_unredir_transtab) ( /*OUT*/Addr* result,
Addr guest_addr );
+extern
+void invalidateFastCache ( void );
+
// SB profiling stuff
typedef struct _SBProfEntry {
--
2.25.1
|
|
From: Fei Wu <fe...@in...> - 2023-05-26 13:57:48
|
SyncupEnv is added to sync up the environment so that subsequent
instructions can observe updates made by previous instructions, e.g. vl
set by vsetvl.
TooManyIR is added for cases where a single guest instruction is
translated into many IR statements; it stops the TB from including
further guest instructions, to avoid accumulating too many IRs in one TB.
Signed-off-by: Fei Wu <fe...@in...>
---
VEX/priv/host_riscv64_isel.c | 2 ++
VEX/priv/ir_defs.c | 2 ++
VEX/pub/libvex_ir.h | 7 ++++++-
3 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/VEX/priv/host_riscv64_isel.c b/VEX/priv/host_riscv64_isel.c
index 76fc3fd5c..87213fb86 100644
--- a/VEX/priv/host_riscv64_isel.c
+++ b/VEX/priv/host_riscv64_isel.c
@@ -1942,6 +1942,8 @@ static void iselNext(ISelEnv* env, IRExpr* next, IRJumpKind jk, Int offsIP)
/* Case: call/return (==boring) transfer to any address. */
switch (jk) {
case Ijk_Boring:
+ case Ijk_SyncupEnv:
+ case Ijk_TooManyIR:
case Ijk_Ret:
case Ijk_Call: {
HReg r = iselIntExpr_R(env, next);
diff --git a/VEX/priv/ir_defs.c b/VEX/priv/ir_defs.c
index 2d82c41a1..875816c78 100644
--- a/VEX/priv/ir_defs.c
+++ b/VEX/priv/ir_defs.c
@@ -2083,6 +2083,8 @@ void ppIRJumpKind ( IRJumpKind kind )
case Ijk_Sys_int145: vex_printf("Sys_int145"); break;
case Ijk_Sys_int210: vex_printf("Sys_int210"); break;
case Ijk_Sys_sysenter: vex_printf("Sys_sysenter"); break;
+ case Ijk_SyncupEnv: vex_printf("SyncupEnv"); break;
+ case Ijk_TooManyIR: vex_printf("TooManyIR"); break;
default: vpanic("ppIRJumpKind");
}
}
diff --git a/VEX/pub/libvex_ir.h b/VEX/pub/libvex_ir.h
index 8c47be090..b4b1e9d6e 100644
--- a/VEX/pub/libvex_ir.h
+++ b/VEX/pub/libvex_ir.h
@@ -2513,8 +2513,13 @@ typedef
Ijk_Sys_int130, /* amd64/x86 'int $0x82' */
Ijk_Sys_int145, /* amd64/x86 'int $0x91' */
Ijk_Sys_int210, /* amd64/x86 'int $0xD2' */
- Ijk_Sys_sysenter /* x86 'sysenter'. guest_EIP becomes
+ Ijk_Sys_sysenter, /* x86 'sysenter'. guest_EIP becomes
invalid at the point this happens. */
+ Ijk_SyncupEnv, /* rvv syncup so that following instructions can read
+ the env set here */
+ Ijk_TooManyIR /* some rvv instructions generate too many IRs to
+ exhaust storage, break out early to reduce the
+ risk */
}
IRJumpKind;
--
2.25.1
|
|
From: Fei Wu <fe...@in...> - 2023-05-26 13:57:46
|
Vector instructions need this info in guest_riscv64_toIR.c
Signed-off-by: Fei Wu <fe...@in...>
---
VEX/pub/libvex.h | 4 ++++
coregrind/m_translate.c | 2 ++
2 files changed, 6 insertions(+)
diff --git a/VEX/pub/libvex.h b/VEX/pub/libvex.h
index 2c54a8d8f..7cabc36aa 100644
--- a/VEX/pub/libvex.h
+++ b/VEX/pub/libvex.h
@@ -37,6 +37,7 @@
#include "libvex_basictypes.h"
#include "libvex_ir.h"
+#include "pub_tool_guest.h"
/*---------------------------------------------------------------*/
@@ -470,6 +471,9 @@ typedef
/* MIPS32/MIPS64 GUESTS only: emulated FPU mode. */
UInt guest_mips_fp_mode;
+
+ /* RISC-V vector needs guest state on translation */
+ VexGuestArchState* riscv64_guest_state;
}
VexAbiInfo;
diff --git a/coregrind/m_translate.c b/coregrind/m_translate.c
index 75dca062d..dc3c65814 100644
--- a/coregrind/m_translate.c
+++ b/coregrind/m_translate.c
@@ -1752,6 +1752,8 @@ Bool VG_(translate) ( ThreadId tid,
# if defined(VGP_riscv64_linux)
vex_abiinfo.guest__use_fallback_LLSC = True;
+ ThreadState *tst = VG_(get_ThreadState)(tid);
+ vex_abiinfo.riscv64_guest_state = &tst->arch.vex;
# endif
/* Set up closure args. */
--
2.25.1
|
|
From: Fei Wu <fe...@in...> - 2023-05-26 13:57:46
|
Add v0-v31, vl and vtype
Signed-off-by: Fei Wu <fe...@in...>
---
VEX/priv/guest_riscv64_toIR.c | 135 +++++++++++++++++++++++++++++++++
VEX/pub/libvex_guest_riscv64.h | 38 +++++++++-
memcheck/mc_machine.c | 35 +++++++++
3 files changed, 207 insertions(+), 1 deletion(-)
diff --git a/VEX/priv/guest_riscv64_toIR.c b/VEX/priv/guest_riscv64_toIR.c
index 93ea5a173..423260679 100644
--- a/VEX/priv/guest_riscv64_toIR.c
+++ b/VEX/priv/guest_riscv64_toIR.c
@@ -289,6 +289,42 @@ static IRExpr* narrowFrom64(IRType dstTy, IRExpr* e)
#define OFFB_LLSC_ADDR offsetof(VexGuestRISCV64State, guest_LLSC_ADDR)
#define OFFB_LLSC_DATA offsetof(VexGuestRISCV64State, guest_LLSC_DATA)
+#define OFFB_V0 offsetof(VexGuestRISCV64State, guest_v0)
+#define OFFB_V1 offsetof(VexGuestRISCV64State, guest_v1)
+#define OFFB_V2 offsetof(VexGuestRISCV64State, guest_v2)
+#define OFFB_V3 offsetof(VexGuestRISCV64State, guest_v3)
+#define OFFB_V4 offsetof(VexGuestRISCV64State, guest_v4)
+#define OFFB_V5 offsetof(VexGuestRISCV64State, guest_v5)
+#define OFFB_V6 offsetof(VexGuestRISCV64State, guest_v6)
+#define OFFB_V7 offsetof(VexGuestRISCV64State, guest_v7)
+#define OFFB_V8 offsetof(VexGuestRISCV64State, guest_v8)
+#define OFFB_V9 offsetof(VexGuestRISCV64State, guest_v9)
+#define OFFB_V10 offsetof(VexGuestRISCV64State, guest_v10)
+#define OFFB_V11 offsetof(VexGuestRISCV64State, guest_v11)
+#define OFFB_V12 offsetof(VexGuestRISCV64State, guest_v12)
+#define OFFB_V13 offsetof(VexGuestRISCV64State, guest_v13)
+#define OFFB_V14 offsetof(VexGuestRISCV64State, guest_v14)
+#define OFFB_V15 offsetof(VexGuestRISCV64State, guest_v15)
+#define OFFB_V16 offsetof(VexGuestRISCV64State, guest_v16)
+#define OFFB_V17 offsetof(VexGuestRISCV64State, guest_v17)
+#define OFFB_V18 offsetof(VexGuestRISCV64State, guest_v18)
+#define OFFB_V19 offsetof(VexGuestRISCV64State, guest_v19)
+#define OFFB_V20 offsetof(VexGuestRISCV64State, guest_v20)
+#define OFFB_V21 offsetof(VexGuestRISCV64State, guest_v21)
+#define OFFB_V22 offsetof(VexGuestRISCV64State, guest_v22)
+#define OFFB_V23 offsetof(VexGuestRISCV64State, guest_v23)
+#define OFFB_V24 offsetof(VexGuestRISCV64State, guest_v24)
+#define OFFB_V25 offsetof(VexGuestRISCV64State, guest_v25)
+#define OFFB_V26 offsetof(VexGuestRISCV64State, guest_v26)
+#define OFFB_V27 offsetof(VexGuestRISCV64State, guest_v27)
+#define OFFB_V28 offsetof(VexGuestRISCV64State, guest_v28)
+#define OFFB_V29 offsetof(VexGuestRISCV64State, guest_v29)
+#define OFFB_V30 offsetof(VexGuestRISCV64State, guest_v30)
+#define OFFB_V31 offsetof(VexGuestRISCV64State, guest_v31)
+
+#define OFFB_VL offsetof(VexGuestRISCV64State, guest_vl)
+#define OFFB_VTYPE offsetof(VexGuestRISCV64State, guest_vtype)
+
/*------------------------------------------------------------*/
/*--- Integer registers ---*/
/*------------------------------------------------------------*/
@@ -413,6 +449,105 @@ static void putPC(/*OUT*/ IRSB* irsb, /*IN*/ IRExpr* e)
stmt(irsb, IRStmt_Put(OFFB_PC, e));
}
+/*------------------------------------------------------------*/
+/*--- Vector registers ---*/
+/*------------------------------------------------------------*/
+static Int offsetVReg(UInt vregNo)
+{
+ switch (vregNo) {
+ case 0:
+ return OFFB_V0;
+ case 1:
+ return OFFB_V1;
+ case 2:
+ return OFFB_V2;
+ case 3:
+ return OFFB_V3;
+ case 4:
+ return OFFB_V4;
+ case 5:
+ return OFFB_V5;
+ case 6:
+ return OFFB_V6;
+ case 7:
+ return OFFB_V7;
+ case 8:
+ return OFFB_V8;
+ case 9:
+ return OFFB_V9;
+ case 10:
+ return OFFB_V10;
+ case 11:
+ return OFFB_V11;
+ case 12:
+ return OFFB_V12;
+ case 13:
+ return OFFB_V13;
+ case 14:
+ return OFFB_V14;
+ case 15:
+ return OFFB_V15;
+ case 16:
+ return OFFB_V16;
+ case 17:
+ return OFFB_V17;
+ case 18:
+ return OFFB_V18;
+ case 19:
+ return OFFB_V19;
+ case 20:
+ return OFFB_V20;
+ case 21:
+ return OFFB_V21;
+ case 22:
+ return OFFB_V22;
+ case 23:
+ return OFFB_V23;
+ case 24:
+ return OFFB_V24;
+ case 25:
+ return OFFB_V25;
+ case 26:
+ return OFFB_V26;
+ case 27:
+ return OFFB_V27;
+ case 28:
+ return OFFB_V28;
+ case 29:
+ return OFFB_V29;
+ case 30:
+ return OFFB_V30;
+ case 31:
+ return OFFB_V31;
+ default:
+ vassert(0);
+ }
+}
+
+static const HChar* nameVReg(UInt iregNo)
+{
+ vassert(iregNo < 32);
+ static const HChar* names[32] = {
+ "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7",
+ "v8", "v9", "v10", "v11", "v12", "v13", "v14", "v15",
+ "v16", "v17", "v18", "v19", "v20", "v21", "v22", "v23",
+ "v24", "v25", "v26", "v27", "v28", "v29", "v30", "v31"};
+ return names[iregNo];
+}
+
+static IRExpr* getVReg(UInt vregNo, UInt offset, IRType ty)
+{
+ vassert(vregNo < 32);
+ return IRExpr_Get(offsetVReg(vregNo) + offset, ty);
+}
+
+static void putVReg(/*OUT*/ IRSB* irsb, UInt vregNo, UInt offset, /*IN*/ IRExpr* e)
+{
+ vassert(vregNo < 32);
+ stmt(irsb, IRStmt_Put(offsetVReg(vregNo) + offset, e));
+}
+
+
/*------------------------------------------------------------*/
/*--- Floating-point registers ---*/
/*------------------------------------------------------------*/
diff --git a/VEX/pub/libvex_guest_riscv64.h b/VEX/pub/libvex_guest_riscv64.h
index 31264b124..50bec58bd 100644
--- a/VEX/pub/libvex_guest_riscv64.h
+++ b/VEX/pub/libvex_guest_riscv64.h
@@ -128,8 +128,44 @@ typedef struct {
/* 576 */ ULong guest_LLSC_ADDR; /* Address of the transaction. */
/* 584 */ ULong guest_LLSC_DATA; /* Original value at ADDR, sign-extended. */
+#define VLEN 128
+ /* 592 */ V128 guest_v0;
+ V128 guest_v1;
+ V128 guest_v2;
+ V128 guest_v3;
+ V128 guest_v4;
+ V128 guest_v5;
+ V128 guest_v6;
+ V128 guest_v7;
+ V128 guest_v8;
+ V128 guest_v9;
+ V128 guest_v10;
+ V128 guest_v11;
+ V128 guest_v12;
+ V128 guest_v13;
+ V128 guest_v14;
+ V128 guest_v15;
+ V128 guest_v16;
+ V128 guest_v17;
+ V128 guest_v18;
+ V128 guest_v19;
+ V128 guest_v20;
+ V128 guest_v21;
+ V128 guest_v22;
+ V128 guest_v23;
+ V128 guest_v24;
+ V128 guest_v25;
+ V128 guest_v26;
+ V128 guest_v27;
+ V128 guest_v28;
+ V128 guest_v29;
+ V128 guest_v30;
+ V128 guest_v31;
+
+ /* 1104 */ ULong guest_vl;
+ /* 1112 */ ULong guest_vtype;
+
/* Padding to 16 bytes. */
- /* 592 */
} VexGuestRISCV64State;
/*------------------------------------------------------------*/
diff --git a/memcheck/mc_machine.c b/memcheck/mc_machine.c
index 34df0011a..acda0bd95 100644
--- a/memcheck/mc_machine.c
+++ b/memcheck/mc_machine.c
@@ -1489,6 +1489,41 @@ static Int get_otrack_shadow_offset_wrk ( Int offset, Int szB )
if (o == GOF(LLSC_ADDR) && sz == 8) return o;
if (o == GOF(LLSC_DATA) && sz == 8) return o;
+ if (o >= GOF(v0) && o+sz <= GOF(v0) +SZB(v0)) return GOF(v0);
+ if (o >= GOF(v1) && o+sz <= GOF(v1) +SZB(v1)) return GOF(v1);
+ if (o >= GOF(v2) && o+sz <= GOF(v2) +SZB(v2)) return GOF(v2);
+ if (o >= GOF(v3) && o+sz <= GOF(v3) +SZB(v3)) return GOF(v3);
+ if (o >= GOF(v4) && o+sz <= GOF(v4) +SZB(v4)) return GOF(v4);
+ if (o >= GOF(v5) && o+sz <= GOF(v5) +SZB(v5)) return GOF(v5);
+ if (o >= GOF(v6) && o+sz <= GOF(v6) +SZB(v6)) return GOF(v6);
+ if (o >= GOF(v7) && o+sz <= GOF(v7) +SZB(v7)) return GOF(v7);
+ if (o >= GOF(v8) && o+sz <= GOF(v8) +SZB(v8)) return GOF(v8);
+ if (o >= GOF(v9) && o+sz <= GOF(v9) +SZB(v9)) return GOF(v9);
+ if (o >= GOF(v10) && o+sz <= GOF(v10)+SZB(v10)) return GOF(v10);
+ if (o >= GOF(v11) && o+sz <= GOF(v11)+SZB(v11)) return GOF(v11);
+ if (o >= GOF(v12) && o+sz <= GOF(v12)+SZB(v12)) return GOF(v12);
+ if (o >= GOF(v13) && o+sz <= GOF(v13)+SZB(v13)) return GOF(v13);
+ if (o >= GOF(v14) && o+sz <= GOF(v14)+SZB(v14)) return GOF(v14);
+ if (o >= GOF(v15) && o+sz <= GOF(v15)+SZB(v15)) return GOF(v15);
+ if (o >= GOF(v16) && o+sz <= GOF(v16)+SZB(v16)) return GOF(v16);
+ if (o >= GOF(v17) && o+sz <= GOF(v17)+SZB(v17)) return GOF(v17);
+ if (o >= GOF(v18) && o+sz <= GOF(v18)+SZB(v18)) return GOF(v18);
+ if (o >= GOF(v19) && o+sz <= GOF(v19)+SZB(v19)) return GOF(v19);
+ if (o >= GOF(v20) && o+sz <= GOF(v20)+SZB(v20)) return GOF(v20);
+ if (o >= GOF(v21) && o+sz <= GOF(v21)+SZB(v21)) return GOF(v21);
+ if (o >= GOF(v22) && o+sz <= GOF(v22)+SZB(v22)) return GOF(v22);
+ if (o >= GOF(v23) && o+sz <= GOF(v23)+SZB(v23)) return GOF(v23);
+ if (o >= GOF(v24) && o+sz <= GOF(v24)+SZB(v24)) return GOF(v24);
+ if (o >= GOF(v25) && o+sz <= GOF(v25)+SZB(v25)) return GOF(v25);
+ if (o >= GOF(v26) && o+sz <= GOF(v26)+SZB(v26)) return GOF(v26);
+ if (o >= GOF(v27) && o+sz <= GOF(v27)+SZB(v27)) return GOF(v27);
+ if (o >= GOF(v28) && o+sz <= GOF(v28)+SZB(v28)) return GOF(v28);
+ if (o >= GOF(v29) && o+sz <= GOF(v29)+SZB(v29)) return GOF(v29);
+ if (o >= GOF(v30) && o+sz <= GOF(v30)+SZB(v30)) return GOF(v30);
+ if (o >= GOF(v31) && o+sz <= GOF(v31)+SZB(v31)) return GOF(v31);
+ if (o >= GOF(vl) && o+sz <= GOF(vl)+SZB(vl)) return GOF(vl);
+ if (o >= GOF(vtype) && o+sz <= GOF(vtype)+SZB(vtype)) return GOF(vtype);
+
VG_(printf)("MC_(get_otrack_shadow_offset)(riscv64)(off=%d,sz=%d)\n",
offset,szB);
tl_assert(0);
--
2.25.1
|
|
From: Fei Wu <fe...@in...> - 2023-05-26 13:57:45
|
I'm from the Intel RISC-V team, working on a RISC-V International
development partner project to add RISC-V vector (RVV) support to
Valgrind; the target tool is memcheck. My work is based on commit
71272b252977 of Petr's riscv64-linux branch; many thanks to Petr first
for his great work.
https://github.com/petrpavlu/valgrind-riscv64
This RFC is a starting point for RVV support on Valgrind. It is far from
complete and will take a huge amount of time, but I think it is more
effective to have some real code for discussion, so this series adds
enough RVV support to run memcpy/strcmp/strcpy/strlen/strncpy from:
https://github.com/riscv-non-isa/rvv-intrinsic-doc/tree/master/examples
The whole idea is to split vector instructions into scalar instructions,
which are already well supported on Petr's branch. The correctness of
binary translation (tool=none) is then simple to ensure, but the logic
of tool=memcheck must not be broken either; one of the keys is handling
instructions with a mask:
* for masked loads/stores, LoadG/StoreG are enabled, with the same
semantics as on other architectures
* for other instructions such as vadd, if the vector mask agnostic (vma)
policy is set to undisturbed, the original value of each masked-off
element is read first and then written back; its V bits do not change
across the write-back, so no extra guarded type like LoadG/StoreG is
necessary.
Pros
----
* by leveraging the existing scalar instruction support in Valgrind,
adding a new instruction usually involves only the frontend in
guest_riscv64_toIR; other parts are rarely touched, so the effort to
enable new instructions is much reduced.
* as the backend only sees scalar IR and generates scalar instructions,
it is possible to run valgrind ./vec-test on a non-RVV host.
Cons
----
* as this method splits RVV instructions at the frontend, there is less
opportunity to optimize at later stages, e.g. in vbits tracking.
* with a larger vlen such as 1K, a single RVV instruction can split into
up to 1K ops; besides the performance penalty, this also puts pressure
on other components such as tmp space. Some of this can be relieved by
grouping multiple elements together.
There are some alternatives, but none seems perfect:
* helper functions. These make tool=none much easier to get working, but
how well can they handle V+A bit tracking and the other tools? Generally
speaking, they should not become the general solution for too many
instructions.
* define RVV IR and pass it to the backend, instead of splitting it this
early. This requires much more effort; we should evaluate how much
benefit it can attain.
Finally, if the performance is tolerable, is this the right way to go?
Fei Wu (12):
riscv64: Starting Vector support, registers added
riscv64: Pass riscv guest_state for translation
riscv64: Add SyncupEnv & TooManyIR jump kinds
riscv64: Add LoadG/StoreG support
riscv64: Shift guest_state -2048 on calling helper
riscv64: Add cpu_state to TB
riscv64: Introduce dis_RV64V and add vsetvl
riscv64: Add load/store
riscv64: Add csrr vl
riscv64: add vfirst
riscv64: Add vmsgtu/vmseq/vmsne/vmsbf/vmsif/vmor/vmv/vid
riscv64: Add vadd
VEX/priv/guest_riscv64_toIR.c | 974 +++++++++++++++++++++++++++++-
VEX/priv/host_riscv64_defs.c | 133 ++++
VEX/priv/host_riscv64_defs.h | 23 +
VEX/priv/host_riscv64_isel.c | 89 ++-
VEX/priv/ir_defs.c | 8 +
VEX/priv/ir_opt.c | 4 +-
VEX/pub/libvex.h | 4 +
VEX/pub/libvex_guest_riscv64.h | 47 +-
VEX/pub/libvex_ir.h | 9 +-
coregrind/m_scheduler/scheduler.c | 17 +-
coregrind/m_translate.c | 5 +
coregrind/m_transtab.c | 26 +-
coregrind/pub_core_transtab.h | 5 +
memcheck/mc_machine.c | 35 ++
memcheck/mc_translate.c | 4 +
15 files changed, 1368 insertions(+), 15 deletions(-)
--
2.25.1
|