User Activity

  • Posted a comment on discussion Open Discussion on 7-max

    Hi again! In case you are interested, this is the code produced by GCC for the z7_SwapBytes4 functions, the 'th.' instructions are the non-standard instructions in this CPU. The code seems optimized enough: (gdb) Dump of assembler code for function z7_SwapBytes4: 0x0000002aaab033f6 <+0>: beqz a1,0x2aaab03412 <z7_SwapBytes4+28> 0x0000002aaab033f8 <+2>: and a5,a0,31 0x0000002aaab033fc <+6>: beqz a5,0x2aaab03414 <z7_SwapBytes4+30> 0x0000002aaab033fe <+8>: lw a5,0(a0) 0x0000002aaab03400 <+10>: add a0,a0,4...

  • Posted a comment on discussion Open Discussion on 7-max

    Hi again! I discovered that GCC can compile to target the C906 core in the SG2002, and this uses the non-standard bit manipulation extensions, includeing the "REV" ins. So, I updated the patch to include the "xtheadbb" extension: #define Z7_CPU_FAST_BSWAP_SUPPORTED #elif (!defined(MY_CPU_RISCV) || defined (__riscv_zbb) || defined(__riscv_xtheadbb) ) \ && !defined(MY_CPU_SPARC) \ && ( \ (defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3))) \ || (defined(__clang__) && Z7_has_builtin(__builtin_bswap16))...

  • Posted a comment on discussion Open Discussion on 7-max

    Hi! Compiled with your patch, this is the result, basically more than doubled the bandwidth: daniel@duo256:~/src/7zip-23.01+dfsg$ ./CPP/7zip/Bundles/Alone2/b/g/7zz b -mm=swap4 -mtic=26 -md25 -bt 7-Zip (z) 23.01 (riscv64) : Copyright (c) 1999-2023 Igor Pavlov : 2023-06-20 64-bit locale=en_US.UTF-8 Threads:1 OPEN_MAX:1024 m=swap4 tic=26 d25 Compiler: 13.2.0 GCC 13.2.0 Linux : 5.10.4-20240329-1+ : #1 PREEMPT Tue May 7 08:14:28 UTC 2024 : riscv64 PageSize:4KB hwcap:20112D LE 1T CPU Freq (MHz): 991 986...

  • Posted a comment on discussion Open Discussion on 7-max

    Hi! With GCC-14, if you activate the "Zbkb" extension, the compiler uses the new "REV8" instruction: https://godbolt.org/z/8MG9zea9T But, this CPU does not have the Zbkb extension (those are the "bit manipulation instructions"). Also with GCC-14, if you specify the "V" (vector) extension, the compiler vectorizes the code using the Gather instruction: https://godbolt.org/z/YYaWK38fv I don't have GCC-14 installed now, so I can't compile the full 7-zip code to test the speed. The problem is that Linux...

  • Posted a comment on discussion Open Discussion on 7-max

    Hi! Tried those, indeed the bench took a lot less time: daniel@duo256:~$ 7z b -mm=* -mtic=28 -bt 7-Zip 23.01 (riscv64) : Copyright (c) 1999-2023 Igor Pavlov : 2023-06-20 64-bit locale=en_US.UTF-8 Threads:1 OPEN_MAX:1024 m=* tic=28 Compiler: 13.2.0 GCC 13.2.0 Linux : 5.10.4-20240329-1+ : #1 PREEMPT Tue May 7 08:14:28 UTC 2024 : riscv64 PageSize:4KB hwcap:20112D LE 1T CPU Freq (MHz): 983 994 993 993 997 1022 1022 RAM size: 235 MB, # CPU hardware threads: 1 Dictionary reduced to: 23 RAM usage: 119 MB,...

  • Posted a comment on discussion Open Discussion on 7-max

    Hi! The MilkV Duo is a small and cheap SBC, with an SG2002 SOC that integrates 256MB RAM, tested on Debian SID: daniel@duo256:~$ 7z b 7-Zip 23.01 (riscv64) : Copyright (c) 1999-2023 Igor Pavlov : 2023-06-20 64-bit locale=en_US.UTF-8 Threads:1 OPEN_MAX:1024 Compiler: 13.2.0 GCC 13.2.0 Linux : 5.10.4-20240329-1+ : #1 PREEMPT Tue May 7 08:14:28 UTC 2024 : riscv64 PageSize:4KB hwcap:20112D LE 1T CPU Freq (MHz): 982 993 993 990 983 925 1022 RAM size: 235 MB, # CPU hardware threads: 1 RAM usage: 220 MB,...

  • Committed [dfe1d9]

    Align start of bootloader at $800.

  • Committed [c646ca]

    Move screen setup to program end, uses less memory for the loader.

View All

Personal Data

Username:
dserpell
Joined:
2004-09-28 20:02:42

Projects

This is a list of open source software projects that Daniel Serpell is associated with:

Personal Tools