Activity for Daniel Serpell

  • Daniel Serpell posted a comment on discussion Open Discussion

    Hi again! In case you are interested, this is the code GCC produces for the z7_SwapBytes4 function; the 'th.' instructions are the non-standard instructions in this CPU. The code seems optimized enough:

        (gdb) Dump of assembler code for function z7_SwapBytes4:
        0x0000002aaab033f6 <+0>: beqz a1,0x2aaab03412 <z7_SwapBytes4+28>
        0x0000002aaab033f8 <+2>: and a5,a0,31
        0x0000002aaab033fc <+6>: beqz a5,0x2aaab03414 <z7_SwapBytes4+30>
        0x0000002aaab033fe <+8>: lw a5,0(a0)
        0x0000002aaab03400 <+10>: add a0,a0,4...

  • Daniel Serpell posted a comment on discussion Open Discussion

    Hi again! I discovered that GCC can compile for the C906 core in the SG2002, and that this uses the non-standard bit-manipulation extensions, including the "REV" instruction. So I updated the patch to include the "xtheadbb" extension:

        #define Z7_CPU_FAST_BSWAP_SUPPORTED
        #elif (!defined(MY_CPU_RISCV) || defined (__riscv_zbb) || defined(__riscv_xtheadbb) ) \
            && !defined(MY_CPU_SPARC) \
            && ( \
               (defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3))) \
            || (defined(__clang__) && Z7_has_builtin(__builtin_bswap16))...
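The idea behind the condition in the comment above can be sketched in isolation: use the compiler's byte-swap builtin only where it is expected to map to a fast instruction, and otherwise fall back to shifts and masks. This is a minimal sketch, not the 7-Zip code. The names SKETCH_FAST_BSWAP, SKETCH_SWAP4, sketch_swap4_portable and sketch_swap_bytes4 are hypothetical, and the compiler-predefined `__riscv` macro stands in for 7-Zip's MY_CPU_RISCV, which is an assumption about how that macro is derived.

```c
#include <stddef.h>
#include <stdint.h>

/* Portable fallback: reverse the four bytes of a 32-bit word with shifts and masks. */
static uint32_t sketch_swap4_portable(uint32_t v)
{
    return (v >> 24)
         | ((v >> 8) & 0x0000FF00u)
         | ((v << 8) & 0x00FF0000u)
         | (v << 24);
}

/* Assumption: the builtin is worth using on GCC/Clang unless the target is
   RISC-V without Zbb or XTheadBb (the two extensions named in the comment). */
#if defined(__GNUC__) \
    && (!defined(__riscv) || defined(__riscv_zbb) || defined(__riscv_xtheadbb))
  #define SKETCH_FAST_BSWAP 1
  #define SKETCH_SWAP4(v) __builtin_bswap32(v)
#else
  #define SKETCH_FAST_BSWAP 0
  #define SKETCH_SWAP4(v) sketch_swap4_portable(v)
#endif

/* Byte-swap every 32-bit item of a buffer in place. */
static void sketch_swap_bytes4(uint32_t *items, size_t count)
{
    for (size_t i = 0; i < count; i++)
        items[i] = SKETCH_SWAP4(items[i]);
}
```

Both paths return the same values, so the macro only selects which code the compiler emits. Building with, for example, `-march=rv64gc_xtheadbb` should define `__riscv_xtheadbb` on a GCC version that supports the extension and so select the builtin path.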

  • Daniel Serpell posted a comment on discussion Open Discussion

    Hi! I compiled with your patch and this is the result; it basically more than doubled the bandwidth:

        daniel@duo256:~/src/7zip-23.01+dfsg$ ./CPP/7zip/Bundles/Alone2/b/g/7zz b -mm=swap4 -mtic=26 -md25 -bt
        7-Zip (z) 23.01 (riscv64) : Copyright (c) 1999-2023 Igor Pavlov : 2023-06-20
         64-bit locale=en_US.UTF-8 Threads:1 OPEN_MAX:1024
         m=swap4 tic=26 d25
        Compiler: 13.2.0 GCC 13.2.0
        Linux : 5.10.4-20240329-1+ : #1 PREEMPT Tue May 7 08:14:28 UTC 2024 : riscv64
        PageSize:4KB hwcap:20112D
        LE
        1T CPU Freq (MHz): 991 986...

  • Daniel Serpell posted a comment on discussion Open Discussion

    Hi! With GCC-14, if you activate the "Zbkb" extension, the compiler uses the new "REV8" instruction: https://godbolt.org/z/8MG9zea9T But this CPU does not have the Zbkb extension (those are the "bit manipulation instructions"). Also with GCC-14, if you specify the "V" (vector) extension, the compiler vectorizes the code using the Gather instruction: https://godbolt.org/z/YYaWK38fv I don't have GCC-14 installed now, so I can't compile the full 7-zip code to test the speed. The problem is that Linux...

  • Daniel Serpell posted a comment on discussion Open Discussion

    Hi! I tried those, and indeed the benchmark took a lot less time:

        daniel@duo256:~$ 7z b -mm=* -mtic=28 -bt
        7-Zip 23.01 (riscv64) : Copyright (c) 1999-2023 Igor Pavlov : 2023-06-20
         64-bit locale=en_US.UTF-8 Threads:1 OPEN_MAX:1024
         m=* tic=28
        Compiler: 13.2.0 GCC 13.2.0
        Linux : 5.10.4-20240329-1+ : #1 PREEMPT Tue May 7 08:14:28 UTC 2024 : riscv64
        PageSize:4KB hwcap:20112D
        LE
        1T CPU Freq (MHz): 983 994 993 993 997 1022 1022
        RAM size: 235 MB, # CPU hardware threads: 1
        Dictionary reduced to: 23
        RAM usage: 119 MB,...

  • Daniel Serpell posted a comment on discussion Open Discussion

    Hi! The Milk-V Duo is a small, cheap SBC with an SG2002 SoC that integrates 256 MB of RAM. Tested on Debian sid:

        daniel@duo256:~$ 7z b
        7-Zip 23.01 (riscv64) : Copyright (c) 1999-2023 Igor Pavlov : 2023-06-20
         64-bit locale=en_US.UTF-8 Threads:1 OPEN_MAX:1024
        Compiler: 13.2.0 GCC 13.2.0
        Linux : 5.10.4-20240329-1+ : #1 PREEMPT Tue May 7 08:14:28 UTC 2024 : riscv64
        PageSize:4KB hwcap:20112D
        LE
        1T CPU Freq (MHz): 982 993 993 990 983 925 1022
        RAM size: 235 MB, # CPU hardware threads: 1
        RAM usage: 220 MB,...

  • Daniel Serpell committed [dfe1d9]

    Align start of bootloader at $800.

  • Daniel Serpell committed [c646ca]

    Move screen setup to program end; uses less memory for the loader.

  • Daniel Serpell committed [f9aa57]

    Factor out getting a block of 2 bytes. 10 bytes less.

  • Daniel Serpell committed [06e602]

    Pass block load address on A/X to GETBLK. 12 bytes less.

  • Daniel Serpell committed [ba3062]

    Inline FOPEN, 2 bytes less.

  • Daniel Serpell committed [905f5a]

    Factor out calling CIOV on channel 1. 11 bytes less.

  • Daniel Serpell committed [84ae00]

    Reuse X value to save one byte.

  • Daniel Serpell committed [579c03]

    Move uninitialized data outside of loadable area. 5 bytes less.

  • Daniel Serpell committed [6e11d0]

    Use $FFFF to initialize RUNAD. 7 bytes less.

  • Daniel Serpell committed [7cc0ad]

    Use proper 16-bit arithmetic when calculating block length. 12 bytes less.

  • Daniel Serpell committed [199d56]

    Replace JMP with BEQ, condition already known from above. 1 byte less.

  • Daniel Serpell committed [29e05d]

    Use CMP directly to wait for key press. 2 bytes less.

  • Daniel Serpell committed [9362a8]

    Use AND to compare both header bytes, suggested by xxl at AtariAge.

  • Daniel Serpell committed [3b09a7]

    Move error handler up to avoid using JMP, 9 bytes less.

  • Daniel Serpell committed [ed92dc]

    On CIO calls status is already in Y and CPU flags. 6 bytes less.

  • Daniel Serpell committed [cc4fa0]

    Reuse X = 0 in CIO call. 4 bytes less.

  • Daniel Serpell committed [82abcb]

    Inline call to SCREEN, used only once. 3 bytes less.

  • Daniel Serpell committed [5edfc8]

    Replace JSR/RTS with JMP. 3 bytes less.

  • Daniel Serpell committed [d03f5e]

    Inline DINI inside STARTUP, 4 bytes less.

  • Daniel Serpell committed [110272]

    Reuse X value in copy loop, 2 bytes less.

  • Daniel Serpell committed [r10]

    Adds output directory.

  • Daniel Serpell committed [r9]

    Cleanup sources.

  • Daniel Serpell committed [r8]

    Modify sources to allow compilation on Linux using Lazarus.

  • Daniel Serpell committed [r7]

    Move image to a new folder, cleaning up sources.

  • Daniel Serpell committed [r6]

    Remove auto-generated files.
