Perform validation on the input for the host device letter.
Use a text input to select the host device letter.
Move all host device options to the sub-menu.
Reorder P (printer) device options.
Clean up and rename "advanced H options" menu to "host device status".
Add menu option to set 'H' device letter.
Update AltirraOS to 3.41.
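The host device letter validation mentioned in the entries above could look roughly like the sketch below. The function name `parse_device_letter` and the exact rule (exactly one ASCII letter, normalized to upper case) are assumptions made for illustration; the actual check in the project may differ.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not taken from the project sources.
   Assumed rule: the text input must hold exactly one ASCII letter.
   On success the upper-case letter is stored in *out. */
bool parse_device_letter(const char *text, char *out)
{
    if (text == NULL || text[0] == '\0' || text[1] != '\0')
        return false;                 /* empty or longer than one character */

    unsigned char c = (unsigned char)text[0];
    if (c > 127 || !isalpha(c))
        return false;                 /* not an ASCII letter */

    *out = (char)toupper(c);          /* normalize 'h' to 'H' */
    return true;
}
```

A caller would reject the text input and keep the previous letter whenever the function returns false.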
Hi again! In case you are interested, this is the code GCC produces for the z7_SwapBytes4 function; the 'th.' instructions are the non-standard instructions of this CPU. The code seems optimized enough: (gdb) Dump of assembler code for function z7_SwapBytes4: 0x0000002aaab033f6 <+0>: beqz a1,0x2aaab03412 <z7_SwapBytes4+28> 0x0000002aaab033f8 <+2>: and a5,a0,31 0x0000002aaab033fc <+6>: beqz a5,0x2aaab03414 <z7_SwapBytes4+30> 0x0000002aaab033fe <+8>: lw a5,0(a0) 0x0000002aaab03400 <+10>: add a0,a0,4...
Hi again! I discovered that GCC can compile to target the C906 core in the SG2002, and this uses the non-standard bit manipulation extensions, including the "REV" instruction. So, I updated the patch to include the "xtheadbb" extension: #define Z7_CPU_FAST_BSWAP_SUPPORTED #elif (!defined(MY_CPU_RISCV) || defined (__riscv_zbb) || defined(__riscv_xtheadbb) ) \ && !defined(MY_CPU_SPARC) \ && ( \ (defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3))) \ || (defined(__clang__) && Z7_has_builtin(__builtin_bswap16))...
Hi! Compiled with your patch, this is the result, basically more than doubled the bandwidth: daniel@duo256:~/src/7zip-23.01+dfsg$ ./CPP/7zip/Bundles/Alone2/b/g/7zz b -mm=swap4 -mtic=26 -md25 -bt 7-Zip (z) 23.01 (riscv64) : Copyright (c) 1999-2023 Igor Pavlov : 2023-06-20 64-bit locale=en_US.UTF-8 Threads:1 OPEN_MAX:1024 m=swap4 tic=26 d25 Compiler: 13.2.0 GCC 13.2.0 Linux : 5.10.4-20240329-1+ : #1 PREEMPT Tue May 7 08:14:28 UTC 2024 : riscv64 PageSize:4KB hwcap:20112D LE 1T CPU Freq (MHz): 991 986...
Hi! With GCC-14, if you activate the "Zbkb" extension, the compiler uses the new "REV8" instruction: https://godbolt.org/z/8MG9zea9T However, this CPU does not have the Zbkb extension (those are the "bit manipulation instructions"). Also with GCC-14, if you specify the "V" (vector) extension, the compiler vectorizes the code using the Gather instruction: https://godbolt.org/z/YYaWK38fv I don't have GCC-14 installed right now, so I can't compile the full 7-zip code to test the speed. The problem is that Linux...
Hi! Tried those, indeed the bench took a lot less time: daniel@duo256:~$ 7z b -mm=* -mtic=28 -bt 7-Zip 23.01 (riscv64) : Copyright (c) 1999-2023 Igor Pavlov : 2023-06-20 64-bit locale=en_US.UTF-8 Threads:1 OPEN_MAX:1024 m=* tic=28 Compiler: 13.2.0 GCC 13.2.0 Linux : 5.10.4-20240329-1+ : #1 PREEMPT Tue May 7 08:14:28 UTC 2024 : riscv64 PageSize:4KB hwcap:20112D LE 1T CPU Freq (MHz): 983 994 993 993 997 1022 1022 RAM size: 235 MB, # CPU hardware threads: 1 Dictionary reduced to: 23 RAM usage: 119 MB,...
Hi! The MilkV Duo is a small, cheap SBC with an SG2002 SoC that integrates 256 MB of RAM. Tested on Debian sid: daniel@duo256:~$ 7z b 7-Zip 23.01 (riscv64) : Copyright (c) 1999-2023 Igor Pavlov : 2023-06-20 64-bit locale=en_US.UTF-8 Threads:1 OPEN_MAX:1024 Compiler: 13.2.0 GCC 13.2.0 Linux : 5.10.4-20240329-1+ : #1 PREEMPT Tue May 7 08:14:28 UTC 2024 : riscv64 PageSize:4KB hwcap:20112D LE 1T CPU Freq (MHz): 982 993 993 990 983 925 1022 RAM size: 235 MB, # CPU hardware threads: 1 RAM usage: 220 MB,...
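To illustrate what the byte-swap patch in the posts above enables, here is a minimal sketch of a 4-byte swap that uses the compiler builtin when one is available. The names `swap4_word` and `swap4_buffer` are hypothetical and this is not the 7-Zip implementation; the guard is deliberately simpler than the `Z7_CPU_FAST_BSWAP_SUPPORTED` condition quoted in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Reverse the byte order of one 32-bit word.
   With GCC or Clang, __builtin_bswap32 lets the compiler pick a native
   byte-reverse instruction when the target offers one; otherwise it
   falls back to shifts and masks, as the portable branch below does. */
static inline uint32_t swap4_word(uint32_t v)
{
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_bswap32(v);
#else
    return (v >> 24) | ((v >> 8) & 0x0000FF00u)
         | ((v << 8) & 0x00FF0000u) | (v << 24);
#endif
}

/* Hypothetical stand-in for a swap4-style filter:
   byte-swap every 32-bit word of a buffer in place. */
void swap4_buffer(uint32_t *data, size_t count)
{
    for (size_t i = 0; i < count; i++)
        data[i] = swap4_word(data[i]);
}
```

Whether the builtin turns into a single instruction depends on the extensions passed to the compiler, which is why the patch tests for `__riscv_zbb` and `__riscv_xtheadbb` before enabling the fast path.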
Align start of bootloader at $800.
Move screen setup to the end of the program, so the loader uses less memory.
Factor out getting a block of 2 bytes. 10 bytes less.
Pass block load address on A/X to GETBLK. 12 bytes less.
Inline FOPEN, 2 bytes less.
Factor out calling CIOV on channel 1. 11 bytes less.
Reuse X value to reduce one byte.
Move uninitialized data outside of loadable area. 5 bytes less.
Use $FFFF to initialize RUNAD. 7 bytes less.
Use proper 16-bit arithmetic when calculating block length. 12 bytes less.
Replace JMP with BEQ, condition already known from above. 1 byte less.
Use CMP directly to wait for key press. 2 bytes less.
Use AND to compare both header bytes, suggested by xxl at Atariage.
Move error handler up to avoid using JMP, 9 bytes less.
On CIO calls status is already in Y and CPU flags. 6 bytes less.
Reuse X = 0 in CIO call. 4 bytes less.
Inline call to SCREEN, used only once. 3 bytes less.
Replace JSR/RTS with JMP. 3 bytes less.
Inline DINI inside STARTUP, 4 bytes less.
Reuse X value in copy loop, 2 bytes less.
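Two of the loader changes above, the AND-based header check and the 16-bit block length, are easier to follow in a higher-level sketch. The C below only models the logic and is not the loader's 6502 code; the function names are hypothetical. It assumes the usual Atari binary load layout: a $FF $FF marker, then little-endian start and end addresses, then end - start + 1 data bytes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Both header bytes are $FF exactly when their bitwise AND is $FF,
   so one AND plus one compare replaces two separate compares. */
bool is_header_marker(uint8_t b0, uint8_t b1)
{
    return (uint8_t)(b0 & b1) == 0xFF;
}

/* Read a little-endian 16-bit word, as stored in the segment header. */
uint16_t read_word_le(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Block length with full 16-bit arithmetic: end - start + 1.
   The result wraps to 0 for a block spanning the whole 64 KiB space. */
uint16_t block_length(uint16_t start, uint16_t end)
{
    return (uint16_t)(end - start + 1);
}
```

For a segment header of `00 08 FF 08`, the start address is $0800, the end address is $08FF, and the block is $100 bytes long.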
Add output directory.
Clean up sources.
Modify sources to allow compilation on Linux using Lazarus.
Move image to a new folder, cleaning up sources.
Remove auto-generated files.