Activity for 7-max

  • necros necros posted a comment on ticket #2

    We'll test eventually, hopefully; usage is low because the project is not on GitHub.

  • Sam Tansy Sam Tansy posted a comment on discussion Open Discussion

    "Sometimes we could need stdout for timer."
    There is no way to safely pipe anything while timer is spoiling stdout. How do you imagine this working?
        x:\> timer.exe gzip.exe -c a.exe > a.exe.gz
    It will run, sure, but the resulting archive is corrupt. That's it. The same goes for piping, or simply decompressing some database backup through stdout:
        x:\> timer.exe lzma.exe -dc db.lzma | process-a-users-script
    Good luck.
    "Maybe we should use some switch between stdout/stderr."
    That makes more sense. Printing to stderr should...

  • Igor Pavlov Igor Pavlov posted a comment on ticket #2

    7-max still has few users. I hoped that someone would try to test games with 7-max, but nobody has. So we don't know how useful 7-max can be.

  • Igor Pavlov Igor Pavlov posted a comment on discussion Open Discussion

    Sometimes we could need stdout for timer. For example, that way we can see which operation was performed if there are many calls of timer in the same file. Maybe we should add a switch to select between stdout and stderr.

  • necros necros created ticket #2

    Create 7max Github page

  • Sam Tansy Sam Tansy posted a comment on discussion Open Discussion

    I used timer to measure processing time and, to my surprise, when I redirected the tested application's output to a file, timer's output was redirected too. What's worse, it is mixed with the tested program's output. This should never happen. It should be printed to standard error, not to standard output. I checked the source and, sure enough, it uses `printf`. It could use `fprintf(stderr, ...)`.
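
    The suggestion can be sketched in C as follows. This is a minimal illustration, not code from the timer source: `report_elapsed` and `report_stream` are hypothetical names, and the `use_stdout` flag stands in for whatever switch the tool might eventually offer.

```c
#include <stdio.h>

/* Hypothetical helper (not from the timer source): writes the timing
   summary to the stream chosen by the caller instead of always using
   printf, which is hard-wired to stdout. */
int report_elapsed(FILE *out, double seconds)
{
    return fprintf(out, "Elapsed time: %.3f s\n", seconds);
}

/* Hypothetical selector: stderr by default, so the tested program's
   stdout stays clean when redirected or piped; a non-zero use_stdout
   keeps the old behaviour. */
FILE *report_stream(int use_stdout)
{
    return use_stdout ? stdout : stderr;
}
```

    With this shape, a call such as `report_elapsed(report_stream(0), 1.5)` leaves redirected or piped stdout data untouched.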

  • Animadversor Animadversor posted a comment on discussion Open Discussion

    "Your i5-3570 can install and work in Windows 11 without any problem?"
    I had to bypass the checks for TPM v. 2.0 (mine is v. 1.2), but Windows 11 installed and is running with no problems.
    "Please try 7-max test again after reboot, and close all another programs including browser."
    Done; file attached.

  • Igor Pavlov Igor Pavlov posted a comment on discussion Open Discussion

    Can your i5-3570 install and run Windows 11 without any problems? Please try the 7-max test again after a reboot, and close all other programs, including the browser.

  • Animadversor Animadversor posted a comment on discussion Open Discussion

    Intel Core i5-3570
    L1 Cache: Instruction: 4 x 32 KBytes, Data: 4 x 32 KBytes
    L2 Cache: Integrated: 4 x 256 KBytes
    L3 Cache: 6 MBytes
    Instruction TLB: 2MB/4MB Pages, Fully associative, 8 entries
    Data TLB: 4 KB Pages, 4-way set associative, 64 entries
    Total Memory Size: 16 GBytes
    Maximum Supported Memory Clock: 800.0 MHz
    Current Memory Clock: 653.7 MHz
    Current Timing (tCAS-tRCD-tRP-tRAS): 9-9-9-24
    Memory Channels Supported: 2
    Memory Channels Active: 2
    960 GB SSD
    Microsoft Windows [Version 10.0.26120...

  • edison edison posted a comment on discussion Open Discussion

    Maybe there are some "Branch Prediction Optimization" changes:
    https://wccftech.com/amd-branch-prediction-optimization-ryzen-9000-7000-cpus-available-windows-11-23h2/
    https://wccftech.com/amd-ryzen-9000-gaming-performance-update-revised-testing-parity-intel-14th-gen-cpus-optimized-branch-prediction-boost/

  • Igor Pavlov Igor Pavlov posted a comment on discussion Open Discussion

    R-110 probably overflows mop cache.

  • Igor Pavlov Igor Pavlov posted a comment on discussion Open Discussion

    R-xxx tests will overflow caches. xxx is the number of additional instructions between branches, but there are also several branches per iteration, so the total number of instructions per loop iteration can be big. As I remember, there were some changes in 22.00 for the Linux version, because it used C code instead of asm code. Maybe there were some changes for Apple M1 support. The Windows version used asm code, so it doesn't depend on the C compiler. Why does KB5041587 affect results?

  • edison edison modified a comment on discussion Open Discussion

    I took some time today to resolve the compilation issues in Visual Studio 2022. After running some tests, I found that the results for 2200 and 1400 were not significantly different (with the Windows KB5041587 update). By the way, which tests overflow the uop cache?

  • edison edison posted a comment on discussion Open Discussion

    I took some time today to resolve the compilation issues in Visual Studio 2022. After running some tests, I found that the results for 2200 and 1400 were not significantly different. By the way, which tests overflow the uop cache?

  • Igor Pavlov Igor Pavlov posted a comment on discussion Open Discussion

    It was not tested with new compilers. I just wanted to get some changes into the new version. You can fix that code to ignore the warning/error.

  • edison edison posted a comment on discussion Open Discussion

    **********************************************************************
    ** Visual Studio 2017 Developer Command Prompt v15.9.64
    ** Copyright (c) 2017 Microsoft Corporation
    **********************************************************************
    [vcvarsall.bat] Environment initialized for: 'x86'
    C:\Program Files (x86)\Microsoft Visual Studio\2017\Community>cd D:\backup\benchmark\7bench2200\7bench2200-src\CPP\Utils\CPUTest\MemLat
    C:\Program Files (x86)\Microsoft Visual Studio\2017\Community>d:
    D:\backup\benchmark\7bench2200\7bench2200-src\CPP\Utils\CPUTest\MemLat>nmake...

  • Igor Pavlov Igor Pavlov posted a comment on discussion Open Discussion

    cd /D CPP\Utils\CPUTest\MemLat\
    nmake

  • edison edison posted a comment on discussion Open Discussion

    Thanks for your reply. I downloaded version 2200, but I don't know how to build it. I tried nmake -f CPP/Build.mak, but got an error:
        nmake -f build.mak
        Microsoft (R) Program Maintenance Utility Version 14.40.33813.0
        Copyright (C) Microsoft Corporation. All rights reserved.
        link -nologo -OPT:REF -OPT:ICF /LARGEADDRESSAWARE /FIXED:NO -out:o\ oleaut32.lib ole32.lib user32.lib advapi32.lib shell32.lib
        LINK : fatal error LNK1104: cannot open file 'o\'
        NMAKE : fatal error U1077: 'link -nologo -OPT:REF...

  • Igor Pavlov Igor Pavlov modified a comment on discussion Open Discussion

    If you want to test latest version of 7-benchmark, you can download it here (source code only): https://sourceforge.net/projects/sevenmax/files/7-Benchmark/7bench2200-src.7z/download

  • Igor Pavlov Igor Pavlov posted a comment on discussion Open Discussion

    if you want some new version (source code only), you can download it here: https://sourceforge.net/projects/sevenmax/files/7-Benchmark/7bench2200-src.7z/download

  • 7-max 7-max released /7-Benchmark/7bench2200-src.7z

  • Igor Pavlov Igor Pavlov posted a comment on discussion Open Discussion

    That "r2" test doesn't depend on the L2 cache. But if you run the Windows version with pipelen r, there are some tests that have big code. That big code overflows the micro-op cache, so we can try to estimate the micro-op cache miss penalty.

  • edison edison posted a comment on discussion Open Discussion

    perf stat -C 0 -e cpu_core/instructions/,cpu_core/branch/,cpu_core/branch-miss/,cpu_core/l2_request.all/ taskset -c 0 ./pipelen r2
    PipeLen64 14.00 : Igor Pavlov : Public domain : 2014-01-04
    r2
    Branches       0       1     0-1  Random    Len1    Len2
          32    4.39   24.52   13.98   12.91   -3.09   -2.14
          64    7.56   28.58   17.79   18.64    1.14    1.69
         128    9.24   25.25   19.77   19.35    4.21   -0.83
         256    9.96   26.09   17.05   17.45   -1.14    0.80
         512   10.30   26.49   17.42   16.84   -3.11   -1.16
         1-K   10.47   26.69   17.59   17.95   -1.26    0.72
         2-K    8.70   26.78   17.67   17.92    0.36    0.50
         4-K    8.73...

  • Pipe Pipe posted a comment on discussion Open Discussion

    Intel i7-8750H CPU, GTX 1050Ti Max Q, 1Tb SSD, 16Gb 2666 MT/s

  • Igor Pavlov Igor Pavlov posted a comment on discussion Open Discussion

    The CPU Usage column shows an incorrect value for the 1-thread benchmark. I don't know the reason for that issue yet. Maybe something was changed in Windows 11.

  • Abraxas Lee Abraxas Lee posted a comment on discussion Open Discussion

    GIGABYTE B760M GAMING AC DDR4
    Microsoft Windows 11 Professional (x64) Build 22631.3880 (23H2)
    2x 32GB = 64GB DDR4-3600 18-22-22-42 CR2
    Intel Core i5-13500 (Raptor Lake-S 6+8)
    C:\Program Files\7-Zip>7z b -mmt1
    7-Zip 24.07 (x64) : Copyright (c) 1999-2024 Igor Pavlov : 2024-06-19
    mt1
    Compiler: MSC 1400.140040310
    Windows 10.0 22631 : Microsoft Hv : Hv#1 : 10.0.22621.3.0.3880
    x64 6.BF02 threads:20 128TB f:5F310C2774C
    13th Gen Intel(R) Core(TM) i5-13500 (B06F2) (35->35)
    1T CPU Freq (MHz): 4398 4459 4476...

  • Igor Pavlov Igor Pavlov posted a comment on discussion Open Discussion

    Size          6      12
    24-K       3.97    3.96
    32-K       4.16    3.97
    48-K      11.86    3.96
    64-K      11.87    3.97
    96-K      11.92    3.97
    128-K     11.83    3.97
    192-K     11.91    3.97
    256-K     11.94    3.97
    384-K     11.97    3.97
    512-K     15.11    3.97
    768-K     34.19    3.96
    1024-K    38.38    5.17
    1536-K    43.94    9.58
    2-M       46.46   13.01
    3-M       50.13   16.74
    4-M       50.88   18.17
    6-M       52.35   18.64
    8-M       53.05   19.14
    column 6 (256-K) : L1 cache miss - 12 cycles for (L2 latency)
    column 12: ( 8-M) : ~19 cycles, 19 cycles includes both L1 cache miss and TLB miss.
    19 cycles - 12 cycles = 7 cycles : that is DTLB L1...
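
    The subtraction in this comment can be written as a small C helper. This is only a sketch of the arithmetic: `tlb_miss_penalty` is a hypothetical name, and the inputs are the rounded figures from the comment (about 19 cycles for column 12 at 8-M, about 12 cycles for column 6 at 256-K).

```c
/* Hypothetical helper: estimates the extra cycles attributed to a TLB
   miss by subtracting the latency of an access that only misses L1
   (served from L2) from the latency of an access that misses both the
   L1 cache and the TLB. */
double tlb_miss_penalty(double latency_with_tlb_miss,
                        double latency_l1_miss_only)
{
    return latency_with_tlb_miss - latency_l1_miss_only;
}
```

    Calling `tlb_miss_penalty(19.0, 12.0)` reproduces the 7-cycle figure from the comment.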

  • Abraxas Lee Abraxas Lee posted a comment on discussion Open Discussion

    Sorry: Turbo Boost is set to 2208 MHz in the settings, and the cpufreq governor is set to schedutil.

  • edison edison posted a comment on discussion Open Discussion

    https://www.7-cpu.com/cpu/Zen3.html
    4 KB pages mode (64-bit, Linux)
    Data TLB L1: 64 items (about 800 KB of memory). ?-assoc. Miss penalty = ? cycles. Parallel miss: ? cycle per access
    Data TLB L2: 2048 items. 8-way. Miss penalty = ? cycles. Parallel miss: ? cycles per access (read from L3)
    Size     Latency   Increase   Description
    32 K        4
    64 K        8          4      + 8 (L2)
    128 K      10          2
    256 K      11          1
    512 K      21         10      + 7 (L1 TLB miss)
    1 M        36         15      + 35 (L3)
    2 M        46         10
    4 M        51          5
    8 M        53          2
    My 5800X is 3.96 cycle in 64K Column "12", not 8 cycle,...

  • Igor Pavlov Igor Pavlov posted a comment on discussion Open Discussion

    The results show 2200 MHz. Is that a timer bug? Or is the real frequency 2200 MHz instead of 2000 MHz?

  • Abraxas Lee Abraxas Lee posted a comment on discussion Open Discussion

    GIGABYTE Z390 AORUS PRO WIFI
    Microsoft Windows 11 Professional (x64) Build 22631.3958 (23H2)
    KHX3200C18D4/16G * 4 = 64G 18-21-21-39 CR2
    PS C:\Program Files\7-Zip> .\7z.exe b -mmt1
    7-Zip 24.07 (x64) : Copyright (c) 1999-2024 Igor Pavlov : 2024-06-19
    mt1
    Compiler: MSC 1400.140040310
    Windows 10.0 22631 : Microsoft Hv : Hv#1 : 10.0.22621.3.0.3958
    x64 6.9E0D threads:8 128TB f:5F110C2774C
    Intel(R) Core(TM) i7-9700K CPU @ 3.60GHz (906ED) (FA->FA)
    1T CPU Freq (MHz): 4593 4588 4572 4552 4553 4558 4574
    RAM...

  • Abraxas Lee Abraxas Lee posted a comment on discussion Open Discussion

    SoC: Rockchip RK3399
    CPU: big.LITTLE, arm64/aarch64, Dual-Core Cortex-A72 (up to 2.0GHz) + Quad-Core Cortex-A53 (up to 1.5GHz)
    RAM: 4GB LPDDR4
    SYS: OpenWRT 23.05.4 (kernel 6.6.43)
    mt1
    Compiler: ver:9.2.1 20191025 GCC 9.2.1 : UNALIGNED
    Linux : 6.6.43 : #0 SMP PREEMPT Sat Jul 27 17:42:11 2024 : aarch64
    PageSize:4KB THP:always hwcap:8FF:CRC32:SHA1:SHA2:AES:ASIMD LE
    1T CPU Freq (MHz): 1469 1446 2063 2203 2190 2200 2201
    RAM size: 3858 MB, # CPU hardware threads: 6
    RAM usage: 437 MB, # Benchmark threads:...

  • Igor Pavlov Igor Pavlov posted a comment on discussion Open Discussion

    Column "12" is for TLB misses. The Zen result was measured without Page Table Entry (PTE) Coalescing.

  • edison edison posted a comment on discussion Open Discussion

    Thanks for your reply. I found that https://www.7-cpu.com/cpu/Zen.html says:
        512 K 20 4 + 8 (L1 TLB miss)
    How do you get the +8 cycle number?

  • Igor Pavlov Igor Pavlov posted a comment on discussion Open Discussion

    Column "12" is for the 4 KB page TLB. But measuring can be complicated on AMD processors if Page Table Entry (PTE) Coalescing is active, where one TLB entry can cover 4 pages: 4 * 4 KB = 16 KB.

  • edison edison posted a comment on discussion Open Discussion

    Start-Process -FilePath "MemLat64.exe" -ArgumentList "512 p" -Wait -NoNewWindow -PassThru | ForEach-Object { $_.ProcessorAffinity = 0x02 }
    MemLat64 14.00 : Igor Pavlov : Public domain : 2014-01-04
    512 p
    Size 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
    4-K 3.98 3.98 3.98 3.97 3.97 3.97 3.97 3.96 3.96 3.96 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    6-K 3.98 3.97 3.97 3.97 3.97 3.97 3.96 3.96 3.96 3.96 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    8-K 3.98 3.98 3.98...

  • Florenz Janus Mina Florenz Janus Mina posted a comment on discussion Open Discussion

    Windows 10 IoT Enterprise LTSC

  • Skymmer Skymmer posted a comment on discussion Open Discussion

    O_0 My goodness! I waited for 7max update for almost 18 years! And here it is! Finally! Thanks Igor. My version of report taken on i7-3770K + Windows 7 A couple of tests later...

  • Igor Pavlov Igor Pavlov posted a comment on discussion Open Discussion

    Thanks. 7-max currently has a small number of users, so development has been suspended. If the gain from 7-max can be confirmed and the number of users grows, I can try to improve the interface code.

  • Piotr Biesiada Piotr Biesiada posted a comment on discussion Open Discussion

    I can test in about two weeks on 12900K and RTX 4090. Games. Will let you know if I don't forget. However two things I see that could be taken care of:
    1. drag & drop on 7-max icon - run a program, not throw an error
    2. drag & drop on empty space in 7-max window - the same

  • ivan ivan posted a comment on discussion Open Discussion

    Hello! This software seems pretty interesting. Thank you!

  • Igor Pavlov Igor Pavlov posted a comment on discussion Open Discussion

    I don't plan intensive development of 7-max for now. We have a small number of responses about the usefulness of the program. I hoped that some games could get a performance gain with 7-max, but no one has tried 7-max with games yet. I suppose game benchmarking is not so simple. It would be best if a game benchmarking site or users of a gaming forum tested 7-max with games, so that the results can be compared accurately.

  • Piotr Biesiada Piotr Biesiada posted a comment on discussion Open Discussion

    Can you fix dragging and dropping a program onto the 7-max icon?

  • Igor Pavlov Igor Pavlov posted a comment on discussion Open Discussion

    Thanks. Can you also test the WinRAR benchmark with multithreading on and off?

  • Mike H Mike H modified a comment on discussion Open Discussion

    Here's another b'mark. System: AMD Ryzen 7950X, 64 GB RAM, 2 TB SSD, Win11 23H2

  • Mike H Mike H posted a comment on discussion Open Discussion

    Here's another b'mark. System: AMD Ryzen 7950X, 64 GB RAM, Win11 23H2

  • Igor Pavlov Igor Pavlov modified a comment on discussion Open Discussion

    In 7-Zip 24.05 there is a new RISCV filter. You can benchmark it like this:
        7zz b -mm=riscv -md25 -mtic=29 -mmt1
        7zz b -mm=riscv -md25 -mtic=29 -mmt1 -mfile=big_riscv_elf_file_path
    where big_riscv_elf_file_path is some big ELF file for RISC-V that has a big .text section. The first test uses random data. You can see a difference in speed because a real RISC-V ELF file has many branch mispredictions and it converts more data. So a real file will be slower for the filter to process. You can get difference in compression...

  • Igor Pavlov Igor Pavlov posted a comment on discussion Open Discussion

    In 7-Zip 24.05 there is a new "riscv" filter. You can benchmark it like this:
        7zz b -mm=riscv -md25 -mtic=29 -mmt1
        7zz b -mm=riscv -md25 -mtic=29 -mmt1 -mfile=big_riscv_elf_file_path
    where big_riscv_elf_file_path is some big ELF file for RISC-V that has a big .text section. The first test uses random data. You can see a difference in speed because a real RISC-V ELF file has many branch mispredictions and it converts more data. So a real file will be slower for the filter to process. You can get difference in compression...

  • Igor Pavlov Igor Pavlov modified a comment on discussion Open Discussion

    Thanks! I'll use it. I also checked the new GCC/CLANG compilers on godbolt.org. These compilers can use rev instructions for my macro:
        #define Z7_BSWAP32_CONST(v) \
          ( (((unsigned)(v) << 24) ) \
          | (((unsigned)(v) << 8) & (unsigned)0xff0000) \
          | (((unsigned)(v) >> 8) & (unsigned)0xff00 ) \
          | (((unsigned)(v) >> 24) ))
    So we can probably get good performance for swap4 even without the __riscv_zbb and __riscv_xtheadbb checks, if the extensions are available:
        -O2 -march=rv64imafdczbkb
        -O2 "-mcpu=thead-c906"
    Also...
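
    A minimal sketch of how the quoted macro behaves when wrapped in a function. The macro text is taken from the comment; the wrappers `swap_word` and `swap_buffer` are hypothetical names added for illustration, and the sketch assumes `unsigned` is 32 bits wide.

```c
#include <stddef.h>
#include <stdint.h>

/* Macro as quoted in the comment: a 32-bit byte swap written with plain
   shifts and masks, so it needs no compiler builtin. */
#define Z7_BSWAP32_CONST(v) \
  ( (((unsigned)(v) << 24)                      ) \
  | (((unsigned)(v) <<  8) & (unsigned)0xff0000) \
  | (((unsigned)(v) >>  8) & (unsigned)0xff00  ) \
  | (((unsigned)(v) >> 24)                      ))

/* Hypothetical wrapper: swaps the byte order of one 32-bit word.
   Assumes unsigned is 32 bits wide. */
uint32_t swap_word(uint32_t v)
{
    return (uint32_t)Z7_BSWAP32_CONST(v);
}

/* Hypothetical loop in the spirit of a swap4 workload: swaps every
   32-bit item of a buffer in place. */
void swap_buffer(uint32_t *items, size_t count)
{
    size_t i;
    for (i = 0; i < count; i++)
        items[i] = swap_word(items[i]);
}
```

    Whether a given compiler turns the shift-and-mask pattern into a single byte-reverse instruction depends on the target flags, as the comment notes; checking the generated assembly is the only reliable way to confirm it.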

  • Igor Pavlov Igor Pavlov posted a comment on discussion Open Discussion

    Thanks! I'll use it. I also checked the new GCC/CLANG compilers on godbolt.org. These compilers can use rev instructions for my macro:
        #define Z7_BSWAP32_CONST(v) \
          ( (((unsigned)(v) << 24) ) \
          | (((unsigned)(v) << 8) & (unsigned)0xff0000) \
          | (((unsigned)(v) >> 8) & (unsigned)0xff00 ) \
          | (((unsigned)(v) >> 24) ))
    So we can probably get good performance for swap4 even without the __riscv_zbb and __riscv_xtheadbb checks, if the extensions are available:
        -O2 -march=rv64imafdcvzbkb
        -O2 "-mcpu=thead-c906"...

  • Daniel Serpell Daniel Serpell posted a comment on discussion Open Discussion

    Hi again! In case you are interested, this is the code produced by GCC for the z7_SwapBytes4 functions, the 'th.' instructions are the non-standard instructions in this CPU. The code seems optimized enough:
        (gdb) Dump of assembler code for function z7_SwapBytes4:
        0x0000002aaab033f6 <+0>: beqz a1,0x2aaab03412 <z7_SwapBytes4+28>
        0x0000002aaab033f8 <+2>: and a5,a0,31
        0x0000002aaab033fc <+6>: beqz a5,0x2aaab03414 <z7_SwapBytes4+30>
        0x0000002aaab033fe <+8>: lw a5,0(a0)
        0x0000002aaab03400 <+10>: add a0,a0,4...

  • Daniel Serpell Daniel Serpell posted a comment on discussion Open Discussion

    Hi again! I discovered that GCC can compile to target the C906 core in the SG2002, and this uses the non-standard bit manipulation extensions, including the "REV" instruction. So, I updated the patch to include the "xtheadbb" extension:
        #define Z7_CPU_FAST_BSWAP_SUPPORTED
        #elif (!defined(MY_CPU_RISCV) || defined (__riscv_zbb) || defined(__riscv_xtheadbb) ) \
          && !defined(MY_CPU_SPARC) \
          && ( \
             (defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3))) \
          || (defined(__clang__) && Z7_has_builtin(__builtin_bswap16))...

  • 7-max 7-max released /7-max/24.01/7max2401-src.7z

  • 7-max 7-max released /7-max/24.01/7max2401.exe

  • Igor Pavlov Igor Pavlov modified a comment on discussion Open Discussion

    Maybe the Windows version affects the results. We have seen some good 7-max results in Windows 10/11, but with newer CPUs than Core 2. It may also be better for 7-max if the RAM size is larger. 7-max can be good if memory allocations are rare, for example with a big file set for compression, where the program allocates a lot of memory at the start and then works with the allocated data for a long time. In that case the time overhead of large page allocation will be smaller than the gain for the main work.

  • Igor Pavlov Igor Pavlov modified a comment on discussion Open Discussion

    Maybe the Windows version affects the results. We have seen some good 7-max results in Windows 10/11, but with newer CPUs than Core 2. It may also be better for 7-max if the RAM size is larger. 7-max can be good if memory allocations are rare, for example with a big file set for compression, where the program allocates a lot of memory at the start and then works with the allocated data for a long time. In that case the time overhead of large page allocation will be smaller than the gain for the main work.

  • Igor Pavlov Igor Pavlov posted a comment on discussion Open Discussion

    Maybe the Windows version affects the results. We have seen some good 7-max results in Windows 10/11, but with newer CPUs. It may also be better for 7-max if the RAM size is larger. 7-max can be good if memory allocations are rare, for example with a big file set for compression, where the program allocates a lot of memory at the start and then works with the allocated data for a long time. In that case the time overhead of large page allocation will be smaller than the gain for the main work.

  • Andrii Andrii posted a comment on discussion Open Discussion

    "Probably 7-max is not good for that Core 2 processor."
    I think this "optimization" requires CPUs newer than about 2015-2016, or DDR4+ configurations :)

  • Igor Pavlov Igor Pavlov posted a comment on discussion Open Discussion

    Thanks. Probably 7-max is not good for that Core 2 processor. There is some speed degradation in some cases. Maybe that is because large page allocation is slow on such Core 2 / Windows 7 systems.

  • PassionateUser PassionateUser posted a comment on discussion Open Discussion

    Okay.

  • Igor Pavlov Igor Pavlov posted a comment on discussion Open Discussion

    So you have available large pages after reboot. And now you can test
        "C:\Program Files\7-max\7maxc" -t "C:\Program Files\7-Zip\7z.exe" b -bt >> c:\1\a.txt
    or WinRAR benchmark.

  • PassionateUser PassionateUser posted a comment on discussion Open Discussion

    I test with admin rights, yes. Take a look at the output after the reboot.

  • Igor Pavlov Igor Pavlov posted a comment on discussion Open Discussion

    Did you try to reboot system before test? Also did you run the tests with administrator rights?

  • Daniel Serpell Daniel Serpell posted a comment on discussion Open Discussion

    Hi! Compiled with your patch, this is the result, basically more than doubled the bandwidth:
        daniel@duo256:~/src/7zip-23.01+dfsg$ ./CPP/7zip/Bundles/Alone2/b/g/7zz b -mm=swap4 -mtic=26 -md25 -bt
        7-Zip (z) 23.01 (riscv64) : Copyright (c) 1999-2023 Igor Pavlov : 2023-06-20
        64-bit locale=en_US.UTF-8 Threads:1 OPEN_MAX:1024
        m=swap4 tic=26 d25
        Compiler: 13.2.0 GCC 13.2.0
        Linux : 5.10.4-20240329-1+ : #1 PREEMPT Tue May 7 08:14:28 UTC 2024 : riscv64
        PageSize:4KB hwcap:20112D LE
        1T CPU Freq (MHz): 991 986...

  • PassionateUser PassionateUser posted a comment on discussion Open Discussion

    Windows 7 x64, 7max 24.01.
        $ 7maxc -bl
        Total RAM size = 4022 MB
        Large Page Min Size = 2097152 bytes.
        Large pages test:
        4 MB 0 ms error = 1450: Insufficient system resources exist to complete the requested service.
        Large blocks:
        4 MB 0 ms error = 1450: Insufficient system resources exist to complete the requested service.
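
    For readers unfamiliar with the failure reported here, below is a rough sketch of a large-page allocation attempt using the documented Win32 calls `GetLargePageMinimum` and `VirtualAlloc` with `MEM_LARGE_PAGES`. It is not the 7-max code: `round_up_to_large_page` and `alloc_large_pages` are hypothetical names. Such an allocation needs the "Lock pages in memory" privilege and a size that is a multiple of the large-page minimum, and it can fail with error 1450 (ERROR_NO_SYSTEM_RESOURCES) when no contiguous physical memory is available.

```c
#include <stddef.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: rounds size up to a multiple of the large-page
   minimum (min_size must be non-zero), as large-page requests require. */
size_t round_up_to_large_page(size_t size, size_t min_size)
{
    return (size + min_size - 1) / min_size * min_size;
}

#ifdef _WIN32
/* Hypothetical helper: tries one large-page allocation. Returns NULL on
   failure and stores the Win32 error code in *error (for example 1450,
   ERROR_NO_SYSTEM_RESOURCES, as in the output above). */
void *alloc_large_pages(size_t size, unsigned long *error)
{
    SIZE_T min_size = GetLargePageMinimum(); /* 0 if large pages are unsupported */
    void *p;

    *error = 0;
    if (min_size == 0) {
        *error = ERROR_NOT_SUPPORTED;
        return NULL;
    }
    p = VirtualAlloc(NULL, round_up_to_large_page(size, min_size),
                     MEM_RESERVE | MEM_COMMIT | MEM_LARGE_PAGES,
                     PAGE_READWRITE);
    if (p == NULL)
        *error = GetLastError();
    return p;
}
#endif
```

    With the 2097152-byte minimum shown in the output, a 4 MB request is already a multiple of the large-page size, so the failure there is not a rounding problem.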

  • Igor Pavlov Igor Pavlov modified a comment on discussion Open Discussion

    We don't want slower code when the extensions are enabled. If we check for MY_CPU_RISCV to disable the use of __builtin_bswap32, we must exclude cases where "fast" extensions are enabled. But maybe a good compiler can convert the manual Swap4 code to a REV8 instruction even without __builtin_bswap32. I suppose we can check for __riscv_zbb. I don't know about vector extension checks, because that also depends on compiler optimizations.
        #elif (!defined(MY_CPU_RISCV) || defined (__riscv_zbb)) \
          && !defined(MY_CPU_SPARC)...

  • Igor Pavlov Igor Pavlov modified a comment on discussion Open Discussion

    We don't want slower code when the extensions are enabled. If we check for MY_CPU_RISCV to disable the use of __builtin_bswap32, we must exclude cases where "fast" extensions are enabled. But maybe a good compiler can convert the manual Swap4 code to a REV8 instruction even without __builtin_bswap32.

  • Igor Pavlov Igor Pavlov posted a comment on discussion Open Discussion

    We don't want slower code when the extensions are enabled. If we check for MY_CPU_RISCV to disable the use of __builtin_bswap32, we must exclude cases when extensions are enabled. But maybe a good compiler can convert the manual Swap4 code to a REV8 instruction even without __builtin_bswap32.

  • Daniel Serpell Daniel Serpell posted a comment on discussion Open Discussion

    Hi! With GCC-14, if you activate the "Zbkb" extension, the compiler uses the new "REV8" instruction: https://godbolt.org/z/8MG9zea9T But, this CPU does not have the Zbkb extension (those are the "bit manipulation instructions"). Also with GCC-14, if you specify the "V" (vector) extension, the compiler vectorizes the code using the Gather instruction: https://godbolt.org/z/YYaWK38fv I don't have GCC-14 installed now, so I can't compile the full 7-zip code to test the speed. The problem is that Linux...

  • Igor Pavlov Igor Pavlov posted a comment on discussion Open Discussion

    About the slow Swap4: in the new 7-Zip I have a patch in CpuArch.h for such cases on SPARC:
        /*
          GCC for SPARC generates slow code that calls function for __builtin_bswap32().
          The code from CLANG for SPARC also is not fastest.
          So we don't define Z7_CPU_FAST_BSWAP_SUPPORTED for SPARC.
        */
        #elif !defined(MY_CPU_SPARC) && ( \
               (defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3))) \
            || (defined(__clang__) && Z7_has_builtin(__builtin_bswap16)) \
            )
        #define Z7_BSWAP16(v) __builtin_bswap16(v)...

  • Daniel Serpell Daniel Serpell posted a comment on discussion Open Discussion

    Hi! Tried those, indeed the bench took a lot less time:
        daniel@duo256:~$ 7z b -mm=* -mtic=28 -bt
        7-Zip 23.01 (riscv64) : Copyright (c) 1999-2023 Igor Pavlov : 2023-06-20
        64-bit locale=en_US.UTF-8 Threads:1 OPEN_MAX:1024
        m=* tic=28
        Compiler: 13.2.0 GCC 13.2.0
        Linux : 5.10.4-20240329-1+ : #1 PREEMPT Tue May 7 08:14:28 UTC 2024 : riscv64
        PageSize:4KB hwcap:20112D LE
        1T CPU Freq (MHz): 983 994 993 993 997 1022 1022
        RAM size: 235 MB, # CPU hardware threads: 1
        Dictionary reduced to: 23
        RAM usage: 119 MB,...

  • Igor Pavlov Igor Pavlov modified a comment on discussion Open Discussion

    The complex benchmark is too slow for weak CPUs. The Swap4 workload is 100 times slower than optimized SSE* code on x86 at the same frequency, so that line is especially slow. If you run it again on a slow computer, you can reduce the execution time with the -mtic=28 switch. 28 means 2^28 ticks of a good CPU for each workload.
        7z b -mm=* -mtic=28 -bt
    If you want memory bandwidth benchmarks, you can run them like this:
        7z b -mm=swap4 -mtic=26 -md25 -bt
        7z b -mm=crc32:8 -mtic=28 -md25 -bt
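
    The tick budget described for -mtic can be worked out with a short C sketch. `mtic_ticks` and `mtic_seconds` are hypothetical helpers, not 7-Zip functions, and the time estimate assumes one tick per clock cycle at the given frequency, which is a simplification: a weak CPU will typically need longer than this figure.

```c
#include <stdint.h>

/* Hypothetical helper: tick budget for -mtic=N, following the comment's
   description of N as 2^N ticks per workload. Requires n < 64. */
uint64_t mtic_ticks(unsigned n)
{
    return (uint64_t)1 << n;
}

/* Hypothetical helper: rough duration of one workload in seconds,
   ASSUMING one tick per clock cycle at freq_mhz. */
double mtic_seconds(unsigned n, double freq_mhz)
{
    return (double)mtic_ticks(n) / (freq_mhz * 1e6);
}
```

    For example, `mtic_ticks(28)` is 268435456 and `mtic_ticks(26)` is 67108864, so -mtic=26 asks for a quarter of the work of -mtic=28 per workload.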

  • Igor Pavlov Igor Pavlov modified a comment on discussion Open Discussion

    The complex benchmark is too slow for weak CPUs. The Swap4 workload is 100 times slower than optimized SSE* code on x86 at the same frequency, so that line is especially slow. If you run it again on a slow computer, you can reduce the execution time with the -mtic=28 switch. 28 means 2^28 ticks of a good CPU for each workload.
        7z b -mm=* -mtic=28 -bt
    If you want memory bandwidth benchmarks, you can run them like this:
        7z b -mm=swap4 -mtic=26 -md25 -bt
        7z b -mm=crc32:8 -mtic=28 -md25 -bt

  • Igor Pavlov Igor Pavlov modified a comment on discussion Open Discussion

    Compex benchmark is too slow for weak CPUs. Swap4 workload is 100 times slower than optimized SSE* code speed of x86 at same frequency. So it especially slow for that line. If you will call it again for slow computer, you can reduce time of execution with -mtic=28 switch. 28 means 2^28 ticks of good CPU for each workload. 7z b -mm=* -mtic=28 -bt If you want to get memory bandwidth benchnarks, you can call them so: 7z b -mm=swap4 -mtic=26 -md25 -bt 7z b -mm=crc32:8 -mtic=28 -md25 -bt

  • Igor Pavlov Igor Pavlov modified a comment on discussion Open Discussion

    Compex benchmark is too slow for weak CPUs. Swap4 workload is 100 times slower than optimized SSE* code speed of x86 at same frequency. So it especially slow for that line. So if you will call it again for slow computer, you can reduce toime of execution with -mtic=28 switch: 7z b -mm=* -mtic=28 -bt If you want memory bandwidth benchnarks, you can call them so: 7z b -mm=swap4 -mtic=26 -md25 -bt 7z b -mm=crc32:8 -mtic=28 -md25 -bt

  • Igor Pavlov Igor Pavlov posted a comment on discussion Open Discussion

    Compex benchamrk is too slow for weak CPUs. Swap4 workload is 100 times slower than optimized SSE* code speed of x86 at same frequency. So it especially slow for that line. So if you will call it again for slow computer, you can reduce toime of execution with -mtic=28 switch: 7z b -mm=* -mtic=28 -bt If you want memory bandwidth benchnarks, you can call them so: 7z b -mm=swap4 -mtic=26 -md25 -bt 7z b -mm=crc32:8 -mtic=28 -md25 -bt

  • Daniel Serpell Daniel Serpell posted a comment on discussion Open Discussion

    Hi! The MilkV Duo is a small and cheap SBC with an SG2002 SoC that integrates 256 MB of RAM. Tested on Debian SID:

        daniel@duo256:~$ 7z b
        7-Zip 23.01 (riscv64) : Copyright (c) 1999-2023 Igor Pavlov : 2023-06-20
         64-bit locale=en_US.UTF-8 Threads:1 OPEN_MAX:1024
        Compiler: 13.2.0 GCC 13.2.0
        Linux : 5.10.4-20240329-1+ : #1 PREEMPT Tue May 7 08:14:28 UTC 2024 : riscv64
        PageSize:4KB hwcap:20112D
        LE
        1T CPU Freq (MHz): 982 993 993 990 983 925 1022
        RAM size: 235 MB, # CPU hardware threads: 1
        RAM usage: 220 MB,...

  • Andrii Andrii posted a comment on discussion Open Discussion

    yes

        C:\>"C:\Program Files\7-max\7maxc" -bl
        7-max 24.01 (x64) : Copyright (c) 2003-2024 Igor Pavlov : 2024-05-10
        7-max ERROR: Can't enable LockMemory privilege.
        7-max must be installed or run with administrator rights.
        Total RAM size = 7657 MB
        Large Page Min Size = 2097152 bytes.
        Large pages test: 4 MB 0 ms error = 1314: A required privilege is not held by the client.
        Large blocks: 4 MB 0 ms error = 1314: A required privilege is not held by the client.

  • Igor Pavlov Igor Pavlov posted a comment on discussion Open Discussion

    Thanks. If you run it without administrator rights in Windows 7, do the benchmark commands show the following error message? Can't enable LockMemory privilege. 7-max must be installed or run with administrator rights.

  • Andrii Andrii posted a comment on discussion Open Discussion

    win7

  • Andrii Andrii posted a comment on discussion Open Discussion

    Did you run 7-max with administrator rights in Windows 7? yes

  • Igor Pavlov Igor Pavlov modified a comment on discussion Open Discussion

    7-max 24.01 was released. Download 7-max for Windows: https://7-max.com/a/7max2401.exe What's new in 7-max 24.01: Improved support for Windows 8.1 and for 32-bit programs in Windows 11. What's new in 7-max 24.00: The program was updated to support new versions of Windows: Windows 7 / 8 / 10 / 11. 7-max tests for different versions of Windows This is a new version of 7-max. And we need more tests with different versions of Windows to check that 7-max can work with any system. 7-max works in low level...

  • Igor Pavlov Igor Pavlov modified a comment on discussion Open Discussion

    7-max 24.01 was released. Download 7-max for Windows: https://7-max.com/a/7max2401.exe What's new in 7-max 24.01: Improved support for Windows 8.1 and for 32-bit programs in Windows 11. What's new in 7-max 24.00: The program was updated to support new versions of Windows: Windows 7 / 8 / 10 / 11. This is a new version of 7-max. And we need more tests with different versions of Windows to check that 7-max can work with any system. To test 7-max, install 7-max and 7-Zip, and call the commands: mkdir...
