Use __ARM_NEON to detect neon support when hwcap is unavailable
Hello,
This is the same as https://github.com/ip7z/7zip/blob/a7a1d4a241492e81f659a920f7379c193593ebc6/C/CpuArch.c#L835 but when hwcap is unavailable.
This will allow Vita SDK (an open source toolchain for PlayStation Vita development) and devkitA64 (an open source toolchain for Nintendo Switch development) to correctly enable NEON support, since they have neither glibc nor the sys/auxv.h header.
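To illustrate the idea, here is a minimal sketch of such a fallback. It is not the actual 7-Zip code: the name sketch_cpu_has_neon and the SKETCH_* macros are made up for this example, and using __linux__ on ARM as the test for "hwcap is available" is an assumption of the sketch.

```c
#include <stdbool.h>

/* Assumption: on Linux/ARM the auxiliary vector (hwcap) can be queried. */
#if defined(__linux__) && (defined(__arm__) || defined(__aarch64__))
  #include <sys/auxv.h>   /* getauxval, AT_HWCAP */
  #include <asm/hwcap.h>  /* HWCAP_NEON (arm) / HWCAP_ASIMD (aarch64) */
  #define SKETCH_HAVE_HWCAP 1
#else
  #define SKETCH_HAVE_HWCAP 0
#endif

/* __ARM_NEON is predefined by the compiler when NEON code generation is enabled. */
#if defined(__ARM_NEON)
  #define SKETCH_NEON_AT_COMPILE_TIME 1
#else
  #define SKETCH_NEON_AT_COMPILE_TIME 0
#endif

/* Hypothetical stand-in for CPU_IsSupported_NEON(). */
bool sketch_cpu_has_neon(void)
{
#if SKETCH_HAVE_HWCAP && defined(__aarch64__)
    /* Run-time answer from the kernel (AArch64 calls the NEON bit ASIMD). */
    return (getauxval(AT_HWCAP) & HWCAP_ASIMD) != 0;
#elif SKETCH_HAVE_HWCAP
    /* Run-time answer from the kernel (32-bit ARM). */
    return (getauxval(AT_HWCAP) & HWCAP_NEON) != 0;
#else
    /* No hwcap: trust what the compiler was told about the target. */
    return SKETCH_NEON_AT_COMPILE_TIME != 0;
#endif
}
```

With a toolchain that has no sys/auxv.h, the last branch is taken, so the result simply follows whether the build targets NEON.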
Thanks.
Do we need similar patches for other ARM features?
If you can run command-line 7-Zip, you can check the speed difference after compilation:
7-Zip also uses some tricks to define the crypto macros __ARM_FEATURE_SHA2, __ARM_FEATURE_SHA1 and __ARM_FEATURE_AES, even if they are not defined by the compiler. And 7-Zip uses run-time dispatching for branches of crypto code.
So we can get a single binary that can use crypto, if it's supported at run-time.
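The run-time dispatching mentioned above can be pictured with a small sketch like the one below. It is only an illustration, not 7-Zip's code: all names are hypothetical, sketch_cpu_has_sha2 is a placeholder for a check like CPU_IsSupported_SHA2(), and a toy byte sum stands in for the real hash so the example stays portable.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder for a run-time CPU check; this sketch always reports "no". */
static bool sketch_cpu_has_sha2(void)
{
    return false;
}

/* Portable branch: a toy byte sum, NOT a real SHA-256. */
static uint32_t sum_portable(const uint8_t *p, size_t n)
{
    uint32_t s = 0;
    for (size_t i = 0; i < n; i++)
        s += p[i];
    return s;
}

/* Stand-in for the optimized branch: same result, four bytes per step.
   A real build would use the crypto instructions here. */
static uint32_t sum_accelerated(const uint8_t *p, size_t n)
{
    uint32_t s = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        s += (uint32_t)p[i] + p[i + 1] + p[i + 2] + p[i + 3];
    for (; i < n; i++)
        s += p[i];
    return s;
}

typedef uint32_t (*sum_fn)(const uint8_t *, size_t);

static sum_fn g_sum = NULL;

/* Both branches are compiled in; the choice is made once at run time. */
void sketch_init(void)
{
    g_sum = sketch_cpu_has_sha2() ? sum_accelerated : sum_portable;
}

uint32_t sketch_sum(const uint8_t *p, size_t n)
{
    if (g_sum == NULL)
        sketch_init();
    return g_sum(p, n);
}
```

Callers only use sketch_sum(), so the same binary works whether or not the CPU reports the feature.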
But if __ARM_FEATURE_SHA2 is already defined, 7-Zip still checks CPU_IsSupported_SHA2().
So you could compile 7-Zip with crypto features defined (__ARM_FEATURE_SHA2), and 7-Zip still checks the crypto features at runtime and selects the branch.
Or you could compile 7-Zip without crypto features defined. Then 7-Zip defines __ARM_FEATURE_SHA2 at compile time inside the Sha256Opt.c file and checks the CPU crypto flags with CPU_IsSupported_SHA2() at runtime in Sha256.c.
So __ARM_FEATURE_SHA2 is optional at compile time. And 7-Zip can work on CPUs without SHA2 support, even if it was compiled with __ARM_FEATURE_SHA2.
I don't need similar patches for these features in my case but maybe it could be useful for others :)
For your information:
- Vita SDK does not define any of these ARM features
- devkitA64 defines __ARM_FEATURE_SHA2 and __ARM_FEATURE_AES
SHA and AES code use NEON.
But other parts of the code in 7-Zip use the NEON branch only in rare cases.
So a patched CPU_IsSupported_NEON() will not improve performance in most cases.
Last edit: Igor Pavlov 2024-07-04