
#450 Use __ARM_NEON to detect neon support when hwcap is unavailable

Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2024-07-04
Created: 2024-07-04
Creator: John Marca
Private: No

Hello,

This patch applies the same check as https://github.com/ip7z/7zip/blob/a7a1d4a241492e81f659a920f7379c193593ebc6/C/CpuArch.c#L835, but for builds where hwcap is unavailable.

It allows Vita SDK (an open source toolchain for PlayStation Vita development) and devkitA64 (an open source toolchain for Nintendo Switch development) to correctly enable NEON support, because they have neither glibc nor the sys/auxv.h header.
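To illustrate the idea, here is a minimal sketch of a compile-time fallback. It is not the attached patch and not 7-Zip's actual code: the function name `neon_supported_fallback` is hypothetical, and the sketch assumes the compiler defines `__ARM_NEON` whenever it targets a CPU with NEON.

```c
/*
 * Hypothetical fallback for toolchains without hwcap / sys/auxv.h.
 * A real build would first try a run-time query where one is available
 * and only use this compile-time answer when nothing else exists.
 */
static int neon_supported_fallback(void)
{
#if defined(__ARM_NEON)
  /* The compiler was told the target has NEON, so trust that. */
  return 1;
#else
  /* No compile-time evidence of NEON: report it as unsupported. */
  return 0;
#endif
}
```

The trade-off is that the answer is fixed at build time, so it is only as accurate as the target flags passed to the compiler.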

1 Attachments

Discussion

  • Igor Pavlov

    Igor Pavlov - 2024-07-04

    Thanks.

    Do we need similar patches for other ARM features?

    __ARM_FEATURE_SHA2
    __ARM_FEATURE_SHA1
    __ARM_FEATURE_AES
    

    If you can run command-line 7-Zip, you can check the difference in speed after compilation:

    7zz b -mmt1 -mtic=30 -mm=sha*
    7zz b -mmt1 -mtic=30 -mm=aes*
    

    7-Zip also uses some tricks to define the crypto macros __ARM_FEATURE_SHA2,
    __ARM_FEATURE_SHA1 and __ARM_FEATURE_AES, even if they are not defined by the compiler. And 7-Zip uses run-time dispatching for the branches of crypto code.
    So we can get a single binary that can use crypto instructions if they are supported at run time.

    But even if __ARM_FEATURE_SHA2 is already defined,
    7-Zip still checks CPU_IsSupported_SHA2().

    So you can compile 7-Zip with the crypto features defined (__ARM_FEATURE_SHA2), and 7-Zip still checks the crypto features at run time and selects the branch.

    And you can compile 7-Zip without the crypto features defined. In that case 7-Zip defines __ARM_FEATURE_SHA2 at compile time inside the Sha256Opt.c file and checks the CPU crypto flags with CPU_IsSupported_SHA2() at run time in Sha256.c.

    So __ARM_FEATURE_SHA2 is optional at compile time.
    And 7-Zip can work on CPUs without SHA2 support, even if it was compiled with __ARM_FEATURE_SHA2.
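The run-time dispatching described above can be sketched roughly as follows. This is only an illustration of the pattern, not code from Sha256.c or Sha256Opt.c: every name here is hypothetical, a trivial byte sum stands in for the hash, and the `cpu_has_feature` argument stands in for the result of a check such as CPU_IsSupported_SHA2().

```c
#include <stddef.h>

/* Hypothetical function type shared by both branches. */
typedef unsigned (*checksum_func)(const unsigned char *data, size_t size);

/* Portable branch: always safe to call. */
static unsigned checksum_generic(const unsigned char *data, size_t size)
{
  unsigned sum = 0;
  size_t i;
  for (i = 0; i < size; i++)
    sum += data[i];
  return sum;
}

/* Stand-in for the accelerated branch. In a real build this file would be
   compiled with the crypto extension enabled; here it is plain C that
   produces the same result, four bytes per iteration plus a tail loop. */
static unsigned checksum_accel(const unsigned char *data, size_t size)
{
  unsigned sum = 0;
  size_t i = 0;
  for (; i + 4 <= size; i += 4)
    sum += data[i] + data[i + 1] + data[i + 2] + data[i + 3];
  for (; i < size; i++)
    sum += data[i];
  return sum;
}

/* The branch is chosen from the run-time CPU check, not from whether the
   feature macro was defined when the program was compiled. */
static checksum_func select_checksum(int cpu_has_feature)
{
  return cpu_has_feature ? checksum_accel : checksum_generic;
}
```

A caller would run the selection once at startup, store the returned pointer, and use it for all later calls, so the same binary stays usable on CPUs that lack the feature.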

     
  • John Marca

    John Marca - 2024-07-04

    I don't need similar patches for these features in my case but maybe it could be useful for others :)

    For your information:
    - Vita SDK does not define any of these arm features
    - devkitA64 defines __ARM_FEATURE_SHA2 and __ARM_FEATURE_AES
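For anyone who wants to repeat this check on another toolchain, a small probe along these lines could be compiled with the target's usual flags. It is a sketch only; the function name `count_arm_feature_macros` is made up for this example.

```c
#include <stdio.h>

/* Prints each ARM feature macro the compiler defines for the current
   target and returns how many of the four were found. */
static int count_arm_feature_macros(void)
{
  int count = 0;
#ifdef __ARM_NEON
  puts("__ARM_NEON");
  count++;
#endif
#ifdef __ARM_FEATURE_SHA1
  puts("__ARM_FEATURE_SHA1");
  count++;
#endif
#ifdef __ARM_FEATURE_SHA2
  puts("__ARM_FEATURE_SHA2");
  count++;
#endif
#ifdef __ARM_FEATURE_AES
  puts("__ARM_FEATURE_AES");
  count++;
#endif
  return count;
}
```

Calling it from `main` lists the macros for that build; the result depends on the compiler's target options as well as on the toolchain.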

     
  • Igor Pavlov

    Igor Pavlov - 2024-07-04

    The SHA and AES code uses NEON.
    But other parts of the 7-Zip code use the NEON branch only in rare cases.
    So a patched CPU_IsSupported_NEON() will not improve performance in most cases.

     

    Last edit: Igor Pavlov 2024-07-04
