Name | Modified | Size | Downloads / Week |
---|---|---|---|
README.md | 2025-04-01 | 1.5 kB | |
Simd v6.1.148 source code.tar.gz | 2025-04-01 | 4.5 MB | |
Simd v6.1.148 source code.zip | 2025-04-01 | 6.0 MB | |
Totals: 3 Items | | 10.6 MB | 0 |
Algorithms
New features
- ForwardSmallNK algorithm in Base implementation of class SynetDeconvolution16bNhwcGemm.
- Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of function SynetChannelSum16b.
- Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of class SynetScale16b.
Improving
- AMX-BF16 optimizations of class SynetDeconvolution16bNhwcGemm.
- AMX-BF16 optimizations of class SynetMergedConvolution16bCdc.
- AMX-BF16 optimizations of class SynetMergedConvolution16bCd.
- AMX-BF16 optimizations of class SynetMergedConvolution16bDc.
Bug fixing
- Error in Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of class SynetDeconvolution16bNhwcGemm.
- Error in Base implementation, AVX-512BW, AMX-BF16 optimizations of class SynetInnerProduct16bGemmNN.
- Error in class Xml::NodeIterator.
- Error in class Xml::AttributeIterator.
Test framework
New features
- Tests for verifying functionality of function SynetChannelSum16b.
- Tests for verifying functionality of class SynetScale16b.
- Pinning of test threads (-pt=1 command-line argument).
Infrastructure
New features
- Clang version parameter in GitHub Actions script for CMake.
- Clang-19 check in GitHub Actions script for CMake.