Build versions (default configuration): android-ndk-r26c, xcode 15.2, ubuntu-20.04, ubuntu-22.04, vs2015, vs2017, vs2019, vs2022, emscripten-3.1.28

| file | content | arch |
|---|---|---|
| ncnn-full-source.zip | full source code including all submodule code | |
| ncnn-android.zip | android static/shared library | armeabi-v7a + arm64-v8a + x86 + x86_64 |
| ncnn-android-vulkan.zip | android static/shared library, with GPU support | armeabi-v7a + arm64-v8a + x86 + x86_64 |
| ncnn-apple.zip | apple xcframework, ios + ios-simulator + macos + mac-catalyst + watchos + watchos-simulator + tvos + tvos-simulator + visionos + visionos-simulator | arm64 + arm64e + x86_64 |
| ncnn-apple-vulkan.zip | apple xcframework, ios + ios-simulator + macos + mac-catalyst + watchos + watchos-simulator + tvos + tvos-simulator + visionos + visionos-simulator, with GPU support | arm64 + arm64e + x86_64 |
| ncnn-ios.zip | ios static library | arm64 |
| ncnn-ios-vulkan.zip | ios static library, with GPU support | arm64 |
| ncnn-ios-simulator.zip | ios simulator static library | x86_64 + arm64 |
| ncnn-ios-simulator-vulkan.zip | ios simulator static library, with GPU support | x86_64 + arm64 |
| ncnn-macos.zip | macos static library | x86_64 + arm64 |
| ncnn-macos-vulkan.zip | macos static library, with GPU support | x86_64 + arm64 |
| ncnn-mac-catalyst.zip | mac catalyst static library | x86_64 + arm64 |
| ncnn-mac-catalyst-vulkan.zip | mac catalyst static library, with GPU support | x86_64 + arm64 |
| ncnn-watchos.zip | watchos static library | armv7k + arm64_32 |
| ncnn-watchos-simulator.zip | watchos simulator static library | x86_64 + arm64 |
| ncnn-tvos.zip | tvos static library | x86_64 + arm64 |
| ncnn-tvos-vulkan.zip | tvos static library, with GPU support | x86_64 + arm64 |
| ncnn-tvos-simulator.zip | tvos simulator static library | x86_64 + arm64 |
| ncnn-tvos-simulator-vulkan.zip | tvos simulator static library, with GPU support | x86_64 + arm64 |
| ncnn-visionos.zip | visionos static library | arm64 |
| ncnn-visionos-simulator.zip | visionos simulator static library | x86_64 + arm64 |
| ncnn-ubuntu.zip | ubuntu linux static/shared library, with GPU support, model conversion tools | x86_64 |
| ncnn-windows.zip | windows static/shared library, with GPU support, model conversion tools | x86 + x64 + arm + arm64 |
| ncnn-webassembly.zip | webassembly static library | wasm32 + simd + threads + simd-threads |

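- Decoupled the layer cpu and vulkan implementations; virtual public inheritance is no longer used
- Support building unit tests when building the shared library
- Per-layer feature mask supports disabling multithreading
- extractor set_num_threads and set_vulkan_compute are now no-ops
- Added uniform types to gpu shaders to improve fp16 compatibility on adreno
- Detect the 8x8x16 configuration of the vulkan matrix extension; use fp16 accumulation by default when fp16a is available
- Updated stb_image rvv/neon optimizations
- x86 mish avx512 optimization (@wnqn1597)
- riscv gemm fp32 rvv optimization (@Xinyu302)
- Do not keep unneeded temporary data when uploading weights during model loading
- c-api: added draw rectangle/text/circle/line interfaces (@Deepdive543443)
- Fixed sigbus error when loading fp16 models on armv7
- Fixed reduction L2norm producing inf on denormal values
- Fixed numerical error caused by pixel_resize rounding on arm
- Fixed softmax arm fp16 computation error
- Fixed fp16 output not being converted automatically on risc-v rvv
- Fixed destroy_gpu_instance crash when the driver is not fully loaded (@shatyuka)
- destroy_gpu_instance waits for all devices to become idle (@whyb)
- Fixed a possible crash in the low-level api when calling create_pipeline directly without load_param
- Fixed ncnnoptimize crash during shape inference
- ncnnoptimize supports more new operators; fixed the loss of gemm weights
- Disabled signal-based instruction set detection when running under a debugger
- Use ruapu for cpu instruction set detection on windows-arm
- Enable automatic fp16 conversion when arm vfpv4 is supported
- Always report neon and vfpv4 support on arm64
- simplevk searches more known vulkan driver paths
- Fixed risc-v rvv compile errors under older cpp standards
- Fixed compile errors in debug mode with some old compilers
- Fixed the uwp build
- Fixed runtime warnings in test_reduction
- Fixed compile error when NCNN_PIXEL_DRAWING is disabled (@shatyuka)
- Support building with MSVC together with the LLVM openmp runtime (@shatyuka)
- Fixed the yolov8 python example failing when the result is empty (@dsplvd)
- pnnx: decoupled torchscript loading, cleaned up the cxxabi hack, fixed whole-archive linking
- pnnx: load dynamo onnx, not enabled in the build by default
- pnnx: improved functionalization, supporting more slice+inplace compound operations
- pnnx: convert torch.masked_select/torch.slice_scatter
- pnnx: support models larger than 4G
- pnnx: build a universal wheel on macos
- pnnx: added an entrypoint script
- pnnx: support dynamic slice indices
- pnnx: convert the dtype parameter of softmin and logsoftmax
- pnnx: handle index_put with empty indices and scalar values
- pnnx: convert some cudnn conv2d variants
- pnnx: fuse complete slices into tensor_split
- pnnx: fuse static embedding
- pnnx: do not eliminate math operations that would change the shape
- pnnx: improved torch-2.1 mha attn_mask detection
- pnnx: fixed conversion of nn.Conv2d without a bias tensor
- pnnx: convert torch.stack with negative dim
- pnnx: added torch.arange unit test
- pnnx: fixed a possible out-of-bounds access when graph matching fails
- pnnx: recognize the batch axis of embedding input as 0
- pnnx python: added a parameter to control fp16 (@MollySophia)
- pnnx: added torch-2.2 ci
- github ci builds with 4 parallel jobs
- Updated the cmake ios toolchain, added visionos ci, watchos supports the arm64_32 architecture
- Added apple a17 and m3 cpu names
- No longer build 32bit support for apple platforms, no longer build the ios arm64e architecture, raised the minimum deployment version to ios-13
- Unified the android, python and macos ci
- No longer package and release apple bitcode and 32bit prebuilt packages; added visionos prebuilt packages; added tvos-gpu prebuilt packages; updated openmp to 18.1.2
- Improved the a53/a55 dual-issue documentation (@luqiang-guo)
- Added build documentation for protobuf>=22.0 on windows (@Galasnow)
- Updated the macos build documentation (@lll143653)
- Cleaned up unused-code warnings (@hokamilkv)
- Fixed typos in the FAQ (@eltociear)
- Fixed typos (@hugo-syn)
- Fixed typos (@afredooo)
- Fixed a comment error in convolution_x86 (@strongtz)
- Added code hint markers to the markdown documentation (@hugo-syn)
- Added OneCloud benchmark data (@mizu-bai)
- Added AWS c5.4xlarge benchmark data (@mizu-bai)
- Added Xeon Phi 3120A benchmark data (@mizu-bai)
- Added orangepi zero2 benchmark data (@wonderfullook)
- Added Dimensity 9300 MT6989 benchmark data (@MollySophia)
- Added PhytiumPi benchmark data (@HalfSweet)
- Added remipi benchmark data (@dreamcmi)
- Added radxa zero 3w benchmark data (@Qengineering)
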
New Contributors
- @wonderfullook made their first contribution in https://github.com/Tencent/ncnn/pull/5277
- @hugo-syn made their first contribution in https://github.com/Tencent/ncnn/pull/5301
- @FartSimps0n made their first contribution in https://github.com/Tencent/ncnn/pull/5304
- @HalfSweet made their first contribution in https://github.com/Tencent/ncnn/pull/5312
- @strongtz made their first contribution in https://github.com/Tencent/ncnn/pull/5310
- @afredooo made their first contribution in https://github.com/Tencent/ncnn/pull/5339
- @shatyuka made their first contribution in https://github.com/Tencent/ncnn/pull/5346
- @dsplvd made their first contribution in https://github.com/Tencent/ncnn/pull/5345
- @Galasnow made their first contribution in https://github.com/Tencent/ncnn/pull/5359
- @hokamilkv made their first contribution in https://github.com/Tencent/ncnn/pull/5365
Full Changelog: https://github.com/Tencent/ncnn/compare/20240102...20240410