| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| README.md | 2026-05-07 | 7.0 kB | |
| v0.10.0 source code.tar.gz | 2026-05-07 | 1.6 MB | |
| v0.10.0 source code.zip | 2026-05-07 | 1.9 MB | |
| Totals: 3 Items | | 3.5 MB | 0 |

What's Changed
- Port over parts of `burn-std` to `cubecl-zspace`. (#1139) @crutcher
- chore: bump tracel-llvm version to 20.1.4-6 (#1141) @syl20bnr
- Rename `try_cast_unchecked` to `downcast` (#1146) @adolago
- feat: Slice destructure/Fixed dim layout (#1149) @wingertge
- Propagate lint rules to crates; fix outstanding violations. (#1138) @crutcher
- fix(spirv): reuse decorated struct id for wkgrp layout (tracel-ai/burn#4355) (#1154)
- fix(spirv): Fix dim type for metadata (#1155) @wingertge
- fix: Ensure `copy_into` works correctly for different ranks (#1150) @wingertge
- feat(spirv): Compilation cache (#1158) @wingertge
- fix(spirv): Fix stack overflow by migrating GVN to iterative algorithm (#1156) @wingertge
- chore(cuda): Bump cudarc for `fallback-dynamic-loading` (#1159) @wingertge
- fix(spirv): Don't do copy transform if either operand is written to in between read and write (#1161) @wingertge
- Upgrade to wgpu v28 (#1119) @laggui
- fix(metal): fix float to int narrowing (#1163) @dcvz
- Fix MLIR pass ordering and add CPU barrier support (#1151) @jguhlin
- fix: error driver not found! in CI and update to Tracel GH action v8 (#1169) @syl20bnr
- chore: update publish.yml (#1171) @syl20bnr
- Feat: Improve registry (#1172) @nathanielsimard
- Add aliases for backend features consistent with burn (#1174) @wingertge
- fix: Fix SPIR-V fix for `CopyTransform` (#1173) @wingertge
- Add some missing infra for radix sorting (#1170) @ArthurBrussee
- feat: Add explicit resource errors (#1164) @wingertge
- Add safety docs to generated `launch_unchecked` functions (#1168) @adolago
- perf: Improve optimized tensor algorithm (#1177) @wingertge
- feat: Vulkan 64bit indexing (#1178) @wingertge
- feat: Chunked binary compilation cache (#1166) @wingertge
- fix: Make TMA errors return a result instead of panicking (#1176) @wingertge
- chore: Update dependencies to deduplicate and get fixes (#1179) @wingertge
- Chore: Pre-Release 0.10.0-pre.1 (#1180) @nathanielsimard
- Upgrade to rand 0.10 (#1182) @laggui
- fix: remove unconditional std on cubecl-runtime and cubecl-common (#1184) @antimora
- feat: Enable 64-bit indexing (#1185) @wingertge
- perf: Remove unconditional format from virtual layout (#1186) @wingertge
- CudaServer::change_server_serialized simultaneous Commands. (#1143) @crutcher
- perf: Improve performance of `MetadataBuilder` (#1187) @wingertge
- refactor: Metadata (#1190) @wingertge
- fix: Make `cubecl-zspace` `no_std`, avoid future prelude issues (#1193) @wingertge
- Dependency tweaks for building on Android (#1160) @metasim
- fix: Fix metadata on no_std (#1195) @wingertge
- fix(hip): use __hip_bfloat16 types and __hmax/__hmin for ROCm 7.1 (#1152) @GeisYaO
- chore: update to cubecl-hip-sys version 7.1.5280200 (#1192) @syl20bnr
- refactor: Remove `CubeOption` in favor of expanding `Option` (#1194) @wingertge
- Update version (#1205) @nathanielsimard
- Refactor device communication channel (#1199) @nathanielsimard
- Fix/no std device + improve channel device handle performance (#1209) @nathanielsimard
- Mma inplace version & 16x8x8 support (#1213) @louisfd
- feat: Runtime enum (#1208) @wingertge
- Fix wasm compilation error (#1206) @ArthurBrussee
- Fix/memory management (#1214) @nathanielsimard
- Fix: Benchmarking and Profiling (#1220) @nathanielsimard
- fix: Fix for loops with breaks (#1222) @paulzhng
- Remove critical section (#1223) @nathanielsimard
- Fix multiple bugs (#1225) @nathanielsimard
- refactor: Line size generic (#1221) @wingertge
- refactor: Rename and refactor dynamic types (#1229) @wingertge
- rm f32 float from metal (#1233) @louisfd
- feat: Allow view layouts to infer launch info from buffer metadata (#1231) @wingertge
- Remove atomic ptr (#1228) @nathanielsimard
- Nccl all reduce (#1226) @Charles23R
- revert removing f32 atomic from metal (#1235) @louisfd
- chore: Update to wgpu v29, enable 64-bit buffers for Vulkan (#1236) @wingertge
- refactor: Merge `compilation_arg` and `register` (#1237) @wingertge
- Fix UB in memory handle location, fix cloning CubeCount::Dynamic (#1239) @ArthurBrussee
- feat: gitignore .DS_Store (#1240) @syl20bnr
- Fix 7 more cases of UB, fix flaky test (#1238) @ArthurBrussee
- fix(wgpu): flush staging buffers periodically during bulk writes (#1204) @holg
- remove nonexistent field (#1242) @louisfd
- fix(cubecl-runtime): PersistentPool HashMap key mismatch and reuse safety (#1241) @Veercodeprog
- Switch to effective_size (#1245) @nathanielsimard
- refactor: Scalars/Metadata (#1244) @wingertge
- chore: Clean up `MetadataBindingInfo` (#1248) @wingertge
- Fix: defer CPU staging buffer drops with PendingDropQueue (#1255) @nathanielsimard
- Replace bincode with ciborium for compilation cache (#1254) @Veercodeprog
- Fix GPU hangs on integrated AMD GPUs by increasing drop queue flush frequency (#1257) @nathanielsimard
- Document unsafe code in cubecl-hip/cubecl-cuda (#1258) @nathanielsimard
- Adds arena + refactor stream id (#1259) @nathanielsimard
- feat: Atomic vector (#1253) @wingertge
- fix: Fix metal compile error (#1261) @wingertge
- fix: Fix metal again, make features not mutually exclusive (#1262) @wingertge
- Try all options as fallback when autotuning (#1247) @ArthurBrussee
- fix: Use out item for atomic index so it works properly on Metal (#1265) @wingertge
- fix: Improve portability of Vulkan compiler (#1263) @wingertge
- Fix/cuda err all reduce (#1266) @Charles23R
- feat(wgpu): support zero-sized resources (#1256) @ArthurBrussee
- feat: Add `Validate` execution mode (#1268) @wingertge
- fix: Remove `const __restrict__` from atomic pointers in CUDA (#1273) @wingertge
- fix: Remove optimized casts because it's not supported (#1269) @wingertge
- Feat/vector sum (#1286) @nathanielsimard
- Fix UB in arena dropping (#1287) @ArthurBrussee
- Fix performance regression rocm (#1284) @nathanielsimard
- Fix one case of unsoundness, and two other potential bugs. (#1289) @ArthurBrussee
- Fix/tuner group (#1291) @nathanielsimard
- Simplify async tuning (#1292) @ArthurBrussee
- ci: add job to execute miri tests in ci.yml workflow (#1251) @syl20bnr
- Fix and refactor all_reduce (#1290) @Charles23R
- Track tuning lifetime (#1293) @ArthurBrussee
- Feat/device service stage (#1302) @nathanielsimard
- Fix wasm compilation (#1305) @ArthurBrussee
- Refactor cubecl.toml config (#1303) @nathanielsimard
- Fix: actually use the priority (#1307) @nathanielsimard
- Fix cubecl-common Arc (#1308) @laggui
- Fix persistent memory pool reset storage utilization when reserve (#1309) @nathanielsimard
- Add more strategies than a spin loop (#1310) @nathanielsimard
- Improve error reporting on WASM (#1306) @ArthurBrussee
- Server send recv (#1304) @Charles23R
- Fix vector size check for strided tensors with unit strides on non-axis dims (#1312) @antimora