| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| README.md | 2026-02-27 | 4.0 kB | |
| v0.31.0 source code.tar.gz | 2026-02-27 | 4.2 MB | |
| v0.31.0 source code.zip | 2026-02-27 | 4.6 MB | |
| Totals: 3 Items | | 8.7 MB | 0 |
Highlights
- Initial version of QMMs for CUDA (#3160)
- JACCL mesh bandwidth improvements (#3174)
- Massive speedups for 3D convolutions (#3147)
- Continued improvements to qqmm (#3106, #3022)
What's Changed
- Patch bump by @angeloskath in https://github.com/ml-explore/mlx/pull/3102
- is_available() should check the device index too by @andresy in https://github.com/ml-explore/mlx/pull/3107
- Fix residency set with user provided buffer by @awni in https://github.com/ml-explore/mlx/pull/3108
- Cleanup test_fast_sdpa.py by @zcbenz in https://github.com/ml-explore/mlx/pull/3112
- [CUDA] Set current device before allocating memory by @zcbenz in https://github.com/ml-explore/mlx/pull/3110
- Quantize module to QQLinear by @nastya236 in https://github.com/ml-explore/mlx/pull/3106
- [CUDA] Use cuDNN SDPA for decoding when using fixed-size KV cache by @zcbenz in https://github.com/ml-explore/mlx/pull/3113
- register pressure by @nastya236 in https://github.com/ml-explore/mlx/pull/3116
- Fix precision in Metal fused attention by @awni in https://github.com/ml-explore/mlx/pull/3119
- [CUDA] Attention sinks in cuDNN SDPA by @zcbenz in https://github.com/ml-explore/mlx/pull/3118
- Fix donation in sdpa vector by @angeloskath in https://github.com/ml-explore/mlx/pull/3121
- Manage stream placement in import function by @awni in https://github.com/ml-explore/mlx/pull/3127
- fix: propagate quantization mode in QuantizedAllToShardedLinear / QuantizedShardedToAllLinear by @vskiwi in https://github.com/ml-explore/mlx/pull/3133
- [featuring] - add hanning window function by @Vlor999 in https://github.com/ml-explore/mlx/pull/3124
- feat: adding the hamming function by @Vlor999 in https://github.com/ml-explore/mlx/pull/3135
- Tensor scale nvfp4 by @nastya236 in https://github.com/ml-explore/mlx/pull/3022
- Fix fence synchronization across command buffers by @awni in https://github.com/ml-explore/mlx/pull/3144
- Export: preserve Dtype state values in export callback arguments by @skryl in https://github.com/ml-explore/mlx/pull/3145
- [Metal] Fix 32-bit integer overflow in conv3d unfold kernel by @kellen-sun in https://github.com/ml-explore/mlx/pull/3143
- [Metal][Performance] Add implicit matmul pathway for mx.conv3d by @belkakari in https://github.com/ml-explore/mlx/pull/3147
- [Metal] Fix event leak by @awni in https://github.com/ml-explore/mlx/pull/3159
- [CUDA] FPxINT quantized matmul for Hopper by @zcbenz in https://github.com/ml-explore/mlx/pull/3160
- feat: implement mlx.core.blackman by @Vlor999 in https://github.com/ml-explore/mlx/pull/3136
- Enable setting thread block cluster for Hopper and later by @zcbenz in https://github.com/ml-explore/mlx/pull/3168
- [CUDA][NCCL] group split by @nastya236 in https://github.com/ml-explore/mlx/pull/3172
- JACCL refactor and small update by @angeloskath in https://github.com/ml-explore/mlx/pull/3174
- [CUDA] Heuristics for Hopper QMM by @zcbenz in https://github.com/ml-explore/mlx/pull/3173
- Fix compile_fuse broadcast split aliasing bug by @robert-johansson in https://github.com/ml-explore/mlx/pull/3166
- Enable passing in a GPU architecture string via env var by @angeloskath in https://github.com/ml-explore/mlx/pull/3176
- Bump the minor version by @angeloskath in https://github.com/ml-explore/mlx/pull/3183
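
Three of the changes above (#3124, #3135, #3136) add the Hanning, Hamming, and Blackman window functions. As a reference for what these windows compute, here is a minimal standard-library sketch of the textbook generalized-cosine definitions. It is not MLX's implementation. It assumes the MLX versions follow the same symmetric convention as the NumPy functions of the same names; only `mlx.core.blackman` is named in these notes, and the helper `_window` is hypothetical.

```python
import math


def _window(m, coeffs):
    """Symmetric generalized cosine window of length m.

    Computes w[n] = sum_k (-1)^k * a_k * cos(2*pi*k*n / (m - 1)).
    Hypothetical helper for illustration; not part of MLX.
    """
    if m < 1:
        return []
    if m == 1:
        # A one-point window is defined as 1.0 (NumPy convention).
        return [1.0]
    return [
        sum(
            (-1) ** k * a * math.cos(2.0 * math.pi * k * n / (m - 1))
            for k, a in enumerate(coeffs)
        )
        for n in range(m)
    ]


def hanning(m):
    # w[n] = 0.5 - 0.5 * cos(2*pi*n / (m - 1))
    return _window(m, (0.5, 0.5))


def hamming(m):
    # w[n] = 0.54 - 0.46 * cos(2*pi*n / (m - 1))
    return _window(m, (0.54, 0.46))


def blackman(m):
    # w[n] = 0.42 - 0.5 * cos(2*pi*n / (m - 1)) + 0.08 * cos(4*pi*n / (m - 1))
    return _window(m, (0.42, 0.5, 0.08))
```

With these definitions, `hanning(5)` is approximately `[0, 0.5, 1, 0.5, 0]`. If the MLX functions do mirror NumPy, their output for the same length should agree with this reference up to floating-point precision.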
New Contributors
- @vskiwi made their first contribution in https://github.com/ml-explore/mlx/pull/3133
- @Vlor999 made their first contribution in https://github.com/ml-explore/mlx/pull/3124
- @skryl made their first contribution in https://github.com/ml-explore/mlx/pull/3145
- @kellen-sun made their first contribution in https://github.com/ml-explore/mlx/pull/3143
- @belkakari made their first contribution in https://github.com/ml-explore/mlx/pull/3147
- @robert-johansson made their first contribution in https://github.com/ml-explore/mlx/pull/3166
Full Changelog: https://github.com/ml-explore/mlx/compare/v0.30.6...v0.31.0