| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| README.md | 2026-04-22 | 10.5 kB | |
| v0.31.2 source code.tar.gz | 2026-04-22 | 4.3 MB | |
| v0.31.2 source code.zip | 2026-04-22 | 4.7 MB | |
| Totals: 3 Items | | 9.0 MB | 1 |
Highlights
- Wider support for CUDA quantized matmuls (#3352, #3268, #3321, #3417, #3255)
- MLX can be used by multiple threads for independent computations (#3405, #3348, #3281, #3423)
- Added CUDA FFT support
- JACCL is now a standalone lib (#3412)
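The multi-threading highlight rests on per-thread state: several entries in this release mention a per-thread default stream and thread-local storage. As a rough illustration of that idea only, the plain-Python sketch below gives each thread its own lazily created default object. `Stream`, `default_stream`, and `run` are hypothetical names made up for this example and are not MLX APIs; MLX's real implementation lives in C++ and may differ.

```python
import threading


class Stream:
    """Hypothetical stand-in for a per-thread default stream (not an MLX type)."""

    _next_index = 0
    _lock = threading.Lock()

    def __init__(self):
        # Hand out a unique index so streams from different threads are distinguishable.
        with Stream._lock:
            self.index = Stream._next_index
            Stream._next_index += 1


# Each thread sees its own attributes on this object.
_local = threading.local()


def default_stream():
    """Return the calling thread's stream, creating it on first use."""
    if not hasattr(_local, "stream"):
        _local.stream = Stream()
    return _local.stream


def _worker(results, key):
    # Two lookups on the same thread must return the same object.
    first = default_stream()
    second = default_stream()
    results[key] = (first.index, first is second)


def run(num_threads=4):
    """Start num_threads workers and collect (stream index, same-object flag) per worker."""
    results = {}
    threads = [
        threading.Thread(target=_worker, args=(results, i))
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `run(4)` should report four distinct stream indices, one per worker, with each worker seeing a stable stream across repeated lookups. That is the property that lets independent computations proceed without sharing a default stream.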
What's Changed
- Bump by @angeloskath in https://github.com/ml-explore/mlx/pull/3244
- win: re-enable and fix cuDNN performance by @dhiltgen in https://github.com/ml-explore/mlx/pull/3242
- Fix crashes in multi-threaded process teardown by @louen in https://github.com/ml-explore/mlx/pull/3167
- [CUDA] Add FFT support by @lucasnewman in https://github.com/ml-explore/mlx/pull/3243
- [CUDA] Implement MaskedScatter by @Lyxot in https://github.com/ml-explore/mlx/pull/3151
- docs: fix PyTorch to MLX conversion example by @LxYuan0420 in https://github.com/ml-explore/mlx/pull/3265
- update requirements for Macbook Neo by @tosh in https://github.com/ml-explore/mlx/pull/3257
- fix comparison op JVP returning bool tangents instead of input dtype by @mm65x in https://github.com/ml-explore/mlx/pull/3253
- fix nn.GRU skipping bhn bias when hidden is None by @mm65x in https://github.com/ml-explore/mlx/pull/3252
- [CUDA] Pipelined QMM by @zcbenz in https://github.com/ml-explore/mlx/pull/3255
- tests: harden memory leak check in test_siblings_without_eval by @booxter in https://github.com/ml-explore/mlx/pull/3088
- Slice update with operation by @angeloskath in https://github.com/ml-explore/mlx/pull/3266
- Nax Refactor by @jagrit06 in https://github.com/ml-explore/mlx/pull/3271
- Fix building with CUDA toolkit 13.2 by @zcbenz in https://github.com/ml-explore/mlx/pull/3273
- [CUDA] fp and int4 quants for qmm_sm80 by @zcbenz in https://github.com/ml-explore/mlx/pull/3268
- Fix repr of conv layers by @angeloskath in https://github.com/ml-explore/mlx/pull/3275
- Merge DeviceStream into CommandEncoder by @zcbenz in https://github.com/ml-explore/mlx/pull/3264
- [CUDA] Search system-installed CUDA toolkit for headers by @zcbenz in https://github.com/ml-explore/mlx/pull/3277
- Create default random key lazily by @zcbenz in https://github.com/ml-explore/mlx/pull/3278
- Support indexing with any type which implements `__index__` by @aisk in https://github.com/ml-explore/mlx/pull/3210
- Fix sort NaN handling for float16 and bfloat16 by @Lyxot in https://github.com/ml-explore/mlx/pull/3269
- Use thread local storage for frontend compile cache by @zcbenz in https://github.com/ml-explore/mlx/pull/3280
- [Metal][Performance]: Add split-K for quantized matmul (small M) by @Ziqiao-git in https://github.com/ml-explore/mlx/pull/3120
- [Metal] Fix depthwise conv 1D kernel name for large variant by @Brooooooklyn in https://github.com/ml-explore/mlx/pull/3289
- Fix stale transform copy-chain leaks by @Brooooooklyn in https://github.com/ml-explore/mlx/pull/3290
- Implement Pad::vmap to replace NYI stub by @Aristide021 in https://github.com/ml-explore/mlx/pull/3304
- logo files by @andresy in https://github.com/ml-explore/mlx/pull/3308
- Fix vmap + floor_divide: preserve integer dtype by @robert-johansson in https://github.com/ml-explore/mlx/pull/3292
- Fix moved-from shape bug in broadcast_arrays causing vmap bus error by @Aristide021 in https://github.com/ml-explore/mlx/pull/3310
- Use nb::ndarray for checking arrays by @zcbenz in https://github.com/ml-explore/mlx/pull/3283
- Add output_shapes for AddMM by @pHequals7 in https://github.com/ml-explore/mlx/pull/3262
- Manage Metal objects with smart pointers by @zcbenz in https://github.com/ml-explore/mlx/pull/3282
- [CUDA] support sorting complex numbers by @Lyxot in https://github.com/ml-explore/mlx/pull/3286
- Add norm parameter to FFT transforms (backward/ortho/forward) by @Aristide021 in https://github.com/ml-explore/mlx/pull/3287
- Make each thread have its own default stream by @zcbenz in https://github.com/ml-explore/mlx/pull/3281
- [CUDA] Implement BlockMaskedMM by @Lyxot in https://github.com/ml-explore/mlx/pull/3299
- Fix np bfloat16 misinterpreted as complex by @kellen-sun in https://github.com/ml-explore/mlx/pull/3146
- Remove no longer needed const_cast by @zcbenz in https://github.com/ml-explore/mlx/pull/3325
- Bump actions/deploy-pages from 4 to 5 by @dependabot[bot] in https://github.com/ml-explore/mlx/pull/3334
- Fix use after move by @angeloskath in https://github.com/ml-explore/mlx/pull/3343
- Decouple CommandEncoder from Device by @zcbenz in https://github.com/ml-explore/mlx/pull/3316
- Add vmap for BroadcastAxes by @angeloskath in https://github.com/ml-explore/mlx/pull/3344
- Add fftfreq, rfftfreq and scalar axes for fftshift/ifftshift by @declanhealy2 in https://github.com/ml-explore/mlx/pull/3298
- [Metal] Support sorting complex numbers by @Lyxot in https://github.com/ml-explore/mlx/pull/3314
- [CUDA] Fallback QMM by @zcbenz in https://github.com/ml-explore/mlx/pull/3315
- Make CommandEncoder thread local by @zcbenz in https://github.com/ml-explore/mlx/pull/3348
- [CUDA] 3/5/6-bit quants for qmm_naive by @zcbenz in https://github.com/ml-explore/mlx/pull/3352
- Fix regression in array creation by @angeloskath in https://github.com/ml-explore/mlx/pull/3353
- Use `metal` as the front-end for the metal linker by @louen in https://github.com/ml-explore/mlx/pull/3354
- Add printoptions by @ChristophePRAT in https://github.com/ml-explore/mlx/pull/3333
- Add a convenience for making local streams in python by @angeloskath in https://github.com/ml-explore/mlx/pull/3355
- Fix CMake finding wrong Python during pip install by @fijimunkii in https://github.com/ml-explore/mlx/pull/3375
- [CUDA] Add GatherQMM for quantized gather matmul by @Lyxot in https://github.com/ml-explore/mlx/pull/3321
- fix: fail build when Metal compiler header resolution fails by @dogukanveziroglu in https://github.com/ml-explore/mlx/pull/3332
- Fix: Correct cross-attention query routing in Post-LN TransformerDecoderLayer by @suryawanshishantanu6 in https://github.com/ml-explore/mlx/pull/3382
- [CUDA] Thread safety by @zcbenz in https://github.com/ml-explore/mlx/pull/3367
- Fix test "test get streams" missing initialization by @dseredkin in https://github.com/ml-explore/mlx/pull/3376
- Conjugate VJP and JVP support by @CameronChurchwell in https://github.com/ml-explore/mlx/pull/3386
- Fix int16 overflow in SDPA NAX mask indexing for KV sequences > 32K by @Clydingus in https://github.com/ml-explore/mlx/pull/3361
- Avoid joining threads on exit by @zcbenz in https://github.com/ml-explore/mlx/pull/3388
- Add clear_streams API for cleanup before exit by @zcbenz in https://github.com/ml-explore/mlx/pull/3395
- Update nanobind version to v2.12.0 by @jrp2014 in https://github.com/ml-explore/mlx/pull/3396
- Jaccl refactor by @angeloskath in https://github.com/ml-explore/mlx/pull/3412
- Fixes for CUDA CI by @zcbenz in https://github.com/ml-explore/mlx/pull/3413
- Validate safetensors data offsets by @MillaFleurs in https://github.com/ml-explore/mlx/pull/3364
- Validate safetensors data offsets against file boundaries by @matinsaurralde in https://github.com/ml-explore/mlx/pull/3410
- Document sort stability and NaN handling by @NeuralNoble in https://github.com/ml-explore/mlx/pull/3400
- ThreadLocalStream in C++ by @zcbenz in https://github.com/ml-explore/mlx/pull/3405
- Fix jaccl init bug by @angeloskath in https://github.com/ml-explore/mlx/pull/3418
- Segmented mm nax kernel by @angeloskath in https://github.com/ml-explore/mlx/pull/3419
- [CUDA] gather_mm by @zcbenz in https://github.com/ml-explore/mlx/pull/3414
- [CUDA] GatherQMM matrix-matrix sm80/naive path by @Lyxot in https://github.com/ml-explore/mlx/pull/3417
- [CUDA] Handle residue k in qmm_naive by @zcbenz in https://github.com/ml-explore/mlx/pull/3379
- Speed up NAX split-K by better tuning and routing and fix NAX addmm by @angeloskath in https://github.com/ml-explore/mlx/pull/3422
- Make Scheduler::enqueue thread safe by @zcbenz in https://github.com/ml-explore/mlx/pull/3423
- Fix flaky TestVmap.test_vmap_masked_scatter by @zcbenz in https://github.com/ml-explore/mlx/pull/3421
- Fix synchronize for ThreadLocalStream by @angeloskath in https://github.com/ml-explore/mlx/pull/3429
- Fix bytes_per_key truncation in random kernels (Metal + CUDA) by @dogukanveziroglu in https://github.com/ml-explore/mlx/pull/3432
- Throw meaningful error when Metal device is not found by @dogukanveziroglu in https://github.com/ml-explore/mlx/pull/3428
- Fix kernel cache collision in Compiled constructor by @dogukanveziroglu in https://github.com/ml-explore/mlx/pull/3427
- Fix mx.prod vjp for complex types by @CameronChurchwell in https://github.com/ml-explore/mlx/pull/3433
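Two of the FFT entries above add a `norm` parameter (backward/ortho/forward) and `fftfreq`. The pure-Python sketch below shows what those three scaling modes and the frequency grid conventionally mean, assuming MLX follows the NumPy convention that the names suggest; that is an assumption, not something these notes confirm. `dft` and `fftfreq` here are illustrative helpers written for this example, not MLX functions, and the transform is a naive O(n^2) DFT rather than a fast algorithm.

```python
import cmath
import math


def dft(x, norm="backward", inverse=False):
    """Naive discrete Fourier transform with NumPy-style normalization modes."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[k] * cmath.exp(sign * 2j * math.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]
    if norm == "backward":
        # Unscaled forward transform, 1/n on the inverse.
        scale = 1.0 / n if inverse else 1.0
    elif norm == "ortho":
        # 1/sqrt(n) in both directions, so the transform is unitary.
        scale = 1.0 / math.sqrt(n)
    elif norm == "forward":
        # 1/n on the forward transform, unscaled inverse.
        scale = 1.0 if inverse else 1.0 / n
    else:
        raise ValueError(f"unknown norm: {norm!r}")
    return [scale * v for v in out]


def fftfreq(n, d=1.0):
    """Sample frequencies in NumPy order: non-negative first, then negative."""
    return [(k if k < (n + 1) // 2 else k - n) / (d * n) for k in range(n)]
```

Under this convention, every mode round-trips (`dft(dft(x, norm=m), norm=m, inverse=True)` recovers `x` up to rounding); the modes differ only in where the 1/n factor is applied.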
New Contributors
- @LxYuan0420 made their first contribution in https://github.com/ml-explore/mlx/pull/3265
- @tosh made their first contribution in https://github.com/ml-explore/mlx/pull/3257
- @mm65x made their first contribution in https://github.com/ml-explore/mlx/pull/3253
- @booxter made their first contribution in https://github.com/ml-explore/mlx/pull/3088
- @Ziqiao-git made their first contribution in https://github.com/ml-explore/mlx/pull/3120
- @Brooooooklyn made their first contribution in https://github.com/ml-explore/mlx/pull/3289
- @Aristide021 made their first contribution in https://github.com/ml-explore/mlx/pull/3304
- @pHequals7 made their first contribution in https://github.com/ml-explore/mlx/pull/3262
- @declanhealy2 made their first contribution in https://github.com/ml-explore/mlx/pull/3298
- @fijimunkii made their first contribution in https://github.com/ml-explore/mlx/pull/3375
- @dogukanveziroglu made their first contribution in https://github.com/ml-explore/mlx/pull/3332
- @suryawanshishantanu6 made their first contribution in https://github.com/ml-explore/mlx/pull/3382
- @dseredkin made their first contribution in https://github.com/ml-explore/mlx/pull/3376
- @CameronChurchwell made their first contribution in https://github.com/ml-explore/mlx/pull/3386
- @Clydingus made their first contribution in https://github.com/ml-explore/mlx/pull/3361
- @jrp2014 made their first contribution in https://github.com/ml-explore/mlx/pull/3396
- @matinsaurralde made their first contribution in https://github.com/ml-explore/mlx/pull/3410
- @NeuralNoble made their first contribution in https://github.com/ml-explore/mlx/pull/3400
Full Changelog: https://github.com/ml-explore/mlx/compare/v0.31.1...v0.31.2