Name | Modified | Size
---|---|---
README.md | 2025-08-07 | 4.0 kB
v0.28.0 source code.tar.gz | 2025-08-07 | 3.1 MB
v0.28.0 source code.zip | 2025-08-07 | 3.4 MB
Totals: 3 items | | 6.5 MB
Highlights
- First version of fused sdpa vector for CUDA
- Convolutions in CUDA
- Speed improvements in CUDA normalization layers, softmax, compiled kernels, overheads and more
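For readers unfamiliar with the term, "sdpa" stands for scaled dot-product attention. The sketch below is a plain-Python reference of what that operation computes for a single head; it is not the fused CUDA kernel from this release, and the function names `softmax` and `sdpa` are illustrative rather than MLX's API. A fused kernel would typically compute the same result without materializing the intermediate score matrix.

```python
import math


def softmax(row):
    # Subtract the maximum first so exp() cannot overflow.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def sdpa(q, k, v, scale=None):
    """Reference scaled dot-product attention for one head.

    q: list of query vectors (each of length d)
    k: list of key vectors (each of length d)
    v: list of value vectors (one per key)
    """
    d = len(q[0])
    if scale is None:
        scale = 1.0 / math.sqrt(d)  # conventional 1/sqrt(d) scaling
    out = []
    for qi in q:
        # Scaled similarity between this query and every key.
        scores = [scale * sum(a * b for a, b in zip(qi, kj)) for kj in k]
        weights = softmax(scores)
        # Weighted sum of the value vectors, one output column at a time.
        out.append([
            sum(w * vj[c] for w, vj in zip(weights, v))
            for c in range(len(v[0]))
        ])
    return out


if __name__ == "__main__":
    # Two identical keys give uniform weights, so the output
    # is the mean of the two value rows.
    q = [[1.0, 0.0]]
    k = [[1.0, 0.0], [1.0, 0.0]]
    v = [[1.0, 2.0], [3.0, 4.0]]
    print(sdpa(q, k, v))
```

The single-query input in the usage example mirrors the decoding case, where one new token attends over all cached keys and values.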
What's Changed
- [CUDA] Fix segfault on exit by @awni in https://github.com/ml-explore/mlx/pull/2424
- [CUDA] No occupancy query for launch params by @awni in https://github.com/ml-explore/mlx/pull/2426
- [CUDA] More sizes for gemv by @awni in https://github.com/ml-explore/mlx/pull/2429
- Add more CUDA architectures for PyPI package by @awni in https://github.com/ml-explore/mlx/pull/2427
- Use ccache in CI by @zcbenz in https://github.com/ml-explore/mlx/pull/2414
- [CUDA] Use aligned vector in Layer Norm and RMS norm by @awni in https://github.com/ml-explore/mlx/pull/2433
- Cuda faster softmax by @awni in https://github.com/ml-explore/mlx/pull/2435
- Remove the kernel arg from get_launch_args by @zcbenz in https://github.com/ml-explore/mlx/pull/2437
- Move arange to its own file by @zcbenz in https://github.com/ml-explore/mlx/pull/2438
- Use load_vector in arg_reduce by @zcbenz in https://github.com/ml-explore/mlx/pull/2439
- Make CI faster by @zcbenz in https://github.com/ml-explore/mlx/pull/2440
- [CUDA] Quantized refactoring by @angeloskath in https://github.com/ml-explore/mlx/pull/2442
- fix circular reference by @awni in https://github.com/ml-explore/mlx/pull/2443
- [CUDA] Fix gemv regression by @awni in https://github.com/ml-explore/mlx/pull/2445
- Fix wrong graph key when using concurrent context by @zcbenz in https://github.com/ml-explore/mlx/pull/2447
- Fix custom metal extension by @awni in https://github.com/ml-explore/mlx/pull/2446
- Add tests for export including control flow models and quantized models by @junpeiz in https://github.com/ml-explore/mlx/pull/2430
- [CUDA] Backward convolution by @zcbenz in https://github.com/ml-explore/mlx/pull/2431
- [CUDA] Save primitive inputs faster by @zcbenz in https://github.com/ml-explore/mlx/pull/2449
- [CUDA] Vectorize generated kernels by @angeloskath in https://github.com/ml-explore/mlx/pull/2444
- [CUDA] Matmul utils initial commit by @angeloskath in https://github.com/ml-explore/mlx/pull/2441
- Fix arctan2 grads by @angeloskath in https://github.com/ml-explore/mlx/pull/2453
- Use LRU cache for cuda graph by @zcbenz in https://github.com/ml-explore/mlx/pull/2448
- Add missing algorithm header to jit_compiler.cpp for Linux builds by @zamderax in https://github.com/ml-explore/mlx/pull/2460
- Default install cuda on linux by @awni in https://github.com/ml-explore/mlx/pull/2462
- fix wraps compile by @awni in https://github.com/ml-explore/mlx/pull/2461
- Feat: add USE_SYSTEM_FMT CMake option by @GaetanLepage in https://github.com/ml-explore/mlx/pull/2219
- Use SmallVector for shapes and strides by @zcbenz in https://github.com/ml-explore/mlx/pull/2454
- Fix install tags by @awni in https://github.com/ml-explore/mlx/pull/2464
- Faster gather qmm sorted test by @awni in https://github.com/ml-explore/mlx/pull/2463
- Fix cublas on h100 by @awni in https://github.com/ml-explore/mlx/pull/2466
- revert default cuda install by @awni in https://github.com/ml-explore/mlx/pull/2465
- feat: support a destinations based in tree flatten/unflatten by @LVivona in https://github.com/ml-explore/mlx/pull/2450
- Fix typo in metal command encoder by @angeloskath in https://github.com/ml-explore/mlx/pull/2471
- Update CUDA sdpa by @jagrit06 in https://github.com/ml-explore/mlx/pull/2468
- version by @awni in https://github.com/ml-explore/mlx/pull/2470
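One entry above switches the CUDA graph cache to an LRU (least recently used) policy. As a rough illustration of that eviction policy only, here is a minimal Python sketch; the class `LRUCache` and its methods are hypothetical, and this is not the C++ code from the linked pull request.

```python
from collections import OrderedDict


class LRUCache:
    """Bounded cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key, default=None):
        if key not in self._items:
            return default
        # A hit marks the entry as most recently used.
        self._items.move_to_end(key)
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        # Drop entries from the old end until we are within capacity.
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)

    def __len__(self):
        return len(self._items)


if __name__ == "__main__":
    cache = LRUCache(capacity=2)
    cache.put("graph_a", 1)
    cache.put("graph_b", 2)
    cache.get("graph_a")     # touch graph_a so graph_b becomes the oldest
    cache.put("graph_c", 3)  # evicts graph_b
    print(cache.get("graph_b"), cache.get("graph_a"), cache.get("graph_c"))
```

Bounding the cache this way keeps memory use predictable when many distinct keys are seen, at the cost of rebuilding entries that were evicted and are needed again.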
New Contributors
- @junpeiz made their first contribution in https://github.com/ml-explore/mlx/pull/2430
- @zamderax made their first contribution in https://github.com/ml-explore/mlx/pull/2460
- @GaetanLepage made their first contribution in https://github.com/ml-explore/mlx/pull/2219
- @LVivona made their first contribution in https://github.com/ml-explore/mlx/pull/2450
Full Changelog: https://github.com/ml-explore/mlx/compare/v0.27.1...v0.28.0