| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| 0.49.0 source code.tar.gz | 2025-12-11 | 293.4 kB | |
| 0.49.0 source code.zip | 2025-12-11 | 380.5 kB | |
| README.md | 2025-12-11 | 5.3 kB | |
| Totals: 3 Items | | 679.2 kB | 0 |
Highlights
x86-64 CPU Improvements
CPU performance for 4bit quantization is significantly improved on x86-64, with optimized kernel paths for CPUs that support AVX512 or AVX512BF16.
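To see which of these paths a given Linux machine might qualify for, one can inspect the CPU feature flags reported by the kernel. The sketch below is not how bitsandbytes selects its kernels; it only parses the `flags` line of `/proc/cpuinfo`, assuming the usual Linux flag names `avx512f` and `avx512_bf16`. The helper name `avx512_support` is hypothetical.

```python
from pathlib import Path


def avx512_support(cpuinfo_text: str) -> dict:
    """Report AVX512 feature flags found in /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        # x86 Linux lists CPU features on lines that start with "flags".
        if line.startswith("flags"):
            _, _, values = line.partition(":")
            flags.update(values.split())
    return {
        "avx512": "avx512f" in flags,           # AVX512 Foundation
        "avx512_bf16": "avx512_bf16" in flags,  # AVX512 BF16 extension
    }


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print(avx512_support(cpuinfo.read_text()))
    else:
        print("No /proc/cpuinfo here; this sketch only covers Linux.")
```

A result with both values `False` only says the flags were not reported; it says nothing about whether 4bit works on that CPU at all.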
AMD ROCm Experimental Wheels
- Experimental support for AMD devices is now included in our PyPI wheels on Linux x86-64.
- We've added additional GPU target devices as outlined in our docs.
- Support for using the default blocksize of 64 for 4bit was added for RDNA GPUs in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1748.
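For readers unfamiliar with the term, "blocksize" is the number of consecutive values that share one scaling factor in blockwise quantization. The following is a simplified pure-Python illustration of that idea, assuming absmax scaling and a uniform signed 4-bit grid. It is not the library's NF4/FP4 implementation or its kernel code, and the function names are hypothetical.

```python
def quantize_blockwise(values, blocksize=64):
    """Quantize floats to integer codes in [-7, 7], one absmax scale per block."""
    codes, scales = [], []
    for start in range(0, len(values), blocksize):
        block = values[start:start + blocksize]
        absmax = max(abs(v) for v in block) or 1.0  # avoid dividing by zero
        scales.append(absmax)
        codes.extend(round(v / absmax * 7) for v in block)
    return codes, scales


def dequantize_blockwise(codes, scales, blocksize=64):
    """Map codes back to floats using the scale of the block each one belongs to."""
    return [c / 7 * scales[i // blocksize] for i, c in enumerate(codes)]


if __name__ == "__main__":
    data = [0.01 * i for i in range(-64, 64)]  # 128 values -> 2 blocks of 64
    codes, scales = quantize_blockwise(data)
    restored = dequantize_blockwise(codes, scales)
    print(len(scales), max(abs(a - b) for a, b in zip(data, restored)))
```

Smaller blocks mean more scales to store, but each scale fits its local range of values more tightly.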
macOS 14+ Wheels
- We're now publishing wheels for macOS 14+!
- The 4bit and 8bit quantization features are supported on MPS, for now through slow implementations. We plan to enable Metal kernels with improved performance in the future.
🚨 Breaking Changes
- Dropped support for Python 3.9.
- Dropped compilation support for Maxwell GPUs in the CUDA backend.
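A quick pre-upgrade check for these two changes might look like the sketch below. It assumes that dropping Python 3.9 makes 3.10 the oldest supported interpreter, and that "Maxwell" corresponds to CUDA compute capability 5.x (5.0, 5.2, 5.3). The helper names are hypothetical; the compute capability itself has to come from elsewhere, for example `torch.cuda.get_device_capability()`.

```python
import sys


def python_is_supported(version=sys.version_info) -> bool:
    """True if the interpreter is newer than the dropped Python 3.9."""
    return tuple(version[:2]) >= (3, 10)


def is_maxwell(compute_capability) -> bool:
    """True for CUDA compute capability 5.x, i.e. Maxwell-generation GPUs."""
    major, _minor = compute_capability
    return major == 5


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    # (5, 2) is an example value; pass your own device's capability instead.
    print("Maxwell GPU:", is_maxwell((5, 2)))
```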
What's Changed
- [ROCm] Update build targets by @matthewdouglas in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1788
- Drop Python 3.9 support by @matthewdouglas in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1795
- Fix indexing overflow issue for blockwise quantization on AMD by @sstamenk in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1796
- Tests: Run CPU tests against PyTorch 2.9 by @matthewdouglas in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1797
- Remove deprecated code by @matthewdouglas in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1798
- Cpu C++ kernel by @jiqing-feng in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1789
- fix build error: "no case matching constant switch condition" by @yuguo68 in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1802
- CI: skip rebuilding CPU lib when building/installing wheels by @matthewdouglas in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1803
- add support for 64 block size on 32 warp size supported amd gpus by @electron271 in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1748
- Enable more tests on AMD for warp size 32 by @sstamenk in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1805
- CUDA: Drop compilation compatibility with Maxwell by @matthewdouglas in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1806
- ROCm: Add build for ROCm 7.1 by @matthewdouglas in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1807
- CI: Enable tests on Linux x86-64 with CUDA 13 by @matthewdouglas in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1808
- Replace NULL with nullptr in pythonInterface.cpp by @yuguo68 in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1809
- CI: Run tests on PRs, refactor nightly test workflow by @matthewdouglas in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1811
- Remove old nightly workflow by @matthewdouglas in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1812
- Cpu fused kernel by @jiqing-feng in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1804
- Update README by @matthewdouglas in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1816
- Cleanup: remove FastBinarySearch by @matthewdouglas in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1817
- Enable publishing of macOS wheel by @matthewdouglas in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1818
- ROCm: reduce size of builds by @matthewdouglas in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1819
- CUDA 13: aggressive compression of binary size by @matthewdouglas in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1820
- ROCm: Add gfx1150/gfx1151 to build targets by @matthewdouglas in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1822
- Update workflow dependencies by @matthewdouglas in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1824
- Hf kernel by @jiqing-feng in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1814
- CUDA/ROCm: Remove dead code by @matthewdouglas in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1827
- CPU: workaround avx512 4bit dequantize accuracy issue for large blocksize by @matthewdouglas in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1828
- Update installation doc by @matthewdouglas in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1830
- Add release for DGX Spark cuda121 by @mfuntowicz in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1829
- Fix: Python 3.14 compatibility with PyTorch 2.9 by @matthewdouglas in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1831
New Contributors
- @sstamenk made their first contribution in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1796
- @yuguo68 made their first contribution in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1802
- @electron271 made their first contribution in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1748
- @mfuntowicz made their first contribution in https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1829
Full Changelog: https://github.com/bitsandbytes-foundation/bitsandbytes/compare/0.48.2...0.49.0