These are the release notes for the cuda-cccl Python package version 0.5.0, dated February 5th, 2026. The previous release was v0.4.5.
cuda-cccl is in "experimental" status, meaning that its API and feature set can change quite rapidly.
Installation
Please refer to the install instructions in the cuda-cccl documentation.
⚠️ Breaking change
Object-based API requires passing the operator to the algorithm's `__call__` method
This API change affects only users of the object-based API (expert mode).
Previously, constructing an algorithm object required passing the operator as an argument, but invoking it did not:
```python
# Step 1: create the algorithm object
transformer = cuda.compute.make_unary_transform(d_input, d_output, some_unary_op)
# Step 2: invoke the algorithm
transformer(d_in1, d_out1, num_items1)  # NOTE: some_unary_op is not passed here
```
The new behaviour requires passing it in both places:
```python
# Step 1: create the algorithm object
transformer = cuda.compute.make_unary_transform(d_input, d_output, some_unary_op)
# Step 2: invoke the algorithm
transformer(d_in1, d_out1, some_unary_op, num_items1)  # NOTE: some_unary_op must be passed here
```
This change was introduced because in many situations (such as inside a loop), the operator and the globals or closures it references can change between construction and invocation, or from one invocation to the next.
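The motivation can be illustrated with a minimal pure-Python sketch. The classes `CapturedAtConstruction` and `PassedAtCall` and the helper `make_scaler` are hypothetical names invented for this illustration. They are not part of cuda.compute and do not model how it compiles or caches operators. They only show why an operator fixed at construction time can go stale when a loop rebuilds the closure on every iteration.

```python
# Pure-Python illustration only. These classes are hypothetical and are
# NOT part of cuda.compute; they run on the CPU with plain lists.


class CapturedAtConstruction:
    """Old-style pattern: the operator is fixed when the object is built."""

    def __init__(self, op):
        self._op = op

    def __call__(self, values):
        # Always applies the operator seen at construction time.
        return [self._op(v) for v in values]


class PassedAtCall:
    """New-style pattern: the operator is supplied on every invocation."""

    def __init__(self, op):
        # The construction-time operator is kept only for one-time setup.
        self._setup_op = op

    def __call__(self, values, op):
        # Applies whichever operator the caller passes right now.
        return [op(v) for v in values]


def make_scaler(factor):
    """Return a closure that multiplies its argument by `factor`."""

    def scale(x):
        return x * factor

    return scale


data = [1, 2, 3]
old_style = CapturedAtConstruction(make_scaler(1))
new_style = PassedAtCall(make_scaler(1))

old_results = []
new_results = []
for factor in (1, 2, 3):
    op = make_scaler(factor)  # a fresh closure on each iteration
    old_results.append(old_style(data))      # still uses factor == 1
    new_results.append(new_style(data, op))  # uses the current closure

print(old_results)
print(new_results)
```

In this sketch the old-style object keeps producing the result for the first closure, while the new-style object follows the closure that is current at each call, which is the situation the breaking change is meant to make explicit.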
Features
Improvements
- Avoid unnecessary recompilation of stateful operators (https://github.com/NVIDIA/cccl/pull/7500)
- Improved cache lookup performance (https://github.com/NVIDIA/cccl/pull/7501)
Bug Fixes
- Fix handling of boolean types in cuda.compute (https://github.com/NVIDIA/cccl/pull/7389)