Name | Modified | Size | Downloads / Week |
---|---|---|---|
README.md | 2025-03-19 | 2.1 kB | |
Version 0.8.1 CUDA 12.x compatibility improvements _ minor fixes source code.tar.gz | 2025-03-19 | 630.3 kB | |
Version 0.8.1 CUDA 12.x compatibility improvements _ minor fixes source code.zip | 2025-03-19 | 758.6 kB | |
Totals: 3 Items | | 1.4 MB | 1 |
Changes since v0.8.0:
CUDA 12.x compatibility
- [#711] : Added preliminary information regarding Blackwell cards and micro-architecture
- [#701] : The `--version-ident` compilation option to NVRTC was dropped in CUDA 12.2; this is now respected by the wrappers and the option is not exposed for 12.2 and newer versions of CUDA.
- [#702] : Fixed handling of `--version-ident` (we had a spacing issue)
- [#635], [#701] : Added support for the `--fdevice_syntax_only` and `--minimal` options for NVRTC compilation
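
To illustrate the version-gating idea behind #701, here is a minimal sketch in plain C++. It is not the wrappers' actual code: `toy_cuda_version`, `toy_compile_options`, `at_least()` and `render()` are hypothetical names, and the `=true` value syntax for `--version-ident` is an assumption. Each option is emitted as its own string, which is one way to avoid the kind of spacing problem mentioned in #702.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical types for illustration only; not the wrappers' API.
struct toy_cuda_version { int major; int minor; };

struct toy_compile_options {
    bool version_ident = false; // per the notes, dropped by NVRTC in CUDA 12.2
    bool minimal       = false;
};

// True when version `v` is at least `major.minor`.
inline bool at_least(toy_cuda_version v, int major, int minor)
{
    return v.major > major || (v.major == major && v.minor >= minor);
}

// Renders every enabled option as a separate string, so that no stray
// whitespace can end up inside an option. The version-ident option is
// silently skipped for CUDA 12.2 and newer.
inline std::vector<std::string> render(const toy_compile_options& opts, toy_cuda_version v)
{
    assert(v.major >= 0 && v.minor >= 0);
    std::vector<std::string> args;
    if (opts.version_ident && !at_least(v, 12, 2)) {
        args.emplace_back("--version-ident=true"); // value syntax is an assumption
    }
    if (opts.minimal) {
        args.emplace_back("--minimal");
    }
    return args;
}
```

With both options enabled, `render(opts, {12, 1})` yields two strings, while `render(opts, {12, 2})` yields only `--minimal`.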
Changes to the `unique_span` & `unique_region` classes
- [#703] : `unique_span<T>::swap()` now correctly swaps the deleters as well
- [#713] : Move constructor and assignment operator of `unique_region_t`
- [#702] : Fixed a typo when passing the `--no-source-include` option to NVRTC
- [#719] : Removed redundant cast operations from `unique_span<T>`
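
Why the deleter must be swapped along with the data (#703) can be shown with a simplified owning span. This is only a sketch under stated assumptions: `toy_unique_span` is a hypothetical class with a function-pointer deleter, not the library's `unique_span<T>`.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical, simplified owning span; not the library's unique_span<T>.
template <typename T>
class toy_unique_span {
public:
    using deleter_type = void (*)(T*);

    toy_unique_span(T* data, std::size_t size, deleter_type deleter) noexcept
        : data_(data), size_(size), deleter_(deleter)
    {
        assert(deleter != nullptr);
    }

    toy_unique_span(const toy_unique_span&) = delete;
    toy_unique_span& operator=(const toy_unique_span&) = delete;

    // Releases the owned memory with the deleter it was paired with.
    ~toy_unique_span() { if (data_ != nullptr) { deleter_(data_); } }

    // Swaps all three members. Leaving deleter_ out would make each span
    // release the other's memory with the wrong deleter.
    void swap(toy_unique_span& other) noexcept
    {
        using std::swap;
        swap(data_, other.data_);
        swap(size_, other.size_);
        swap(deleter_, other.deleter_);
    }

    T* data() const noexcept { return data_; }
    std::size_t size() const noexcept { return size_; }

private:
    T* data_;
    std::size_t size_;
    deleter_type deleter_;
};
```

If two spans were allocated differently (say, one with a host allocator and one with a device allocator), swapping only the pointer and size would leave each buffer paired with the other's deallocation function.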
Bug fixes
- [#706] : Made `context_t::flags()` non-virtual
- [#710] : Fixed the comparison operators for launch configurations
- [#709] : Span-to-C-array copy no longer ignores the designated stream
- [#708] : Avoiding infinite recursion in `link_t::add_file()`
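
The notes do not say what was wrong with the launch configuration comparisons (#710). As a general illustration of what such operators are expected to do, here is a sketch of member-wise equality on a hypothetical configuration struct; `toy_dims` and `toy_launch_config`, and their fields, are assumptions and do not mirror the library's types.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical stand-ins; the field set is an assumption.
struct toy_dims { unsigned x, y, z; };

struct toy_launch_config {
    toy_dims grid;
    toy_dims block;
    unsigned dynamic_shared_memory_bytes;
    bool cooperative;
};

inline bool operator==(const toy_dims& lhs, const toy_dims& rhs) noexcept
{
    return std::tie(lhs.x, lhs.y, lhs.z) == std::tie(rhs.x, rhs.y, rhs.z);
}

// Two configurations are equal only if every field is equal.
inline bool operator==(const toy_launch_config& lhs, const toy_launch_config& rhs) noexcept
{
    return lhs.grid == rhs.grid
        && lhs.block == rhs.block
        && lhs.dynamic_shared_memory_bytes == rhs.dynamic_shared_memory_bytes
        && lhs.cooperative == rhs.cooperative;
}

// Inequality is defined in terms of equality so the two cannot disagree.
inline bool operator!=(const toy_launch_config& lhs, const toy_launch_config& rhs) noexcept
{
    return !(lhs == rhs);
}

// Small self-check of the intended semantics.
inline void demo()
{
    toy_launch_config a{{4, 1, 1}, {256, 1, 1}, 0, false};
    toy_launch_config b = a;
    assert(a == b);
    b.block.x = 128;
    assert(a != b);
}
```

Defining `!=` as the negation of `==` keeps the pair consistent by construction, which is a common way to avoid subtle mismatches between the two operators.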
Build & installation
- [#717]: Creating possibly-missing CUDAToolkit targets in installed config files, so that library targets can rely on them: `nvfatbin`, `nvfatbin_static` and `cufilt`.
Other changes
- [#704] : Limited the clang warning flags (no `-pedantic`) to avoid warnings we can't resolve
- [#705] : Made some methods of `library_t` be `const`
- [#721] : `device::properties_t::max_in_flight_threads_on_device()` now returns an `unsigned` (rather than `unsigned long long`)
Example programs
- [#720] : Avoiding suspicious numeric conversions in the example programs (mostly inherited from NVIDIA, tsk tsk tsk)
- [#722]: In simpleCudaGraphs, when using stream capture, now enqueueing the correct, existing event rather than an anonymous transient event
- Now compiling the example programs with more warning flags on.