Name | Modified | Size | Downloads / Week |
---|---|---|---|
README.md | 2022-02-14 | 1.8 kB | |
Version 1.4.tar.gz | 2022-02-14 | 14.4 MB | |
Version 1.4.zip | 2022-02-14 | 14.9 MB | |
Totals: 3 items | | 29.3 MB | 0 |
## Changes Since Last Release

### Major Changes
- Added a PyTorch extension for using tiny-cuda-nn from within Python.
  - This functionality is considered to be in a "beta" state. Please report any issues you come across!
  - See this section of the README for installation and usage instructions.
  - Caveat: the overhead of Python/PyTorch can be significant. For example, the bundled `mlp_learning_an_image` example is ~2x slower through PyTorch than native CUDA. (This is still much faster than implementing everything from scratch in Python, but something to be aware of.)
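For orientation, a minimal usage sketch of the PyTorch extension is shown below. The configuration keys mirror the JSON configs used elsewhere in the repository; treat the exact values (resolution, layer sizes) as illustrative assumptions rather than recommendations. Running it requires a CUDA-capable GPU with the `tinycudann` package installed.

```python
import torch
import tinycudann as tcnn  # the PyTorch extension shipped with tiny-cuda-nn

# Illustrative configs; values here are assumptions, not tuned recommendations.
encoding_config = {
    "otype": "HashGrid",
    "n_levels": 16,
    "n_features_per_level": 2,
    "log2_hashmap_size": 19,
    "base_resolution": 16,
    "per_level_scale": 2.0,
}
network_config = {
    "otype": "FullyFusedMLP",
    "activation": "ReLU",
    "output_activation": "None",
    "n_neurons": 64,
    "n_hidden_layers": 2,
}

# Fused encoding + MLP, e.g. mapping 2D coordinates to RGB.
model = tcnn.NetworkWithInputEncoding(
    n_input_dims=2,
    n_output_dims=3,
    encoding_config=encoding_config,
    network_config=network_config,
)

x = torch.rand(128, 2, device="cuda")  # batch of 2D inputs
y = model(x)                           # output of shape [128, 3]
```

The model behaves like a regular `torch.nn.Module`, so it can be trained with standard PyTorch optimizers and loss functions.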
- Significantly reduced memory usage (sometimes by as much as 3x).
  - Added a GPU memory arena that permits efficient, stream-ordered allocation and de-allocation of temporary buffers. This circumvents the need for pre-allocation, often resulting in 3x lower memory consumption.
  - The memory arena uses the GPU's virtual memory mapper to achieve its performance without invalidating pointers or shuffling memory around.
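The key idea behind stream-ordered allocation is that a freed buffer only becomes reusable once the owning stream has progressed past the point of the free. The toy model below illustrates that ordering in pure Python; it is a conceptual sketch, not the CUDA implementation, and it models stream progress as a simple integer tick counter rather than real CUDA events.

```python
class StreamOrderedArena:
    """Toy model of a stream-ordered memory arena (conceptual sketch only).

    Freed regions become reusable only once the owning stream's progress
    passes the tick recorded at free time, mimicking the ordering guarantees
    of stream-ordered deallocation.
    """

    def __init__(self, size):
        self.size = size
        self.offset = 0        # bump-allocator frontier
        self.pending = []      # (stream_id, release_tick, start, length)
        self.free_blocks = []  # (start, length) regions safe to reuse

    def alloc(self, length, stream_progress):
        """stream_progress maps stream_id -> latest completed tick."""
        self._reclaim(stream_progress)
        # First-fit over reclaimed regions, then fall back to the frontier.
        for i, (start, blen) in enumerate(self.free_blocks):
            if blen >= length:
                self.free_blocks[i] = (start + length, blen - length)
                return start
        if self.offset + length > self.size:
            raise MemoryError("arena exhausted")
        start = self.offset
        self.offset += length
        return start

    def free_async(self, start, length, stream_id, current_tick):
        # The region is only safe to reuse after `stream_id` passes
        # `current_tick`; until then it stays pending.
        self.pending.append((stream_id, current_tick, start, length))

    def _reclaim(self, stream_progress):
        still_pending = []
        for stream_id, tick, start, length in self.pending:
            if stream_progress.get(stream_id, 0) >= tick:
                self.free_blocks.append((start, length))
            else:
                still_pending.append((stream_id, tick, start, length))
        self.pending = still_pending


arena = StreamOrderedArena(1024)
a = arena.alloc(256, {0: 0})
arena.free_async(a, 256, stream_id=0, current_tick=5)
b = arena.alloc(256, {0: 3})  # stream not past tick 5 yet: new region
c = arena.alloc(256, {0: 6})  # tick 5 passed: a's region is reused
```

Because reuse is deferred until the stream catches up, in-flight kernels never observe a buffer being recycled underneath them, which is what removes the need for worst-case pre-allocation.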
- All neural networks in tiny-cuda-nn now additionally support a row-major input memory layout. This affords higher performance and lower memory usage in cases that previously required a transposition.
  - `GridEncoding` naturally outputs row-major data and is thus sped up by ~20% when followed by a neural network.
- tiny-cuda-nn now runs on older GPUs, down to compute capability 37.
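The row-major point above can be illustrated with a small NumPy analogy (a host-side sketch, not the CUDA code): when a batch is stored one sample per row and the consumer only accepts column-major data, the whole batch must be physically copied; accepting row-major input directly skips that copy.

```python
import numpy as np

batch, dims = 4, 3
# Row-major ("C" order): each sample's features are contiguous in memory.
x_row = np.arange(batch * dims, dtype=np.float32).reshape(batch, dims)
assert x_row.flags["C_CONTIGUOUS"]

# If the consumer requires column-major input, a physical copy of the
# whole batch is needed:
x_col = np.asfortranarray(x_row)
assert x_col.flags["F_CONTIGUOUS"]

# Same values either way; supporting row-major input directly simply
# avoids the extra memory traffic of this copy.
assert np.array_equal(x_row, x_col)
```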
### Minor Changes
- Sped up the input gradient computation of `GridEncoding` by ~3x.
- Sped up `SyncedMultiStream`.
- Fixed incorrect gradients of `SphericalHarmonicsEncoding`.
- Fixed incorrect gradients of `GridEncoding` when `max_level` arguments were provided or `Interpolation::Nearest` was used.