NVIDIA Warp
A Python framework for accelerated simulation, data generation
...It enables developers to write kernel-level code in Python that is automatically compiled into efficient CUDA kernels, combining ease of use with near-native performance. The framework is designed for applications such as robotics, reinforcement learning, physical simulation, and differentiable computing, where performance and flexibility are critical. Warp provides a set of primitives for working with arrays, geometry, and physics operations, allowing users to implement complex simulations without writing low-level CUDA code directly. It also supports differentiable programming, enabling gradients to be computed through simulation pipelines, which is particularly valuable for machine learning integration.