Clean and efficient FP8 GEMM kernels with fine-grained scaling
ONNX-TensorRT: TensorRT backend for ONNX
CUDA Templates for Linear Algebra Subroutines
Fast C++ library for linear algebra & scientific computing
Fast C++ library for GPU linear algebra & scientific computing
OpenCV bindings for Node.js
Lightweight GPU-based sparse matrix-vector multiplication (SpMV)
Computer vision and image processing library for Qt