Showing 19 open source projects for "linux nvidia"

View related business solutions
  • Gemini 3 and 200+ AI Models on One Platform Icon
    Gemini 3 and 200+ AI Models on One Platform

    Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

    Build generative AI apps with Vertex AI. Switch between models without switching platforms.
    Start Free
  • Earn up to 16% annual interest with Nexo. Icon
    Earn up to 16% annual interest with Nexo.

    Access competitive interest rates on your digital assets.

    Generate interest, borrow against your crypto, and trade a range of cryptocurrencies — all in one platform. Geographic restrictions, eligibility, and terms apply.
    Get started with Nexo.
  • 1
    NVTX (NVIDIA Tools Extension Library)

    NVTX (NVIDIA Tools Extension Library)

    C-based Application Programming Interface (API)

    NVTX (NVIDIA Tools Extension) is a cross-platform API designed to annotate source code with rich metadata that can be consumed by developer profiling and debugging tools. It allows developers to insert markers, ranges, and events directly into their applications, providing contextual insight into how code executes on CPUs and GPUs. These annotations are visualized in tools such as NVIDIA Nsight Systems and Nsight Compute, enabling developers to identify performance bottlenecks, track...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 2
    TensorRT

    TensorRT

    C++ library for high performance inference on NVIDIA GPUs

    NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference. It includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for deep learning inference applications. TensorRT-based applications perform up to 40X faster than CPU-only platforms during inference. With TensorRT, you can optimize neural network models trained in all major frameworks, calibrate for lower precision with high accuracy, and deploy to hyperscale data centers,...
    Downloads: 19 This Week
    Last Update:
    See Project
  • 3
    ONNX Runtime

    ONNX Runtime

    ONNX Runtime: cross-platform, high performance ML inferencing

    ONNX Runtime is a cross-platform inference and training machine-learning accelerator. ONNX Runtime inference can enable faster customer experiences and lower costs, supporting models from deep learning frameworks such as PyTorch and TensorFlow/Keras as well as classical machine learning libraries such as scikit-learn, LightGBM, XGBoost, etc. ONNX Runtime is compatible with different hardware, drivers, and operating systems, and provides optimal performance by leveraging hardware accelerators...
    Downloads: 66 This Week
    Last Update:
    See Project
  • 4
    Isaac ROS Visual SLAM

    Isaac ROS Visual SLAM

    Visual SLAM/odometry package based on NVIDIA-accelerated cuVSLAM

    Discover a faster, easier way to build advanced AI robotics applications with the NVIDIA Isaac™ ROS collection of accelerated computing packages and AI models, bringing NVIDIA acceleration to ROS developers everywhere. Isaac ROS Visual SLAM provides a high-performance, best-in-class ROS 2 package for VSLAM (visual simultaneous localization and mapping). This package uses one or more stereo cameras and optionally an IMU to estimate odometry as an input to navigation. It is GPU-accelerated to...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Forever Free Full-Stack Observability | Grafana Cloud Icon
    Forever Free Full-Stack Observability | Grafana Cloud

    Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

    Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.
    Create free account
  • 5
    FlashMLA

    FlashMLA

    FlashMLA: Efficient Multi-head Latent Attention Kernels

    FlashMLA is a high-performance decoding kernel library designed especially for Multi-Head Latent Attention (MLA) workloads, targeting NVIDIA Hopper GPU architectures. It provides optimized kernels for MLA decoding, including support for variable-length sequences, helping reduce latency and increase throughput in model inference systems using that attention style. The library supports both BF16 and FP16 data types, and includes a paged KV cache implementation with a block size of 64 to...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    CUDA API Wrappers

    CUDA API Wrappers

    Thin, unified, C++-flavored wrappers for the CUDA APIs

    CUDA API Wrappers is a C++ library providing high-level, modern wrappers for NVIDIA’s CUDA runtime and driver APIs, enhancing usability and efficiency. It is intended for those who would otherwise use these APIs directly, to make working with them more intuitive and consistent, making use of modern C++ language capabilities, programming idioms, and best practices. In a nutshell - making CUDA API work more fun.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 7
    cuDF

    cuDF

    GPU DataFrame Library

    ...For additional examples, browse our complete API documentation, or check out our more detailed notebooks. cuDF can be installed with conda (miniconda, or the full Anaconda distribution) from the rapidsai channel. cuDF is supported only on Linux, and with Python versions 3.7 and later. The RAPIDS suite of open-source software libraries aims to enable the execution of end-to-end data science and analytics pipelines entirely on GPUs. It relies on NVIDIA® CUDA® primitives for low-level compute optimization but exposing that GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 8
    oneDNN

    oneDNN

    oneAPI Deep Neural Network Library (oneDNN)

    This software was previously known as Intel(R) Math Kernel Library for Deep Neural Networks (Intel(R) MKL-DNN) and Deep Neural Network Library (DNNL). oneAPI Deep Neural Network Library (oneDNN) is an open-source cross-platform performance library of basic building blocks for deep learning applications. oneDNN is part of oneAPI. The library is optimized for Intel(R) Architecture Processors, Intel Processor Graphics and Xe Architecture graphics. oneDNN has experimental support for the...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 9
    DALI

    DALI

    A GPU-accelerated library containing highly optimized building blocks

    The NVIDIA Data Loading Library (DALI) is a library for data loading and pre-processing to accelerate deep learning applications. It provides a collection of highly optimized building blocks for loading and processing image, video and audio data. It can be used as a portable drop-in replacement for built-in data loaders and data iterators in popular deep learning frameworks. Deep learning applications require complex, multi-stage data processing pipelines that include loading, decoding,...
    Downloads: 1 This Week
    Last Update:
    See Project
  • AI-generated apps that pass security review Icon
    AI-generated apps that pass security review

    Stop waiting on engineering. Build production-ready internal tools with AI—on your company data, in your cloud.

    Retool lets you generate dashboards, admin panels, and workflows directly on your data. Type something like “Build me a revenue dashboard on my Stripe data” and get a working app with security, permissions, and compliance built in from day one. Whether on our cloud or self-hosted, create the internal software your team needs without compromising enterprise standards or control.
    Try Retool free
  • 10
    Tiny CUDA Neural Networks

    Tiny CUDA Neural Networks

    Lightning fast C++/CUDA neural network framework

    This is a small, self-contained framework for training and querying neural networks. Most notably, it contains a lightning-fast "fully fused" multi-layer perceptron (technical paper), a versatile multiresolution hash encoding (technical paper), as well as support for various other input encodings, losses, and optimizers. We provide a sample application where an image function (x,y) -> (R,G,B) is learned. The fully fused MLP component of this framework requires a very large amount of shared...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 11
    Thrust

    Thrust

    The C++ parallel algorithms library

    Thrust is the C++ parallel algorithms library which inspired the introduction of parallel algorithms to the C++ Standard Library. Thrust's high-level interface greatly enhances programmer productivity while enabling performance portability between GPUs and multicore CPUs. It builds on top of established parallel programming frameworks (such as CUDA, TBB, and OpenMP). It also provides a number of general-purpose facilities similar to those found in the C++ Standard Library. The NVIDIA C++...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 12
    YOLO ROS

    YOLO ROS

    YOLO ROS: Real-Time Object Detection for ROS

    This is a ROS package developed for object detection in camera images. You only look once (YOLO) is a state-of-the-art, real-time object detection system. In the following ROS package, you are able to use YOLO (V3) on GPU and CPU. The pre-trained model of the convolutional neural network is able to detect pre-trained classes including the data set from VOC and COCO, or you can also create a network with your own detection objects. The YOLO packages have been tested under ROS Noetic and...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    Tiramisu

    Tiramisu

    Polyhedral compiler for expressing fast and portable data algorithms

    Tiramisu is a compiler for expressing fast and portable data parallel computations. It provides a simple C++ API for expressing algorithms (Tiramisu expressions) and how these algorithms should be optimized by the compiler. Tiramisu can be used in areas such as linear and tensor algebra, deep learning, image processing, stencil computations and machine learning. The Tiramisu compiler is based on the polyhedral model thus it can express a large set of loop optimizations and data layout...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Caffe

    Caffe

    A fast open framework for deep learning

    Caffe is an open source deep learning framework that’s focused on expression, speed and modularity. It’s got an expressive architecture that encourages application and innovation, and extensible code that’s great for active development. Caffe also offers great speed, capable of processing over 60M images per day with a single NVIDIA K40 GPU. It’s arguably one of the fastest convnet implementations around. Caffe is developed by the Berkeley AI Research (BAIR)/The Berkeley Vision and...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 15
    MXLib is a C++ wrapper around the Intel® Integrated Performance Primitives (IPP) library and NVidia NPP CUDA library. You can use either IPP code (or a subset of functions that do not require IPP) on the CPU side, or use NPP/CUDA on the GPU side, or use both together. The function syntax is similar to that found in MatLab and the library is designed to make it easy to port your code from MatLab to C++. The idea is to provide Scientists, Engineers, Researchers and other non full-time...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16

    CURRENNT

    CUDA-enabled machine learning library for recurrent neural networks

    CURRENNT is a machine learning library for Recurrent Neural Networks (RNNs) which uses NVIDIA graphics cards to accelerate the computations. The library implements uni- and bidirectional Long Short-Term Memory (LSTM) architectures and supports deep networks as well as very large data sets that do not fit into main memory.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    This project is a Neural Network Training Library implemented on CUDA. It's compatible with the most used libraries but allows to exploit the full power of NVIDIA graphic cards. Experimental results show speed ups over 100 times against CPU libraries
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    A data parallel scientific programming model. Compiles efficiently to different platforms like distributed memory (MPI), shared memory multi-processor (pthreads), Cell BE processor, Nvidia Cuda, SIMD vectorization (SSE, Altivec), and sequential C++ code.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    A simple, cross platform performance monitoring application specifically designed to be used with nVidia's instrumented driver and the NVPerfSDK to give a graphical representation of internal GPU counters. Support for non-GPU counters is also available.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB