linux nvidia free download

Showing 21 open source projects for "linux nvidia"

1

TensorRT

C++ library for high performance inference on NVIDIA GPUs

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference. It includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for deep learning inference applications. TensorRT-based applications perform up to 40X faster than CPU-only platforms during inference. With TensorRT, you can optimize neural network models trained in all major frameworks, calibrate for lower precision with high accuracy, and deploy to hyperscale data centers,...

Downloads: 20 This Week

Last Update: 2026-03-25
2

CuPy

A NumPy-compatible array library accelerated by CUDA

CuPy is an open source implementation of a NumPy-compatible multi-dimensional array accelerated with NVIDIA CUDA. It consists of cupy.ndarray, a core multi-dimensional array class, and many functions on it. CuPy offers GPU accelerated computing with Python, using CUDA-related libraries to fully utilize the GPU architecture. According to benchmarks, it can even speed up some operations by more than 100X. CuPy is highly compatible with NumPy, serving as a drop-in replacement in most...

Downloads: 50 This Week

Last Update: 2026-02-20
3

FlashMLA

FlashMLA: Efficient Multi-head Latent Attention Kernels

FlashMLA is a high-performance decoding kernel library designed especially for Multi-Head Latent Attention (MLA) workloads, targeting NVIDIA Hopper GPU architectures. It provides optimized kernels for MLA decoding, including support for variable-length sequences, helping reduce latency and increase throughput in model inference systems using that attention style. The library supports both BF16 and FP16 data types, and includes a paged KV cache implementation with a block size of 64 to...

Downloads: 0 This Week

Last Update: 2026-03-31
4

CUDA API Wrappers

Thin, unified, C++-flavored wrappers for the CUDA APIs

CUDA API Wrappers is a C++ library providing high-level, modern wrappers for NVIDIA’s CUDA runtime and driver APIs, enhancing usability and efficiency. It is intended for those who would otherwise use these APIs directly, to make working with them more intuitive and consistent, making use of modern C++ language capabilities, programming idioms, and best practices. In a nutshell - making CUDA API work more fun.

Downloads: 3 This Week

Last Update: 2026-02-09
5

libfabric

AWS Libfabric

...With EFA, High Performance Computing (HPC) applications using the Message Passing Interface (MPI) and Machine Learning (ML) applications using NVIDIA Collective Communications Library (NCCL) can scale to thousands of CPUs or GPUs. As a result, you get the application performance of on-premises HPC clusters with the on-demand elasticity and flexibility of the AWS cloud.

Downloads: 5 This Week

Last Update: 2026-01-22
6

oneDNN

oneAPI Deep Neural Network Library (oneDNN)

This software was previously known as Intel(R) Math Kernel Library for Deep Neural Networks (Intel(R) MKL-DNN) and Deep Neural Network Library (DNNL). oneAPI Deep Neural Network Library (oneDNN) is an open-source cross-platform performance library of basic building blocks for deep learning applications. oneDNN is part of oneAPI. The library is optimized for Intel(R) Architecture Processors, Intel Processor Graphics and Xe Architecture graphics. oneDNN has experimental support for the...

Downloads: 7 This Week

Last Update: 2026-03-30
7

cuDF

GPU DataFrame Library

...For additional examples, browse our complete API documentation, or check out our more detailed notebooks. cuDF can be installed with conda (miniconda, or the full Anaconda distribution) from the rapidsai channel. cuDF is supported only on Linux, and with Python versions 3.7 and later. The RAPIDS suite of open-source software libraries aims to enable the execution of end-to-end data science and analytics pipelines entirely on GPUs. It relies on NVIDIA® CUDA® primitives for low-level compute optimization but exposes that GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces.

Downloads: 5 This Week

Last Update: 2026-04-08
8

Tiny CUDA Neural Networks

Lightning fast C++/CUDA neural network framework

This is a small, self-contained framework for training and querying neural networks. Most notably, it contains a lightning-fast "fully fused" multi-layer perceptron (technical paper), a versatile multiresolution hash encoding (technical paper), as well as support for various other input encodings, losses, and optimizers. We provide a sample application where an image function (x,y) -> (R,G,B) is learned. The fully fused MLP component of this framework requires a very large amount of shared...

Downloads: 1 This Week

Last Update: 2025-07-08
9

DeepPavlov

A library for deep learning end-to-end dialog systems and chatbots

DeepPavlov makes it easy for beginners and experts to create dialogue systems. The best place to start is with user-friendly tutorials. They provide a quick and convenient introduction to using DeepPavlov with complete, end-to-end examples. No installation needed. Guides explain the concepts and components of DeepPavlov. Follow step-by-step instructions to install, configure and extend the DeepPavlov framework for your use case. DeepPavlov is an open-source framework for chatbots and virtual...

Downloads: 1 This Week

Last Update: 2024-08-12
10

Transformers4Rec

Transformers4Rec is a flexible and efficient library

Transformers4Rec is an advanced recommendation system library that leverages Transformer models for sequential and session-based recommendations. The library works as a bridge between natural language processing (NLP) and recommender systems (RecSys) by integrating with one of the most popular NLP frameworks, Hugging Face Transformers (HF). Transformers4Rec makes state-of-the-art transformer architectures available for RecSys researchers and industry practitioners. Traditional recommendation...

Downloads: 9 This Week

Last Update: 2025-01-24
11

NVIDIA Container Toolkit

Build and run Docker containers leveraging NVIDIA GPUs

The NVIDIA Container Toolkit allows users to build and run GPU accelerated Docker containers. The toolkit includes a container runtime library and utilities to automatically configure containers to leverage NVIDIA GPUs. Make sure you have installed the NVIDIA driver and Docker engine for your Linux distribution. Note that you do not need to install the CUDA Toolkit on the host system, but the NVIDIA driver needs to be installed.

Downloads: 8 This Week

Last Update: 2023-04-26
12

Thrust

The C++ parallel algorithms library

Thrust is the C++ parallel algorithms library which inspired the introduction of parallel algorithms to the C++ Standard Library. Thrust's high-level interface greatly enhances programmer productivity while enabling performance portability between GPUs and multicore CPUs. It builds on top of established parallel programming frameworks (such as CUDA, TBB, and OpenMP). It also provides a number of general-purpose facilities similar to those found in the C++ Standard Library. The NVIDIA C++...

Downloads: 2 This Week

Last Update: 2023-03-20
13

Proteus Model Builder

GUI for training of neural network models for GuitarML Proteus

GUI for easier installation and training of neural network models for guitar amplifiers and pedals, based on the GuitarML Proteus models. These are usable for Proteus, Chowdhury-DSP BYOD and even NeuralPi, on all platforms incl. Linux and RaspberryPi. What is this? GuitarML's work on Proteus, NeuralPi and Proteusboard (hardware) is amazing. https://github.com/GuitarML Yet, it is not easy to wrap your head around if you are not familiar with programming, AI, machine learning, neuronal...

Downloads: 9 This Week

Last Update: 2023-03-27
14

Fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python

Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling and other text generation tasks. We provide reference implementations of various sequence modeling papers. Recent work by Microsoft and Google has shown that data parallel training can be made significantly more efficient by sharding the model parameters and optimizer state across data parallel workers. These ideas are encapsulated in the...

Downloads: 1 This Week

Last Update: 2022-06-27
15

YOLO ROS

YOLO ROS: Real-Time Object Detection for ROS

This is a ROS package developed for object detection in camera images. You only look once (YOLO) is a state-of-the-art, real-time object detection system. In the following ROS package, you are able to use YOLO (V3) on GPU and CPU. The pre-trained model of the convolutional neural network is able to detect pre-trained classes including the data set from VOC and COCO, or you can also create a network with your own detection objects. The YOLO packages have been tested under ROS Noetic and...

Downloads: 0 This Week

Last Update: 2022-08-12
16

Tiramisu

Polyhedral compiler for expressing fast and portable data algorithms

Tiramisu is a compiler for expressing fast and portable data parallel computations. It provides a simple C++ API for expressing algorithms (Tiramisu expressions) and how these algorithms should be optimized by the compiler. Tiramisu can be used in areas such as linear and tensor algebra, deep learning, image processing, stencil computations and machine learning. The Tiramisu compiler is based on the polyhedral model, so it can express a large set of loop optimizations and data layout...

Downloads: 0 This Week

Last Update: 2024-05-28
17

Caffe

A fast open framework for deep learning

Caffe is an open source deep learning framework that’s focused on expression, speed and modularity. It’s got an expressive architecture that encourages application and innovation, and extensible code that’s great for active development. Caffe also offers great speed, capable of processing over 60M images per day with a single NVIDIA K40 GPU. It’s arguably one of the fastest convnet implementations around. Caffe is developed by the Berkeley AI Research (BAIR)/The Berkeley Vision and...

Downloads: 5 This Week

Last Update: 2025-03-07
18

OpenCV CUDA Binaries

OpenCV Pre-built CUDA binaries

This project is now hosted as the NuGet packages: opencvcuda-release, opencvcuda-debug
3 builds now available as NuGet packages:
- https://www.nuget.org/packages/opencvdefault/ Package for the default Windows x64 build available on opencv.org
- https://www.nuget.org/packages/opencvcontrib/ Package for Windows x64 Visual Studio 2015 for the contrib and vtk modules built with AVX, SSE & OpenGL support.
- https://www.nuget.org/packages/opencvcuda-release/ -...

3 Reviews

Downloads: 1 This Week

Last Update: 2017-03-05
19

CURRENNT

CUDA-enabled machine learning library for recurrent neural networks

CURRENNT is a machine learning library for Recurrent Neural Networks (RNNs) which uses NVIDIA graphics cards to accelerate the computations. The library implements uni- and bidirectional Long Short-Term Memory (LSTM) architectures and supports deep networks as well as very large data sets that do not fit into main memory.

3 Reviews

Downloads: 0 This Week

Last Update: 2016-11-29
20

libcudann

This project is a neural network training library implemented on CUDA. It is compatible with the most used libraries but lets you exploit the full power of NVIDIA graphics cards. Experimental results show speedups of over 100 times compared with CPU libraries.

1 Review

Downloads: 0 This Week

Last Update: 2015-01-01
21

python parallel utilities

NVIDIA CUDA and MPI Python wrappers. These wrappers are written in pure C; no SWIG or Boost is necessary. The CUDA wrapper exposes the CUDA runtime and driver APIs.

Downloads: 0 This Week

Last Update: 2014-05-08