nvidia free download - SourceForge

Showing 21 open source projects for "nvidia"

1

TensorRT

C++ library for high performance inference on NVIDIA GPUs

...TensorRT is built on CUDA®, NVIDIA’s parallel programming model, and enables you to optimize inference leveraging libraries, development tools, and technologies in CUDA-X™ for artificial intelligence, autonomous machines, high-performance computing, and graphics. With new NVIDIA Ampere Architecture GPUs, TensorRT also leverages sparse tensor cores providing an additional performance boost.

Downloads: 16 This Week

Last Update: 2025-11-08
2

FlashMLA

FlashMLA: Efficient Multi-head Latent Attention Kernels

FlashMLA is a high-performance decoding kernel library designed especially for Multi-Head Latent Attention (MLA) workloads, targeting NVIDIA Hopper GPU architectures. It provides optimized kernels for MLA decoding, including support for variable-length sequences, helping reduce latency and increase throughput in model inference systems using that attention style. The library supports both BF16 and FP16 data types, and includes a paged KV cache implementation with a block size of 64 to efficiently manage memory during decoding. ...
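The paged KV cache mentioned above can be pictured with a small, library-independent sketch. It does not use FlashMLA's API; the function names and the example block table are hypothetical, and only the block size of 64 comes from the description. The idea is that a sequence's cache is split into fixed-size blocks, and a per-sequence block table maps each logical block to wherever it happens to live in physical memory.

```python
from typing import List, Tuple

BLOCK_SIZE = 64  # block size stated in the project description


def blocks_needed(seq_len: int, block_size: int = BLOCK_SIZE) -> int:
    """Number of fixed-size blocks required to hold seq_len cached tokens."""
    return (seq_len + block_size - 1) // block_size  # ceiling division


def locate(token_pos: int, block_table: List[int],
           block_size: int = BLOCK_SIZE) -> Tuple[int, int]:
    """Map a logical token position to (physical block id, offset in block)."""
    logical_block = token_pos // block_size
    return block_table[logical_block], token_pos % block_size


# Hypothetical block table: logical blocks 0, 1, 2 of one sequence are
# stored in physical blocks 7, 2 and 9 (they need not be contiguous).
block_table = [7, 2, 9]

print(blocks_needed(130))       # 3
print(locate(70, block_table))  # (2, 6)
```

Because blocks are allocated on demand, sequences of different lengths waste at most one partially filled block each, which is what makes the scheme a good fit for variable-length decoding.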

Downloads: 1 This Week

Last Update: 2025-10-03
3

CuPy

A NumPy-compatible array library accelerated by CUDA

CuPy is an open source implementation of a NumPy-compatible multi-dimensional array accelerated with NVIDIA CUDA. It consists of cupy.ndarray, a core multi-dimensional array class, and many functions on it. CuPy offers GPU accelerated computing with Python, using CUDA-related libraries to fully utilize the GPU architecture. According to benchmarks, it can even speed up some operations by more than 100X. CuPy is highly compatible with NumPy, serving as a drop-in replacement in most cases. ...
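As a rough illustration of the drop-in claim, the sketch below writes a function once against a module alias and runs it on either backend. The alias `xp`, the helper `normalize_rows`, and the CPU fallback are assumptions made for this example, not part of CuPy; the CuPy path needs a CUDA-capable GPU, so the code falls back to NumPy when CuPy cannot be used.

```python
import numpy as np

try:
    import cupy as xp  # GPU path: requires CuPy and a CUDA-capable device
    xp.zeros(1)        # force a device allocation so a missing GPU fails here
except Exception:
    xp = np            # CPU fallback so the sketch still runs anywhere


def normalize_rows(a):
    """Scale each row to unit Euclidean length using only NumPy-style calls."""
    norms = xp.sqrt((a * a).sum(axis=1, keepdims=True))
    return a / norms


x = xp.asarray([[3.0, 4.0], [6.0, 8.0]])
y = normalize_rows(x)

# With CuPy the result lives in GPU memory; copy it to the host before
# passing it to code that expects a NumPy array.
y_host = y if xp is np else xp.asnumpy(y)
print(y_host)
```

The same function body works unchanged on both backends; only the import and the final host copy differ.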

Downloads: 15 This Week

Last Update: 2025-08-18
4

CUDA API Wrappers

Thin, unified, C++-flavored wrappers for the CUDA APIs

CUDA API Wrappers is a C++ library providing high-level, modern wrappers for NVIDIA’s CUDA runtime and driver APIs, enhancing usability and efficiency. It is intended for those who would otherwise use these APIs directly, to make working with them more intuitive and consistent, making use of modern C++ language capabilities, programming idioms, and best practices. In a nutshell - making CUDA API work more fun.

Downloads: 3 This Week

Last Update: 2025-03-19
5

Tiny CUDA Neural Networks

Lightning fast C++/CUDA neural network framework

This is a small, self-contained framework for training and querying neural networks. Most notably, it contains a lightning-fast "fully fused" multi-layer perceptron (technical paper), a versatile multiresolution hash encoding (technical paper), as well as support for various other input encodings, losses, and optimizers. We provide a sample application where an image function (x,y) -> (R,G,B) is learned. The fully fused MLP component of this framework requires a very large amount of shared...

Downloads: 4 This Week

Last Update: 2025-07-08
6

cuDF

GPU DataFrame Library

...For additional examples, browse our complete API documentation, or check out our more detailed notebooks. cuDF can be installed with conda (miniconda, or the full Anaconda distribution) from the rapidsai channel. cuDF is supported only on Linux, and with Python versions 3.7 and later. The RAPIDS suite of open-source software libraries aims to enable the execution of end-to-end data science and analytics pipelines entirely on GPUs. It relies on NVIDIA® CUDA® primitives for low-level compute optimization but exposes that GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces.

Downloads: 3 This Week

Last Update: 2025-12-10
7

libfabric

AWS Libfabric

...Its custom-built operating system (OS) bypass hardware interface enhances the performance of inter-instance communications, which is critical to scaling these applications. With EFA, High Performance Computing (HPC) applications using the Message Passing Interface (MPI) and Machine Learning (ML) applications using NVIDIA Collective Communications Library (NCCL) can scale to thousands of CPUs or GPUs. As a result, you get the application performance of on-premises HPC clusters with the on-demand elasticity and flexibility of the AWS cloud.

Downloads: 0 This Week

Last Update: 2025-12-06
8

oneDNN

oneAPI Deep Neural Network Library (oneDNN)

...The library is optimized for Intel(R) Architecture Processors, Intel Processor Graphics and Xe Architecture graphics. oneDNN has experimental support for the following architectures: Arm* 64-bit Architecture (AArch64), NVIDIA* GPU, OpenPOWER* Power ISA (PPC64), IBMz* (s390x), and RISC-V. oneDNN is intended for deep learning applications and framework developers interested in improving application performance on Intel CPUs and GPUs. Deep learning practitioners should use one of the applications enabled with oneDNN.

Downloads: 0 This Week

Last Update: 2025-12-02
9

Transformers4Rec

Transformers4Rec is a flexible and efficient library

Transformers4Rec is an advanced recommendation system library that leverages Transformer models for sequential and session-based recommendations. The library works as a bridge between natural language processing (NLP) and recommender systems (RecSys) by integrating with one of the most popular NLP frameworks, Hugging Face Transformers (HF). Transformers4Rec makes state-of-the-art transformer architectures available for RecSys researchers and industry practitioners. Traditional recommendation...

Downloads: 2 This Week

Last Update: 2025-01-24
10

DeepPavlov

A library for deep learning end-to-end dialog systems and chatbots

DeepPavlov makes it easy for beginners and experts to create dialogue systems. The best place to start is with user-friendly tutorials. They provide a quick and convenient introduction on how to use DeepPavlov with complete, end-to-end examples. No installation needed. Guides explain the concepts and components of DeepPavlov. Follow step-by-step instructions to install, configure and extend the DeepPavlov framework for your use case. DeepPavlov is an open-source framework for chatbots and virtual...

Downloads: 0 This Week

Last Update: 2024-08-12
11

NVIDIA Container Toolkit

Build and run Docker containers leveraging NVIDIA GPUs

The NVIDIA Container Toolkit allows users to build and run GPU accelerated Docker containers. The toolkit includes a container runtime library and utilities to automatically configure containers to leverage NVIDIA GPUs. Make sure you have installed the NVIDIA driver and Docker engine for your Linux distribution. Note that you do not need to install the CUDA Toolkit on the host system, but the NVIDIA driver needs to be installed.

Downloads: 6 This Week

Last Update: 2023-04-26
12

Thrust

The C++ parallel algorithms library

...It builds on top of established parallel programming frameworks (such as CUDA, TBB, and OpenMP). It also provides a number of general-purpose facilities similar to those found in the C++ Standard Library. The NVIDIA C++ Standard Library is an open-source project; it is available on GitHub and included in the NVIDIA HPC SDK and CUDA Toolkit. If you have one of those SDKs installed, no additional installation or compiler flags are needed to use libcu++. Thrust is a header-only library; there is no need to build or install the project unless you want to run the Thrust unit tests.

Downloads: 0 This Week

Last Update: 2023-03-20
13

Proteus Model Builder

GUI for training of neural network models for GuitarML Proteus

GUI for easier installation and training of neural network models for guitar amplifiers and pedals, based on the GuitarML Proteus models. These are usable for Proteus, Chowdhury-DSP BYOD and even NeuralPi, on all platforms incl. Linux and RaspberryPi. What is this? GuitarML's work on Proteus, NeuralPi and Proteusboard (hardware) is amazing. https://github.com/GuitarML Yet, it is not easy to wrap your head around if you are not familiar with programming, AI, machine learning, neuronal...

Downloads: 6 This Week

Last Update: 2023-03-27
14

Fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python

Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling and other text generation tasks. We provide reference implementations of various sequence modeling papers. Recent work by Microsoft and Google has shown that data parallel training can be made significantly more efficient by sharding the model parameters and optimizer state across data parallel workers. These ideas are encapsulated in the...

Downloads: 0 This Week

Last Update: 2022-06-27
15

YOLO ROS

YOLO ROS: Real-Time Object Detection for ROS

...We also provide branches that work under ROS Melodic, ROS Foxy and ROS2. Darknet on the CPU is fast (approximately 1.5 seconds on an Intel Core i7-6700HQ CPU @ 2.60GHz × 8) but it's like 500 times faster on GPU! You'll have to have an Nvidia GPU and you'll have to install CUDA. The CMakeLists.txt file automatically detects if you have CUDA installed or not. CUDA is a parallel computing platform and application programming interface (API) model created by Nvidia.

Downloads: 0 This Week

Last Update: 2022-08-12
16

Tiramisu

Polyhedral compiler for expressing fast and portable data algorithms

...The Tiramisu compiler is based on the polyhedral model; thus, it can express a large set of loop optimizations and data layout transformations. Currently, it targets (1) multicore X86 CPUs, (2) Nvidia GPUs, (3) Xilinx FPGAs (Vivado HLS) and (4) distributed machines (using MPI). It is designed to enable easy integration of code generators for new architectures.

Downloads: 0 This Week

Last Update: 2024-05-28
17

Caffe

A fast open framework for deep learning

...It’s got an expressive architecture that encourages application and innovation, and extensible code that’s great for active development. Caffe also offers great speed, capable of processing over 60M images per day with a single NVIDIA K40 GPU. It’s arguably one of the fastest convnet implementations around. Caffe is developed by the Berkeley AI Research (BAIR)/The Berkeley Vision and Learning Center (BVLC) and a great community of contributors that continue to make Caffe state-of-the-art in both code and models. It’s been used in numerous projects, from startup prototypes and academic research projects, to large scale industrial applications.
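To put the quoted throughput in more familiar units, the figure of 60M images per day can be converted to images per second and time per image. This is plain arithmetic on the number in the description, assuming the GPU runs continuously for 24 hours; it is not a new benchmark.

```python
IMAGES_PER_DAY = 60_000_000          # figure quoted in the description
SECONDS_PER_DAY = 24 * 60 * 60       # 86,400 seconds

images_per_second = IMAGES_PER_DAY / SECONDS_PER_DAY
ms_per_image = 1000 / images_per_second

print(f"{images_per_second:.0f} images/s")  # 694 images/s
print(f"{ms_per_image:.2f} ms per image")   # 1.44 ms per image
```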

Downloads: 1 This Week

Last Update: 2025-03-07
18

OpenCV CUDA Binaries

OpenCV Pre-built CUDA binaries

This project is now hosted as the NuGet packages opencvcuda-release and opencvcuda-debug. 3 builds are now available as NuGet packages:
- https://www.nuget.org/packages/opencvdefault/ Package for the default Windows x64 build available on opencv.org
- https://www.nuget.org/packages/opencvcontrib/ Package for Windows x64 Visual Studio 2015 for the contrib and vtk modules built with AVX, SSE & OpenGL support.
- https://www.nuget.org/packages/opencvcuda-release/ -...

3 Reviews

Downloads: 3 This Week

Last Update: 2017-03-05
19

CURRENNT

CUDA-enabled machine learning library for recurrent neural networks

CURRENNT is a machine learning library for Recurrent Neural Networks (RNNs) which uses NVIDIA graphics cards to accelerate the computations. The library implements uni- and bidirectional Long Short-Term Memory (LSTM) architectures and supports deep networks as well as very large data sets that do not fit into main memory.

3 Reviews

Downloads: 0 This Week

Last Update: 2016-11-29
20

libcudann

This project is a neural network training library implemented on CUDA. It is compatible with the most widely used libraries but allows you to exploit the full power of NVIDIA graphics cards. Experimental results show speedups of over 100 times against CPU libraries.

1 Review

Downloads: 0 This Week

Last Update: 2015-01-01
21

python parallel utilities

NVIDIA CUDA and MPI Python wrappers. These wrappers are written in pure C; no SWIG or Boost is necessary. The CUDA wrapper exposes the CUDA runtime and driver APIs.

Downloads: 0 This Week

Last Update: 2014-05-08