cuda free download - SourceForge

Showing 28 open source projects for "cuda"

View related business solutions

Libraries Windows Clear Filters & Widen Search

Secure File Transfer for Windows with Cerberus by Redwood
Protect and share files over FTP/S, SFTP, HTTPS and SCP with the #1 rated Windows file transfer server.

Cerberus supports unlimited users and connections on a single IP, with built-in encryption, 2FA, and a browser-based web client — all deployable in under 15 minutes with a 25-day free trial.

Try for Free
Build Agents and Models on One Platform
Everything you need to build production-ready agents and models. Access 200+ Google and third-party AI models and tools.

Gemini Enterprise Agent Platform is Google Cloud's comprehensive platform for developers to build, scale, govern, and optimize agents and models. Choose from Google's most advanced models and third-party models like Anthropic's Claude Model Family.

Try It Free
1

cuda-oxide

cuda-oxide is an experimental Rust-to-CUDA compiler

cuda-oxide is an experimental NVIDIA Labs project that brings Rust closer to native CUDA GPU development. It works as a Rust-to-CUDA compiler path that lets developers write SIMT GPU kernels in idiomatic Rust instead of using a separate CUDA C++ workflow. The project compiles standard Rust code directly to PTX, avoiding DSLs, source-to-source translation, or foreign-language bindings.

Downloads: 2 This Week

Last Update: 5 days ago
See Project
2

CUDA API Wrappers

Thin, unified, C++-flavored wrappers for the CUDA APIs

...In a nutshell - making CUDA API work more fun.

Downloads: 0 This Week

Last Update: 2026-02-09
See Project
3

CuPy

A NumPy-compatible array library accelerated by CUDA

CuPy is an open source implementation of NumPy-compatible multi-dimensional array accelerated with NVIDIA CUDA. It consists of cupy.ndarray, a core multi-dimensional array class and many functions on it. CuPy offers GPU accelerated computing with Python, using CUDA-related libraries to fully utilize the GPU architecture. According to benchmarks, it can even speed up some operations by more than 100X. CuPy is highly compatible with NumPy, serving as a drop-in replacement in most cases. ...

Downloads: 26 This Week

Last Update: 2026-06-01
See Project
4

Tiny CUDA Neural Networks

Lightning fast C++/CUDA neural network framework

...It will likely only work on an RTX 3090, an RTX 2080 Ti, or high-end enterprise GPUs. Lower-end cards must reduce the n_neurons parameter or use the CutlassMLP (better compatibility but slower) instead. tiny-cuda-nn comes with a PyTorch extension that allows using the fast MLPs and input encodings from within a Python context. These bindings can be significantly faster than full Python implementations; in particular for the multiresolution hash encoding.

Downloads: 2 This Week

Last Update: 2025-07-08
See Project
Atera - an All-in-one platform for IT management
Ideal for IT departments and MSPs (managed service providers)

Your IT essentials, integrated & elevated. Take your IT management from automated to autonomous, download Atera's agent to start your free trial!

Try Atera now
5

CUDA Core Compute Libraries (CCCL)

CUDA Core Compute Libraries

CCCL, or CUDA Core Compute Libraries, is a unified repository that consolidates several foundational CUDA C++ libraries into a single, cohesive development platform. It brings together Thrust, CUB, and libcudacxx, which collectively provide high-level abstractions, low-level performance primitives, and a CUDA-compatible standard library for GPU programming.

Downloads: 1 This Week

Last Update: 2026-06-04
See Project
6

TensorRT

C++ library for high performance inference on NVIDIA GPUs

...With TensorRT, you can optimize neural network models trained in all major frameworks, calibrate for lower precision with high accuracy, and deploy to hyperscale data centers, embedded, or automotive product platforms. TensorRT is built on CUDA®, NVIDIA’s parallel programming model, and enables you to optimize inference leveraging libraries, development tools, and technologies in CUDA-X™ for artificial intelligence, autonomous machines, high-performance computing, and graphics. With new NVIDIA Ampere Architecture GPUs, TensorRT also leverages sparse tensor cores providing an additional performance boost.

Downloads: 35 This Week

Last Update: 2026-06-02
See Project
7

PyTorch Geometric

Geometric deep learning extension library for PyTorch

...We have outsourced a lot of functionality of PyTorch Geometric to other packages, which needs to be additionally installed. These packages come with their own CPU and GPU kernel implementations based on C++/CUDA extensions. We do not recommend installation as root user on your system python. Please setup an Anaconda/Miniconda environment or create a Docker image. We provide pip wheels for all major OS/PyTorch/CUDA combinations.

Downloads: 2 This Week

Last Update: 2026-06-05
See Project
8

SuiteSparse

The official SuiteSparse library: a suite of sparse matrix algorithms

The official SuiteSparse library: a suite of sparse matrix algorithms authored or co-authored by Tim Davis, Texas A&M University.

Downloads: 1 This Week

Last Update: 2026-02-10
See Project
9

ArrayFire

ArrayFire, a general purpose GPU library

...Together we can fulfill The ArrayFire Mission under an excellent Code of Conduct that promotes a respectful and friendly building experience. Rigorous benchmarks and tests ensuring top performance and numerical accuracy. Cross-platform compatibility with support for CUDA, OpenCL, and native CPU on Windows, Mac, and Linux. Built-in visualization functions through Forge.

Downloads: 3 This Week

Last Update: 2025-09-05
See Project
Our Free Plans just got better! | Auth0
With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.

Try free now
10

Faiss

Library for efficient similarity search and clustering dense vectors

Faiss is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also contains supporting code for evaluation and parameter tuning. Faiss is written in C++ with complete wrappers for Python/numpy. Some of the most useful algorithms are implemented on the GPU. It is developed by Facebook AI Research. Faiss contains several methods for similarity search. It...

Downloads: 8 This Week

Last Update: 3 days ago
See Project
11

MuJoCo Playground

An open source library for GPU-accelerated robot learning

MuJoCo Playground, developed by Google DeepMind, is a GPU-accelerated suite of simulation environments for robot learning and sim-to-real research, built on top of MuJoCo MJX. It unifies a range of control, locomotion, and manipulation tasks into a consistent and scalable framework optimized for JAX and Warp backends. The project includes classic control benchmarks from dm_control, advanced quadruped and bipedal locomotion systems, and dexterous as well as non-prehensile manipulation setups....

Downloads: 2 This Week

Last Update: 2026-04-28
See Project
12

Jupyter Docker Stacks

Ready-to-run Docker images containing Jupyter applications

Jupyter Docker Stacks provides a curated set of ready-to-run Docker container images that bundle Jupyter applications with popular data science and computing tools, enabling users to quickly start working in a reproducible environment. These stacks support a range of use cases, from lightweight base notebook images to full featured environments that include scientific computing libraries, machine learning tools, and IDE-like notebook interfaces, all within Docker containers that run...

Downloads: 1 This Week

Last Update: 2 days ago
See Project
13

Face Alignment

2D and 3D Face alignment library build using pytorch

...However, the users can alternatively use dlib, BlazeFace, or pre-existing ground truth bounding boxes. While not required, for optimal performance(especially for the detector) it is highly recommended to run the code using a CUDA-enabled GPU. While here the work is presented as a black box, if you want to know more about the intrisecs of the method please check the original paper either on arxiv or my webpage.

Downloads: 1 This Week

Last Update: 2026-04-06
See Project
14

Multimodal

TorchMultimodal is a PyTorch library

...The repository also includes example scripts and datasets for common multimodal tasks (e.g. retrieval, visual question answering, grounding) so you can test and compare models end to end. Installation supports both CPU and CUDA, and the codebase is versioned, tested, and maintained.

Downloads: 0 This Week

Last Update: 2026-01-12
See Project
15

QtAV

A multimedia framework based on Qt and FFmpeg

QtAV is a cross-platform and high performance multimedia playback framework based on Qt and FFmpeg. Features: timeline preview, gpu decoding etc

5 Reviews

Downloads: 35 This Week

Last Update: 6 hours ago
See Project
16

Thrust

The C++ parallel algorithms library

...Thrust's high-level interface greatly enhances programmer productivity while enabling performance portability between GPUs and multicore CPUs. It builds on top of established parallel programming frameworks (such as CUDA, TBB, and OpenMP). It also provides a number of general-purpose facilities similar to those found in the C++ Standard Library. The NVIDIA C++ Standard Library is an open-source project; it is available on GitHub and included in the NVIDIA HPC SDK and CUDA Toolkit. If you have one of those SDKs installed, no additional installation or compiler flags are needed to use libcu++. ...

Downloads: 0 This Week

Last Update: 2023-03-20
See Project
17

Proteus Model Builder

GUI for training of neural network models for GuitarML Proteus

GUI for easier installation and training of neural network models for guitar amplifiers and pedals, based on the GuitarML Proteus models. These are usable for Proteus, Chowdhury-DSP BYOD and even NeuralPi, on all platforms incl. Linux and RaspberryPi. What is this? GuitarML's work on Proteus, NeuralPi and Proteusboard (hardware) is amazing. https://github.com/GuitarML Yet, it is not easy to wrap your head around if you are not familiar with programming, AI, machine learning, neuronal...

Downloads: 9 This Week

Last Update: 2023-03-27
See Project
18

Darknet

Convolutional Neural Networks

...With GPU acceleration via CUDA and OpenCV integration, it achieves high performance in image recognition tasks. Its simplicity, combined with powerful capabilities, has made Darknet one of the most influential projects in the computer vision community.

Downloads: 28 This Week

Last Update: 1 day ago
See Project
19

Flux3D.jl

3D computer vision library in Julia

...This package utilizes Flux.jl and Zygote.jl as its building blocks for training 3D vision models and for supporting differentiation. This package also have support of CUDA GPU acceleration with CUDA.jl.

Downloads: 3 This Week

Last Update: 2023-12-07
See Project
20

YOLO ROS

YOLO ROS: Real-Time Object Detection for ROS

...Darknet on the CPU is fast (approximately 1.5 seconds on an Intel Core i7-6700HQ CPU @ 2.60GHz × 8) but it's like 500 times faster on GPU! You'll have to have an Nvidia GPU and you'll have to install CUDA. The CMakeLists.txt file automatically detects if you have CUDA installed or not. CUDA is a parallel computing platform and application programming interface (API) model created by Nvidia.

Downloads: 0 This Week

Last Update: 2022-08-12
See Project
21

VideoMan Library

C++ library for image acquisition and visualization

Library for capturing video from cameras, 3d sensors, frame-grabbers, video files and image sequences. It can also display multiple images using OpenGL with different layouts. Easy integration with OpenCV, CUDA... Perfect for computer vision. Keywords: video capture, computer vision, machine vision, opencv, opengl, cameras, video input devices, firewire, usb, gige

Downloads: 0 This Week

Last Update: 2018-07-19
See Project
22

OpenCV CUDA Binaries

OpenCV Pre-built CUDA binaries

... - https://www.nuget.org/packages/opencvcuda-release/ - https://www.nuget.org/packages/opencvcuda-debug/ Packages for Release and Debug configurations (due to file size limitations on nuget.org) built with CUDA Toolkit 7.0.0 targeting architectures 2.0, 2.1, 3.0, 3.5, 3.7, 5.0, 5.2 suitable for Visual Studio 2013.

3 Reviews

Downloads: 0 This Week

Last Update: 2017-03-05
See Project
23

ViennaCL

Linear algebra and solver library using CUDA, OpenCL, and OpenMP

ViennaCL provides high level C++ interfaces for linear algebra routines on CPUs and GPUs using CUDA, OpenCL, and OpenMP. The focus is on generic implementations of iterative solvers often used for large linear systems and simple integration into existing projects.

1 Review

Downloads: 27 This Week

Last Update: 2016-09-10
See Project
24

CURRENNT

CUDA-enabled machine learning library for recurrent neural networks

CURRENNT is a machine learning library for Recurrent Neural Networks (RNNs) which uses NVIDIA graphics cards to accelerate the computations. The library implements uni- and bidirectional Long Short-Term Memory (LSTM) architectures and supports deep networks as well as very large data sets that do not fit into main memory.

3 Reviews

Downloads: 0 This Week

Last Update: 2016-11-29
See Project
25

FortCUDA

This project is for CUDA bindings in F95/2003 using the Fortran 2003 ISO_C_BINDINGS module. It is intended to give near native call syntax to the CUDA SDK in Fortran 2003. Currently, most Fortran compilers are supporting the ISO_C_BINDINGS module.

Downloads: 0 This Week

Last Update: 2013-05-14
See Project