cuda free download - SourceForge

Showing 33 open source projects for "cuda"

Filters: Software Development, C++

1

CUDA API Wrappers

Thin, unified, C++-flavored wrappers for the CUDA APIs

...In a nutshell - making CUDA API work more fun.

Downloads: 0 This Week

Last Update: 2026-02-09
See Project
2

Tiny CUDA Neural Networks

Lightning fast C++/CUDA neural network framework

...It will likely only work on an RTX 3090, an RTX 2080 Ti, or high-end enterprise GPUs. Lower-end cards must reduce the n_neurons parameter or use the CutlassMLP (better compatibility but slower) instead. tiny-cuda-nn comes with a PyTorch extension that allows using the fast MLPs and input encodings from within a Python context. These bindings can be significantly faster than full Python implementations, in particular for the multiresolution hash encoding.

Downloads: 2 This Week

Last Update: 2025-07-08
See Project
3

CUDA Core Compute Libraries (CCCL)

CUDA Core Compute Libraries

CCCL, or CUDA Core Compute Libraries, is a unified repository that consolidates several foundational CUDA C++ libraries into a single, cohesive development platform. It brings together Thrust, CUB, and libcudacxx, which collectively provide high-level abstractions, low-level performance primitives, and a CUDA-compatible standard library for GPU programming.

Downloads: 0 This Week

Last Update: 1 day ago
See Project
4

TensorRT

C++ library for high performance inference on NVIDIA GPUs

...With TensorRT, you can optimize neural network models trained in all major frameworks, calibrate for lower precision with high accuracy, and deploy to hyperscale data centers, embedded, or automotive product platforms. TensorRT is built on CUDA®, NVIDIA’s parallel programming model, and enables you to optimize inference leveraging libraries, development tools, and technologies in CUDA-X™ for artificial intelligence, autonomous machines, high-performance computing, and graphics. With new NVIDIA Ampere Architecture GPUs, TensorRT also leverages sparse tensor cores providing an additional performance boost.

Downloads: 40 This Week

Last Update: 2026-06-02
See Project
5

Shumai

Fast Differentiable Tensor Library in JavaScript & TypeScript with Bun

...The library supports matrix operations, gradient computation, and tensor conversions with intuitive APIs and near-native speed, thanks to Bun’s low-overhead FFI bindings. It can automatically leverage GPU acceleration on Linux (via CUDA) and CPU computation on macOS.

Downloads: 2 This Week

Last Update: 2 days ago
See Project
6

Halide

A language for fast, portable data-parallel computation

...It was designed to make writing high-performance image and array processing code much easier on modern machines. It works on all major operating systems and with several CPU architectures (X86, ARM, MIPS, Hexagon, PowerPC) and GPU Compute APIs (CUDA, OpenCL, OpenGL, among others). It isn't a standalone programming language, however; rather, it is embedded in C++, which means that you write C++ code, building an in-memory representation of a Halide pipeline using Halide's C++ API. This representation can then be compiled to an object file, or JIT-compiled and run in the same process. ...

Downloads: 1 This Week

Last Update: 2025-09-17
See Project
7

cuDF

GPU DataFrame Library

...It relies on NVIDIA® CUDA® primitives for low-level compute optimization but exposes that GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces.

Downloads: 1 This Week

Last Update: 2026-06-03
See Project
8

Ccache

A fast compiler cache

...Supports GCC, Clang, MSVC (Microsoft Visual C++) and other similar compilers. Works on Linux, macOS, other Unix-like operating systems and Windows. Understands C, C++, assembler, CUDA, Objective-C and Objective-C++. Supports secondary storage over HTTP (e.g. using Nginx or Google Cloud Storage), Redis or local filesystem, optionally sharding data onto a server cluster. Supports fast "direct" and "depend" modes that don't rely on using the preprocessor. Supports compression using Zstandard. Checksums cache content using XXH3 to detect data corruption. ...

Downloads: 4 This Week

Last Update: 2026-05-04
See Project
9

ArrayFire

ArrayFire, a general purpose GPU library

...Together we can fulfill The ArrayFire Mission under an excellent Code of Conduct that promotes a respectful and friendly building experience. Rigorous benchmarks and tests ensuring top performance and numerical accuracy. Cross-platform compatibility with support for CUDA, OpenCL, and native CPU on Windows, Mac, and Linux. Built-in visualization functions through Forge.

Downloads: 3 This Week

Last Update: 2025-09-05
See Project
10

Faiss

Library for efficient similarity search and clustering of dense vectors

Faiss is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also contains supporting code for evaluation and parameter tuning. Faiss is written in C++ with complete wrappers for Python/numpy. Some of the most useful algorithms are implemented on the GPU. It is developed by Facebook AI Research. Faiss contains several methods for similarity search. It...

Downloads: 6 This Week

Last Update: 4 days ago
See Project
11

MegEngine

Easy-to-use deep learning framework with 3 key features

MegEngine is a fast, scalable and easy-to-use deep learning framework with 3 key features. You can represent quantization/dynamic shape/image pre-processing and even derivation in one model. After training, just put everything into your model and run inference on any platform with ease. Speed and precision problems won't bother you anymore, because the same core is used throughout. In training, GPU memory usage could go down to one-third at the cost of only one additional line, which enables the DTR...

Downloads: 5 This Week

Last Update: 2024-04-30
See Project
12

QtAV

A multimedia framework based on Qt and FFmpeg

QtAV is a cross-platform, high-performance multimedia playback framework based on Qt and FFmpeg. Features: timeline preview, GPU decoding, etc.

5 Reviews

Downloads: 33 This Week

Last Update: 12 hours ago
See Project
13

Bandicoot

Fast C++ library for GPU linear algebra & scientific computing

* Fast GPU linear algebra library (matrix maths) for the C++ language, aiming towards a good balance between speed and ease of use
* Provides high-level syntax and functionality deliberately similar to Matlab
* Provides an API that is aiming to be compatible with Armadillo for easy transition between CPU and GPU linear algebra code
* Useful for algorithm development directly in C++, or quick conversion of research code into production environments
* Distributed under the permissive...

Downloads: 5 This Week

Last Update: 2026-05-08
See Project
14

Thrust

The C++ parallel algorithms library

...Thrust's high-level interface greatly enhances programmer productivity while enabling performance portability between GPUs and multicore CPUs. It builds on top of established parallel programming frameworks (such as CUDA, TBB, and OpenMP). It also provides a number of general-purpose facilities similar to those found in the C++ Standard Library. The NVIDIA C++ Standard Library is an open-source project; it is available on GitHub and included in the NVIDIA HPC SDK and CUDA Toolkit. If you have one of those SDKs installed, no additional installation or compiler flags are needed to use libcu++. ...

Downloads: 0 This Week

Last Update: 2023-03-20
See Project
15

Flashlight library

A C++ standalone library for machine learning

...Flashlight can be broken down into several components as described above. Each component can be incrementally built by specifying the correct build options. Flashlight is most easily built and installed with vcpkg. Both the CUDA and CPU backends are supported with vcpkg. For either backend, first install Intel MKL. Flashlight app binaries are also built for the selected features and are installed into the vcpkg install tree's tools directory.

Downloads: 0 This Week

Last Update: 2022-05-27
See Project
16

YOLO ROS

YOLO ROS: Real-Time Object Detection for ROS

...Darknet on the CPU is fast (approximately 1.5 seconds on an Intel Core i7-6700HQ CPU @ 2.60GHz × 8) but it's like 500 times faster on GPU! You'll have to have an Nvidia GPU and you'll have to install CUDA. The CMakeLists.txt file automatically detects if you have CUDA installed or not. CUDA is a parallel computing platform and application programming interface (API) model created by Nvidia.

Downloads: 0 This Week

Last Update: 2022-08-12
See Project
17

PMCGPU

Parallel simulators for Membrane Computing on the GPU

...The objective of this project (PMCGPU) is to bring together all the researchers working on the development of parallel simulators for P systems, especially those using the GPU (e.g. CUDA, OpenCL, etc.). Other parallel platforms are also welcome (multicore and manycore, FPGAs, etc.). This project was initiated by the Research Group on Natural Computing (Department of Computer Science and Artificial Intelligence, University of Seville). PMCGPU was born inside the P-Lingua project, of the same research group. ...

Downloads: 0 This Week

Last Update: 2020-02-27
See Project
18

cudatemplates

"CUDA Templates" is a collection of C++ template classes and functions which provide a consistent interface to NVidia's "Compute Unified Device Architecture" (CUDA), hiding much of the complexity of the underlying CUDA functions from the programmer.

Downloads: 0 This Week

Last Update: 2019-05-09
See Project
19

cocolib / light field suite

CUDA library for continuous optimization and light field analysis

Library for continuous convex optimization in image analysis, together with a command line tool and Matlab interface. Implements several recent algorithms for inverse problems and image segmentation with total variation regularizers and vectorial multilabel transition costs. Also included is a suite for variational light field analysis, which ties into the HCI light field benchmark set and gives reference implementations for a number of our recently published algorithms. *** NOTE: ...

2 Reviews

Downloads: 0 This Week

Last Update: 2018-11-13
See Project
20

VideoMan Library

C++ library for image acquisition and visualization

Library for capturing video from cameras, 3d sensors, frame-grabbers, video files and image sequences. It can also display multiple images using OpenGL with different layouts. Easy integration with OpenCV, CUDA... Perfect for computer vision. Keywords: video capture, computer vision, machine vision, opencv, opengl, cameras, video input devices, firewire, usb, gige

Downloads: 0 This Week

Last Update: 2018-07-19
See Project
21

ViennaCL

Linear algebra and solver library using CUDA, OpenCL, and OpenMP

ViennaCL provides high level C++ interfaces for linear algebra routines on CPUs and GPUs using CUDA, OpenCL, and OpenMP. The focus is on generic implementations of iterative solvers often used for large linear systems and simple integration into existing projects.

1 Review

Downloads: 39 This Week

Last Update: 2016-09-10
See Project
22

OpenGL Mathematics (GLM)

OpenGL Mathematics (GLM) is a C++ mathematics library for 3D software based on the OpenGL Shading Language (GLSL) specification.

Downloads: 8 This Week

Last Update: 2015-02-15
See Project
23

MXLib

MXLib is a C++ wrapper around the Intel® Integrated Performance Primitives (IPP) library and NVidia NPP CUDA library. You can use either IPP code (or a subset of functions that do not require IPP) on the CPU side, or use NPP/CUDA on the GPU side, or use both together. The function syntax is similar to that found in MatLab and the library is designed to make it easy to port your code from MatLab to C++. The idea is to provide Scientists, Engineers, Researchers and other non full-time programmers an easy to use, high performance library of functions.

1 Review

Downloads: 0 This Week

Last Update: 2015-08-14
See Project
24

CURRENNT

CUDA-enabled machine learning library for recurrent neural networks

CURRENNT is a machine learning library for Recurrent Neural Networks (RNNs) which uses NVIDIA graphics cards to accelerate the computations. The library implements uni- and bidirectional Long Short-Term Memory (LSTM) architectures and supports deep networks as well as very large data sets that do not fit into main memory.

3 Reviews

Downloads: 0 This Week

Last Update: 2016-11-29
See Project
25

HIPAcc

Heterogeneous Image Processing Acceleration (HIPACC) Framework

HIPAcc development has moved to GitHub: https://github.com/hipacc

HIPAcc allows image processing kernels and algorithms to be designed in a domain-specific language (DSL). From this high-level description, low-level target code for GPU accelerators is generated using source-to-source translation. As back ends, the framework supports CUDA, OpenCL, and Renderscript. HIPAcc allows programmers to develop imaging applications while providing high productivity, flexibility and portability as well as competitive performance: the same algorithm description serves as the basis for targeting different GPU accelerators and low-level languages.

Downloads: 0 This Week

Last Update: 2017-01-01
See Project