cuda gpu free download

Showing 18 open source projects for "cuda gpu"

1

CUDA Core Compute Libraries (CCCL)

CUDA Core Compute Libraries

CCCL, or CUDA Core Compute Libraries, is a unified repository that consolidates several foundational CUDA C++ libraries into a single, cohesive development platform. It brings together Thrust, CUB, and libcudacxx, which collectively provide high-level abstractions, low-level performance primitives, and a CUDA-compatible standard library for GPU programming.

Downloads: 1 This Week

Last Update: 2026-06-15
See Project
2

CUDA API Wrappers

Thin, unified, C++-flavored wrappers for the CUDA APIs

CUDA API Wrappers is a C++ library providing high-level, modern wrappers for NVIDIA’s CUDA runtime and driver APIs, enhancing usability and efficiency. It is intended for those who would otherwise use these APIs directly, to make working with them more intuitive and consistent, making use of modern C++ language capabilities, programming idioms, and best practices. In a nutshell - making CUDA API work more fun.

Downloads: 2 This Week

Last Update: 2026-02-09
See Project
3

Tiny CUDA Neural Networks

Lightning fast C++/CUDA neural network framework

This is a small, self-contained framework for training and querying neural networks. Most notably, it contains a lightning-fast "fully fused" multi-layer perceptron, a versatile multiresolution hash encoding, as well as support for various other input encodings, losses, and optimizers. We provide a sample application where an image function (x,y) -> (R,G,B) is learned. The fully fused MLP component of this framework requires a very large amount of shared...

Downloads: 0 This Week

Last Update: 2025-07-08
See Project
4

Shumai

Fast Differentiable Tensor Library in JavaScript & TypeScript with Bun

...It can automatically leverage GPU acceleration on Linux (via CUDA) and CPU computation on macOS.

Downloads: 1 This Week

Last Update: 1 day ago
See Project
5

cuDF

GPU DataFrame Library

...The RAPIDS suite of open-source software libraries aims to enable the execution of end-to-end data science and analytics pipelines entirely on GPUs. It relies on NVIDIA® CUDA® primitives for low-level compute optimization but exposes that GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces.

Downloads: 0 This Week

Last Update: 2026-06-03
See Project
6

Halide

A language for fast, portable data-parallel computation

...It was designed to make writing high-performance image and array processing code much easier on modern machines. It works on all major operating systems and with several CPU architectures (X86, ARM, MIPS, Hexagon, PowerPC) and GPU compute APIs (CUDA, OpenCL, OpenGL, among others). It isn't a standalone programming language, however; rather, it is embedded in C++, which means that you write C++ code that builds an in-memory representation of a Halide pipeline using Halide's C++ API. This representation can then be compiled to an object file, or JIT-compiled and run in the same process. ...

Downloads: 0 This Week

Last Update: 2025-09-17
See Project
7

Faiss

Library for efficient similarity search and clustering of dense vectors

Faiss is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also contains supporting code for evaluation and parameter tuning. Faiss is written in C++ with complete wrappers for Python/numpy. Some of the most useful algorithms are implemented on the GPU. It is developed by Facebook AI Research. Faiss contains several methods for similarity search. It...

Downloads: 4 This Week

Last Update: 2026-06-13
See Project
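
To make "similarity search over dense vectors" concrete, here is a minimal CPU-only sketch of exhaustive nearest-neighbor search using squared L2 distance. It is not Faiss code and does not use the Faiss API; the names `squared_l2` and `nearest_neighbor` are hypothetical helpers introduced only for illustration. It shows the basic problem that libraries of this kind are built to solve at scale.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Squared Euclidean (L2) distance between two vectors of length dim.
float squared_l2(const float* a, const float* b, std::size_t dim) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < dim; ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Hypothetical helper (not part of Faiss): returns the index of the database
// row closest to the query. The database is stored row-major, one vector of
// length dim per row.
std::size_t nearest_neighbor(const std::vector<float>& database,
                             std::size_t dim,
                             const std::vector<float>& query) {
    assert(dim > 0 && query.size() == dim);
    assert(!database.empty() && database.size() % dim == 0);

    const std::size_t n = database.size() / dim;
    std::size_t best = 0;
    float best_dist = std::numeric_limits<float>::max();

    // Exhaustive scan: compare the query against every stored vector.
    for (std::size_t i = 0; i < n; ++i) {
        const float d = squared_l2(&database[i * dim], query.data(), dim);
        if (d < best_dist) {
            best_dist = d;
            best = i;
        }
    }
    return best;
}
```

The scan costs one distance computation per stored vector for every query, which is why large collections call for the optimized and GPU-backed search methods that the listing mentions.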
8

TensorRT

C++ library for high performance inference on NVIDIA GPUs

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference. It includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for deep learning inference applications. TensorRT-based applications perform up to 40X faster than CPU-only platforms during inference. With TensorRT, you can optimize neural network models trained in all major frameworks, calibrate for lower precision with high accuracy, and deploy to hyperscale data centers,...

Downloads: 27 This Week

Last Update: 2026-06-02
See Project
9

ArrayFire

ArrayFire, a general purpose GPU library

ArrayFire is a general-purpose tensor library that simplifies the process of software development for the parallel architectures found in CPUs, GPUs, and other hardware acceleration devices. The library serves users in every technical computing market. Data structures in ArrayFire are smartly managed to avoid costly memory transfers and to take advantage of each performance feature provided by the underlying hardware. The community of ArrayFire developers invites you to build with us if...

Downloads: 0 This Week

Last Update: 2025-09-05
See Project
10

MegEngine

Easy-to-use deep learning framework with 3 key features

MegEngine is a fast, scalable and easy-to-use deep learning framework with 3 key features. You can represent quantization/dynamic shape/image pre-processing and even derivation in one model. After training, just put everything into your model and run inference on any platform with ease. Speed and precision problems won't bother you anymore, because the same core is used throughout. In training, GPU memory usage could go down to one-third at the cost of only one additional line, which enables the DTR...

Downloads: 6 This Week

Last Update: 2024-04-30
See Project
11

Bandicoot

fast C++ library for GPU linear algebra & scientific computing

* Fast GPU linear algebra library (matrix maths) for the C++ language, aiming towards a good balance between speed and ease of use
* Provides high-level syntax and functionality deliberately similar to Matlab
* Provides an API that is aiming to be compatible with Armadillo for easy transition between CPU and GPU linear algebra code
* Useful for algorithm development directly in C++, or quick conversion of research code into production environments
* Distributed under the permissive...

Downloads: 5 This Week

Last Update: 2026-05-08
See Project
12

QtAV

A multimedia framework based on Qt and FFmpeg

QtAV is a cross-platform, high-performance multimedia playback framework based on Qt and FFmpeg. Features include timeline preview, GPU decoding, etc.

5 Reviews

Downloads: 29 This Week

Last Update: 16 hours ago
See Project
13

YOLO ROS

YOLO ROS: Real-Time Object Detection for ROS

...Darknet on the CPU is fast (approximately 1.5 seconds on an Intel Core i7-6700HQ CPU @ 2.60GHz × 8), but it is roughly 500 times faster on a GPU. You need an NVIDIA GPU with CUDA installed. The CMakeLists.txt file automatically detects whether CUDA is installed. CUDA is a parallel computing platform and application programming interface (API) model created by NVIDIA.

Downloads: 0 This Week

Last Update: 2022-08-12
See Project
14

PMCGPU

Parallel simulators for Membrane Computing on the GPU

...The objective of this project (PMCGPU) is to bring together all the researchers working on the development of parallel simulators for P systems, especially those using the GPU (e.g. CUDA, OpenCL). Other parallel platforms are also welcome (multicore and manycore, FPGAs, etc.). This project was initiated by the Research Group on Natural Computing (Department of Computer Science and Artificial Intelligence, University of Seville). PMCGPU was born inside the P-Lingua project, of the same research group. ...

Downloads: 0 This Week

Last Update: 2020-02-27
See Project
15

MXLib

MXLib is a C++ wrapper around the Intel® Integrated Performance Primitives (IPP) library and the NVIDIA NPP CUDA library. You can use either IPP code (or a subset of functions that do not require IPP) on the CPU side, or use NPP/CUDA on the GPU side, or use both together. The function syntax is similar to that found in MATLAB, and the library is designed to make it easy to port your code from MATLAB to C++. The idea is to provide scientists, engineers, researchers and others who are not full-time programmers with an easy-to-use, high-performance library of functions.

1 Review

Downloads: 0 This Week

Last Update: 2015-08-14
See Project
16

HIPAcc

Heterogeneous Image Processing Acceleration (HIPACC) Framework

HIPAcc development has moved to GitHub: https://github.com/hipacc HIPAcc lets you design image processing kernels and algorithms in a domain-specific language (DSL). From this high-level description, low-level target code for GPU accelerators is generated using source-to-source translation. As back ends, the framework supports CUDA, OpenCL, and Renderscript. HIPAcc allows programmers to develop imaging applications while providing high productivity, flexibility and portability as well as competitive performance: the same algorithm description serves as the basis for targeting different GPU accelerators and low-level languages.

Downloads: 0 This Week

Last Update: 2017-01-01
See Project
17

gpusmoldyn

Porting of the core simulation portions of smoldyn to the GPU, using CUDA

Downloads: 0 This Week

Last Update: 2013-04-03
See Project
18

CUJ2K: Jpeg2000 encoder on Cuda

A fast JPEG2000 encoder running on CUDA-capable GPUs. Supports lossless and lossy encoding. Available both as a command-line tool with GUI and as a CUDA/C++ library. Visit http://cuj2k.sourceforge.net for further information.

Downloads: 11 This Week

Last Update: 2013-04-02
See Project