Search Results for "cuda"

Showing 24 open source projects for "cuda"

1

CV-CUDA

CV-CUDA™ is an open-source, GPU accelerated library

CV-CUDA is an open-source project that enables building efficient cloud-scale Artificial Intelligence (AI) imaging and computer vision (CV) applications. It uses graphics processing unit (GPU) acceleration to help developers build highly efficient pre- and post-processing pipelines. CV-CUDA originated as a collaborative effort between NVIDIA and ByteDance.

Downloads: 8 This Week

Last Update: 2025-11-15
See Project
2

Tiny CUDA Neural Networks

Lightning fast C++/CUDA neural network framework

...It will likely only work on an RTX 3090, an RTX 2080 Ti, or high-end enterprise GPUs. Lower-end cards must reduce the n_neurons parameter or use the CutlassMLP (better compatibility but slower) instead. tiny-cuda-nn comes with a PyTorch extension that allows using the fast MLPs and input encodings from within a Python context. These bindings can be significantly faster than full Python implementations, in particular for the multiresolution hash encoding.

Downloads: 2 This Week

Last Update: 2025-07-08
See Project
3

TensorRT Backend For ONNX

ONNX-TensorRT: TensorRT backend for ONNX

...For building within Docker, we recommend using and setting up the Docker containers as instructed in the main TensorRT repository. Note that this project has a dependency on CUDA. By default the build will look in /usr/local/cuda for the CUDA toolkit installation. If your CUDA path is different, overwrite the default path. ONNX models can be converted to serialized TensorRT engines using the onnx2trt executable.

Downloads: 1 This Week

Last Update: 2026-06-02
See Project
4

CUTLASS

CUDA Templates for Linear Algebra Subroutines

CUTLASS is a collection of CUDA C++ template abstractions for implementing high-performance matrix-multiplication (GEMM) and related computations at all levels and scales within CUDA. It incorporates strategies for hierarchical decomposition and data movement similar to those used to implement cuBLAS and cuDNN. CUTLASS decomposes these "moving parts" into reusable, modular software components abstracted by C++ template classes.

Downloads: 3 This Week

Last Update: 2026-05-19
See Project
5

TensorRT

C++ library for high performance inference on NVIDIA GPUs

...With TensorRT, you can optimize neural network models trained in all major frameworks, calibrate for lower precision with high accuracy, and deploy to hyperscale data centers, embedded, or automotive product platforms. TensorRT is built on CUDA®, NVIDIA’s parallel programming model, and enables you to optimize inference leveraging libraries, development tools, and technologies in CUDA-X™ for artificial intelligence, autonomous machines, high-performance computing, and graphics. With new NVIDIA Ampere Architecture GPUs, TensorRT also leverages sparse tensor cores providing an additional performance boost.

Downloads: 35 This Week

Last Update: 2026-06-02
See Project
6

OpenCV

Open Source Computer Vision Library

OpenCV (Open Source Computer Vision Library) is a comprehensive open-source library for computer vision, machine learning, and image processing. It enables developers to build real-time vision applications ranging from facial recognition to object tracking. OpenCV supports a wide range of programming languages including C++, Python, and Java, and is optimized for both CPU and GPU operations.

Downloads: 29 This Week

Last Update: 2026-06-06
See Project
7

cuML

RAPIDS Machine Learning Library

cuML is a suite of libraries that implement machine learning algorithms and mathematical primitive functions that share compatible APIs with other RAPIDS projects. cuML enables data scientists, researchers, and software engineers to run traditional tabular ML tasks on GPUs without going into the details of CUDA programming. In most cases, cuML's Python API matches the API from scikit-learn. For large datasets, these GPU-based implementations can complete 10-50x faster than their CPU equivalents. For details on performance, see the cuML Benchmarks Notebook.

Downloads: 5 This Week

Last Update: 2026-06-04
See Project
8

Torch-TensorRT

PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT

Torch-TensorRT is a compiler for PyTorch/TorchScript, targeting NVIDIA GPUs via NVIDIA’s TensorRT Deep Learning Optimizer and Runtime. Unlike PyTorch’s Just-In-Time (JIT) compiler, Torch-TensorRT is an Ahead-of-Time (AOT) compiler, meaning that before you deploy your TorchScript code, you go through an explicit compile step to convert a standard TorchScript program into a module targeting a TensorRT engine. Torch-TensorRT operates as a PyTorch extension and compiles modules that integrate...

Downloads: 16 This Week

Last Update: 6 days ago
See Project
9

Instant Neural Graphics Primitives

Instant neural graphics primitives: lightning fast NeRF and more

Instant Neural Graphics Primitives is an open-source research project developed by NVIDIA that enables extremely fast training and rendering of neural graphics representations. The system implements several neural graphics primitives including neural radiance fields, signed distance functions, neural images, and neural volumes. These representations are trained using a compact neural network combined with a multiresolution hash encoding that dramatically accelerates both training and...

Downloads: 0 This Week

Last Update: 2026-03-10
See Project
10

MegEngine

Easy-to-use deep learning framework with 3 key features

MegEngine is a fast, scalable and easy-to-use deep learning framework with 3 key features. You can represent quantization, dynamic shape, image pre-processing and even derivation in one model. After training, just put everything into your model and run inference on any platform with ease. Speed and precision problems won't bother you anymore, because the same core is used throughout. In training, GPU memory usage could go down to one-third at the cost of only one additional line, which enables the DTR...

Downloads: 4 This Week

Last Update: 2024-04-30
See Project
11

Bandicoot

fast C++ library for GPU linear algebra & scientific computing

* Fast GPU linear algebra library (matrix maths) for the C++ language, aiming towards a good balance between speed and ease of use
* Provides high-level syntax and functionality deliberately similar to Matlab
* Provides an API that is aiming to be compatible with Armadillo for easy transition between CPU and GPU linear algebra code
* Useful for algorithm development directly in C++, or quick conversion of research code into production environments
* Distributed under the permissive...

Downloads: 3 This Week

Last Update: 2026-05-08
See Project
12

Bullet Physics SDK

Real-time collision detection and multi-physics simulation for VR

...It allows different automatic differentiation backends, for forward and reverse mode gradients. TDS can be trained using deep reinforcement learning, or using gradient-based optimization (for example L-BFGS). In addition, the simulator can be run entirely on CUDA for fast rollouts, in combination with Augmented Random Search. This allows for 1 million simulation steps per second. It is highly recommended to use the PyBullet Python bindings for improved support for robotics, reinforcement learning and VR. Use pip install pybullet and check out the PyBullet Quickstart Guide.

Downloads: 7 This Week

Last Update: 2022-09-25
See Project
13

Flashlight library

A C++ standalone library for machine learning

...Flashlight can be broken down into several components as described above. Each component can be incrementally built by specifying the correct build options. Flashlight is most easily built and installed with vcpkg. Both the CUDA and CPU backends are supported with vcpkg. For either backend, first install Intel MKL. Flashlight app binaries are also built for the selected features and are installed into the vcpkg install tree's tools directory.

Downloads: 0 This Week

Last Update: 2022-05-27
See Project
14

YOLO ROS

YOLO ROS: Real-Time Object Detection for ROS

...Darknet on the CPU is fast (approximately 1.5 seconds on an Intel Core i7-6700HQ CPU @ 2.60GHz × 8), but it is roughly 500 times faster on a GPU. You will need an Nvidia GPU and a CUDA installation. The CMakeLists.txt file automatically detects whether CUDA is installed. CUDA is a parallel computing platform and application programming interface (API) model created by Nvidia.

Downloads: 0 This Week

Last Update: 2022-08-12
See Project
15

CUDA-JMI

Tool for feature selection using the JMI metric and multiple GPUs

CUDA-JMI is a parallel tool that accelerates feature selection using Joint Mutual Information as the metric. The tool receives as input a file with an ARFF, CSV or LIBSVM extension that contains the values of m individuals and n features, and returns a file with the features that provide the most non-redundant information.

Downloads: 0 This Week

Last Update: 2019-12-12
See Project
16

GPUMLib

...This library aims to provide machine learning researchers and practitioners with a high performance library by taking advantage of the GPU's enormous computational power. The library is developed in C++ and CUDA.

Downloads: 0 This Week

Last Update: 2017-09-18
See Project
17

Multiple Back-Propagation (with CUDA)

Open source software for training neural networks

Multiple Back-Propagation is an open source software application for training neural networks with the backpropagation and the multiple back propagation algorithms. Currently this project is also hosted at http://code.google.com/p/multiplebackpropagation

3 Reviews

Downloads: 4 This Week

Last Update: 2016-11-24
See Project
18

LightSpMV

lightweight GPU-based sparse matrix-vector multiplication (SpMV)

LightSpMV is a novel CUDA-compatible sparse matrix-vector multiplication (SpMV) algorithm using the standard compressed sparse row (CSR) storage format. We have evaluated LightSpMV using various sparse matrices and further compared it to the CSR-based SpMV subprograms in the state-of-the-art CUSP and cuSPARSE. Performance evaluation reveals that on a single Tesla K40c GPU, LightSpMV is superior to both CUSP and cuSPARSE, with a speedup of up to 2.60 and 2.63 over CUSP, and up to 1.93 and 1.79 over cuSPARSE for single and double precision, respectively.

Downloads: 0 This Week

Last Update: 2016-06-15
See Project
19

Accelerated Feature Extraction Tool

A fast GPU accelerated feature extraction software for speech analysis

...It incorporates standard MFCC, PLP, and TRAPS features. The tool is specially designed to process very large audio data sets. It uses GPU acceleration if a compatible GPU is available (CUDA as well as OpenCL; NVIDIA, AMD, and Intel GPUs are supported). The CPU SSE intrinsic instruction set is used when no compatible GPU is present. The output files are stored in HTK format. The software is developed at the Department of Cybernetics at the University of West Bohemia in Pilsen.

1 Review

Downloads: 0 This Week

Last Update: 2015-05-25
See Project
20

CURRENNT

CUDA-enabled machine learning library for recurrent neural networks

CURRENNT is a machine learning library for Recurrent Neural Networks (RNNs) which uses NVIDIA graphics cards to accelerate the computations. The library implements uni- and bidirectional Long Short-Term Memory (LSTM) architectures and supports deep networks as well as very large data sets that do not fit into main memory.

3 Reviews

Downloads: 0 This Week

Last Update: 2016-11-29
See Project
21

Parallel Reinforcement Evolutionary ANN

Parallel Reinforcement Evolutionary Artificial Neural Networks (PREANN) is a framework of flexible multi-layer ANNs with reinforcement learning based on genetic algorithms and a parallel implementation (using XMM registers and NVIDIA's CUDA).

Downloads: 0 This Week

Last Update: 2013-05-02
See Project
22

ViVid

Python framework for video processing and content analysis using CUDA for acceleration.

Downloads: 0 This Week

Last Update: 2013-05-14
See Project
23

Monk Computer Vision

A low code unified framework for computer vision and deep learning

Monk is an open-source, low-code programming environment that reduces the cognitive load faced by entry-level programmers while catering to the needs of expert deep learning engineers. There are three libraries in this open-source set. - Monk Classification - https://monkai.org. A unified wrapper over major deep learning frameworks. Our core focus area is at the intersection of computer vision and deep learning algorithms. - Monk Object Detection -...

Downloads: 0 This Week

Last Update: 2020-02-25
See Project
24

cerebra

a distributed engine for abstract neural network development via natural-language programming

Downloads: 0 This Week

Last Update: 2013-04-02
See Project


© 2026 Slashdot Media. All Rights Reserved.
