Showing 307 open source projects for "cuda"

  • 1
    CUDA.jl

    CUDA programming in Julia

    High-performance GPU programming in a high-level language. JuliaGPU is a GitHub organization created to unify the many packages for programming GPUs in Julia. With its high-level syntax and flexible compiler, Julia is well-positioned to productively program hardware accelerators like GPUs without sacrificing performance. The latest development version of CUDA.jl requires Julia 1.8 or higher. If you are using an older version of Julia, you need to use a previous version of CUDA.jl. This will...
    Downloads: 0 This Week
  • 2
    CV-CUDA

    CV-CUDA™ is an open-source, GPU-accelerated library

    CV-CUDA is an open-source project that enables building efficient cloud-scale Artificial Intelligence (AI) imaging and computer vision (CV) applications. It uses graphics processing unit (GPU) acceleration to help developers build highly efficient pre- and post-processing pipelines. CV-CUDA originated as a collaborative effort between NVIDIA and ByteDance.
    Downloads: 1 This Week
  • 3
    Tiny CUDA Neural Networks

    Lightning fast C++/CUDA neural network framework

    ... memory in its default configuration. It will likely only work on an RTX 3090, an RTX 2080 Ti, or high-end enterprise GPUs. Lower-end cards must reduce the n_neurons parameter or use the CutlassMLP (better compatibility but slower) instead. tiny-cuda-nn comes with a PyTorch extension that allows using the fast MLPs and input encodings from within a Python context. These bindings can be significantly faster than full Python implementations; in particular for the multiresolution hash encoding.
    Downloads: 0 This Week
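
    A hedged sketch of the PyTorch extension mentioned above, assuming the tinycudann bindings are installed and a supported NVIDIA GPU is present; the config keys follow the project's documented JSON format and may differ between versions.

```python
# Hypothetical minimal sketch of the tiny-cuda-nn PyTorch bindings (assumed API).
import torch
import tinycudann as tcnn

# Multiresolution hash encoding feeding a fully fused MLP; values are illustrative.
encoding_cfg = {"otype": "HashGrid", "n_levels": 16, "n_features_per_level": 2,
                "log2_hashmap_size": 19, "base_resolution": 16, "per_level_scale": 2.0}
network_cfg = {"otype": "FullyFusedMLP", "activation": "ReLU",
               "output_activation": "None", "n_neurons": 64, "n_hidden_layers": 2}

model = tcnn.NetworkWithInputEncoding(3, 1, encoding_cfg, network_cfg)

x = torch.rand(1024, 3, device="cuda")   # inputs must live on the GPU
y = model(x)                             # differentiable like any torch.nn.Module
print(y.shape)
```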
  • 4
    GLM

    OpenGL Mathematics (GLM)

    OpenGL Mathematics (GLM) is a header-only C++ mathematics library for graphics software based on the OpenGL Shading Language (GLSL) specifications. GLM provides classes and functions designed and implemented with the same naming conventions and functionality as GLSL, so that anyone who knows GLSL can also use GLM in C++. This project isn't limited to GLSL features. An extension system, based on the GLSL extension conventions, provides extended capabilities: matrix transformations,...
    Downloads: 143 This Week
  • 5
    Downloads: 7 This Week
  • 6
    TSNE-CUDA

    GPU Accelerated t-SNE for CUDA with Python bindings

    This repo is an optimized CUDA implementation of the FIt-SNE algorithm with associated Python modules. We find that our implementation of t-SNE can be up to 1200x faster than scikit-learn, or up to 50x faster than Multicore-TSNE, when used with the right GPU. You can install binaries with Anaconda for CUDA versions 10.1 and 10.2 using conda install tsnecuda -c conda-forge. tsnecuda supports CUDA versions 9.0 and later through source installation; check out the wiki for up-to-date installation instructions. Time...
    Downloads: 0 This Week
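
    A hedged sketch of the scikit-learn-style interface implied above, assuming the tsnecuda package is installed and a CUDA-capable GPU is available; class and argument names may vary by version.

```python
# Hypothetical usage sketch of tsnecuda's sklearn-like TSNE class (assumed API).
import numpy as np
from tsnecuda import TSNE

X = np.random.rand(5000, 128).astype(np.float32)      # toy high-dimensional data
embedding = TSNE(n_components=2, perplexity=30).fit_transform(X)
print(embedding.shape)                                 # expected: (5000, 2)
```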
  • 7
    GFPGAN

    GFPGAN aims at developing Practical Algorithms

    GFPGAN aims at developing practical algorithms for real-world face restoration. Demos are available on Colab (plus another Colab demo for the original paper model), Hugging Face (returns only the cropped face), Replicate.ai (may need to sign in; returns the whole image), and Baseten.co (GPU-backed; returns the whole image). A clean version of GFPGAN is provided that can run without custom CUDA extensions, so it also works on Windows or in CPU-only mode.
    Downloads: 68 This Week
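
    A hedged sketch of GFPGAN's Python inference interface, assuming the gfpgan package, OpenCV, and a downloaded checkpoint; the GFPGANv1.3.pth path, input file name, and argument names here are assumptions and may differ by release.

```python
# Hypothetical face-restoration sketch with the gfpgan package (assumed API).
import cv2
from gfpgan import GFPGANer

restorer = GFPGANer(model_path="GFPGANv1.3.pth",   # assumed local checkpoint path
                    upscale=2, arch="clean", channel_multiplier=2, bg_upsampler=None)

img = cv2.imread("degraded_face.jpg", cv2.IMREAD_COLOR)   # placeholder input image
_, _, restored = restorer.enhance(img, has_aligned=False,
                                  only_center_face=False, paste_back=True)
cv2.imwrite("restored_face.jpg", restored)
```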
  • 8
    XMRig

    RandomX, KawPow, CryptoNight, AstroBWT and GhostRider unified miner

    XMRig is a high-performance, open-source, cross-platform RandomX, KawPow, CryptoNight, and AstroBWT unified CPU/GPU miner, RandomX benchmark, and stratum proxy. Official binaries are available for Windows, Linux, macOS, and FreeBSD. The preferred way to configure the miner is the JSON config file, as it is more flexible and human-friendly. The command-line interface...
    Downloads: 16 This Week
  • 9
    Dream Textures

    Stable Diffusion built-in to Blender

    ... you're looking for. Texture entire models and scenes with depth-to-image. Inpaint to fix up images and convert existing textures into seamless ones automatically. Outpaint to increase the size of an image by extending it in any direction. Perform style transfer and create novel animations with Stable Diffusion as a post-processing step. Dream Textures has been tested with CUDA and Apple Silicon GPUs. Over 4 GB of VRAM is recommended.
    Downloads: 11 This Week
  • 10
    opencvsharp

    OpenCV wrapper for .NET

    OpenCV wrapper for .NET. Some Docker images are provided for using OpenCvSharp with AppEngine Flexible. The native binding (libOpenCvSharpExtern) is already built into the Docker image, so you don't need to worry about it. OpenCvSharp won't work on the Unity and Xamarin platforms; for Unity, please consider using OpenCV for Unity or some other solution. OpenCvSharp does not support CUDA; if you want to use the CUDA features, you need to customize the native bindings yourself. Objects of classes...
    Downloads: 11 This Week
  • 11
    TensorRT

    C++ library for high performance inference on NVIDIA GPUs

    ..., embedded, or automotive product platforms. TensorRT is built on CUDA®, NVIDIA’s parallel programming model, and enables you to optimize inference leveraging libraries, development tools, and technologies in CUDA-X™ for artificial intelligence, autonomous machines, high-performance computing, and graphics. With new NVIDIA Ampere Architecture GPUs, TensorRT also leverages sparse tensor cores providing an additional performance boost.
    Downloads: 8 This Week
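
    A hedged sketch of building a TensorRT engine from an ONNX file with the Python API, assuming a TensorRT 8.x-era installation; the model.onnx and model.plan paths are placeholders.

```python
# Hypothetical engine-build sketch with the TensorRT Python API (TensorRT 8.x assumed).
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:                  # placeholder ONNX model
    if not parser.parse(f.read()):
        raise RuntimeError("failed to parse ONNX model")

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)                # optional reduced precision
engine_bytes = builder.build_serialized_network(network, config)

with open("model.plan", "wb") as f:                  # serialized engine for deployment
    f.write(engine_bytes)
```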
  • 12
    CuPy

    A NumPy-compatible array library accelerated by CUDA

    CuPy is an open-source implementation of a NumPy-compatible multi-dimensional array accelerated with NVIDIA CUDA. It consists of cupy.ndarray, a core multi-dimensional array class, and many functions that operate on it. CuPy offers GPU-accelerated computing with Python, using CUDA-related libraries to fully utilize the GPU architecture. According to benchmarks, it can speed up some operations by more than 100x. CuPy is highly compatible with NumPy, serving as a drop-in replacement in most cases...
    Downloads: 1 This Week
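
    A minimal sketch of the drop-in NumPy replacement described above, assuming CuPy is installed against a local CUDA toolkit.

```python
# Minimal CuPy sketch: NumPy-style array math executed on the GPU.
import cupy as cp

x = cp.arange(1_000_000, dtype=cp.float32)   # array allocated on the GPU
y = cp.sqrt(x) + x.mean()                    # elementwise op + reduction via CUDA kernels
host = cp.asnumpy(y)                         # copy the result back to a NumPy array
print(host[:5])
```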
  • 13
    NVIDIA Container Toolkit

    Build and run Docker containers leveraging NVIDIA GPUs

    The NVIDIA Container Toolkit allows users to build and run GPU-accelerated Docker containers. The toolkit includes a container runtime library and utilities to automatically configure containers to leverage NVIDIA GPUs. Make sure you have installed the NVIDIA driver and Docker engine for your Linux distribution. Note that you do not need to install the CUDA Toolkit on the host system, but the NVIDIA driver does need to be installed. The NVIDIA Container Toolkit supports different container engines...
    Downloads: 5 This Week
  • 14
    InvokeAI

    InvokeAI is a leading creative engine for Stable Diffusion models

    .... InvokeAI offers an industry-leading web interface and an interactive command-line interface, and also serves as the foundation for multiple commercial products. This fork is supported across Linux, Windows, and macOS. Linux users can use either an NVIDIA-based card (with CUDA support) or an AMD card (using the ROCm driver). We do not recommend the GTX 1650 or 1660 series video cards; they are unable to run in half-precision mode and do not have sufficient VRAM to render 512x512 images.
    Downloads: 8 This Week
  • 15
    Real-ESRGAN Video Enhance

    Real-ESRGAN video upscaler with resumability

    ... is portable and includes all the binaries and models required. No CUDA or PyTorch environment is needed.
    Downloads: 6 This Week
  • 16
    VLLM

    A high-throughput and memory-efficient inference and serving engine

    vLLM is a fast and easy-to-use library for LLM inference and serving. High-throughput serving with various decoding algorithms, including parallel sampling, beam search, and more.
    Downloads: 2 This Week
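
    A hedged sketch of vLLM's offline inference interface, assuming the vllm package and a GPU with enough memory for the chosen model; the model name here is just an example.

```python
# Hypothetical offline-generation sketch with vLLM (model name is an example).
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")                     # any Hugging Face model id
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)
outputs = llm.generate(["The fastest way to serve an LLM is"], params)
print(outputs[0].outputs[0].text)
```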
  • 17
    Torch-TensorRT

    PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT

    Torch-TensorRT is a compiler for PyTorch/TorchScript, targeting NVIDIA GPUs via NVIDIA’s TensorRT Deep Learning Optimizer and Runtime. Unlike PyTorch’s Just-In-Time (JIT) compiler, Torch-TensorRT is an Ahead-of-Time (AOT) compiler, meaning that before you deploy your TorchScript code, you go through an explicit compile step to convert a standard TorchScript program into a module targeting a TensorRT engine. Torch-TensorRT operates as a PyTorch extension and compiles modules that integrate...
    Downloads: 3 This Week
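
    A hedged sketch of the ahead-of-time compile step described above, assuming torch, torchvision, and torch_tensorrt are installed on a CUDA machine; argument names follow recent releases and may differ between versions.

```python
# Hypothetical AOT compilation sketch with torch_tensorrt (recent API assumed).
import torch
import torch_tensorrt
import torchvision.models as models

model = models.resnet18(weights=None).eval().cuda()      # example PyTorch model

trt_model = torch_tensorrt.compile(
    model,
    inputs=[torch_tensorrt.Input((1, 3, 224, 224), dtype=torch.float32)],
    enabled_precisions={torch.float32},                  # add torch.half for FP16
)

x = torch.rand(1, 3, 224, 224, device="cuda")
print(trt_model(x).shape)                                # runs through a TensorRT engine
```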
  • 18
    Implicit

    Fast Python collaborative filtering for implicit feedback datasets

    This project provides fast Python implementations of several popular recommendation algorithms for implicit feedback datasets. All models have multi-threaded training routines, using Cython and OpenMP to fit the models in parallel across all available CPU cores. In addition, the ALS and BPR models both have custom CUDA kernels, enabling fitting on compatible GPUs. This library also supports using approximate nearest-neighbour libraries such as Annoy, NMSLIB, and Faiss for speeding up...
    Downloads: 2 This Week
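
    A hedged sketch of fitting an ALS model with implicit, assuming a recent release that expects a user-item CSR matrix (older versions used an item-user layout); passing use_gpu=True would enable the CUDA kernels mentioned above on a supported build.

```python
# Hypothetical ALS recommendation sketch with the implicit library (recent API assumed).
import numpy as np
import scipy.sparse as sparse
import implicit

# Toy user-item interaction matrix: rows are users, columns are items.
user_items = sparse.random(100, 50, density=0.05, format="csr", dtype=np.float32)

model = implicit.als.AlternatingLeastSquares(factors=32, iterations=10, use_gpu=False)
model.fit(user_items)

ids, scores = model.recommend(0, user_items[0], N=5)     # top-5 items for user 0
print(ids, scores)
```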
  • 19
    NVIDIA GPU Operator

    NVIDIA GPU Operator creates/configures/manages GPUs atop Kubernetes

    ... software components needed to provision GPUs. These components include the NVIDIA drivers (to enable CUDA), the Kubernetes device plugin for GPUs, the NVIDIA Container Runtime, automatic node labeling, DCGM-based monitoring, and others.
    Downloads: 2 This Week
  • 20
    emgucv

    Cross platform .Net wrapper to the OpenCV image processing library

    Emgu CV is a cross-platform .Net wrapper to the OpenCV image processing library, allowing OpenCV functions to be called from .NET-compatible languages. The wrapper can be compiled by Visual Studio and Unity, and it can run on Windows, Linux, macOS, iOS, and Android.
    Downloads: 1 This Week
  • 21
    Bullet Physics SDK

    Real-time collision detection and multi-physics simulation for VR

    ... using Deep Reinforcement Learning, or using gradient-based optimization (for example, L-BFGS). In addition, the simulator can be run entirely on CUDA for fast rollouts, in combination with Augmented Random Search. This allows for 1 million simulation steps per second. It is highly recommended to use the PyBullet Python bindings for improved support for robotics, reinforcement learning, and VR. Use pip install pybullet and check out the PyBullet Quickstart Guide.
    Downloads: 2 This Week
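
    A minimal PyBullet sketch following the pip install pybullet route mentioned above; the URDF assets used here ship with the pybullet_data package.

```python
# Minimal PyBullet sketch: headless simulation of a body dropping onto a plane.
import pybullet as p
import pybullet_data

p.connect(p.DIRECT)                                      # use p.GUI for a visual window
p.setAdditionalSearchPath(pybullet_data.getDataPath())   # bundled URDF assets
p.setGravity(0, 0, -9.8)
p.loadURDF("plane.urdf")
robot = p.loadURDF("r2d2.urdf", basePosition=[0, 0, 1])

for _ in range(240):                                     # one second at the default 240 Hz
    p.stepSimulation()

print(p.getBasePositionAndOrientation(robot))
p.disconnect()
```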
  • 22
    Jittor

    Jittor is a high-performance deep learning framework

    ... learning, etc. The front-end language is Python. Module design and dynamic graph execution are used in the front-end, which is the most popular interface design for deep learning frameworks. The back-end is implemented in high-performance languages such as CUDA and C++. Jittor's ops are similar to NumPy's. Let's try some operations: we create Var a and b via the operation jt.float32 and add them. Printing those variables shows they have the same shape and dtype.
    Downloads: 1 This Week
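
    A minimal sketch of the Var example described above, assuming jittor is installed (it can fall back to CPU when no CUDA device is available).

```python
# Minimal Jittor sketch: create Vars with the jt.float32 op and add them.
import jittor as jt

a = jt.float32([1.0, 2.0, 3.0])
b = jt.float32([4.0, 5.0, 6.0])
c = a + b                         # lazily built op, executed when the value is needed
print(c)                          # prints the computed values
print(c.shape, c.dtype)           # same shape and dtype as a and b
```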
  • 23
    CUTLASS

    CUDA Templates for Linear Algebra Subroutines

    CUTLASS is a collection of CUDA C++ template abstractions for implementing high-performance matrix-multiplication (GEMM) and related computations at all levels and scales within CUDA. It incorporates strategies for hierarchical decomposition and data movement similar to those used to implement cuBLAS and cuDNN. CUTLASS decomposes these "moving parts" into reusable, modular software components abstracted by C++ template classes. These thread-wide, warp-wide, block-wide, and device-wide...
    Downloads: 0 This Week
  • 24
    TensorRT Backend For ONNX

    ONNX-TensorRT: TensorRT backend for ONNX

    ... the Docker containers as instructed in the main TensorRT repository. Note that this project has a dependency on CUDA. By default the build will look in /usr/local/cuda for the CUDA toolkit installation; if your CUDA path is different, override the default path. ONNX models can be converted to serialized TensorRT engines using the onnx2trt executable.
    Downloads: 0 This Week
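
    A hedged sketch of the project's Python backend interface, assuming the onnx and onnx_tensorrt packages are installed; the onnx2trt executable mentioned above is the command-line alternative, and the model path and input shape below are placeholders.

```python
# Hypothetical sketch of running an ONNX model through the TensorRT backend (assumed API).
import numpy as np
import onnx
import onnx_tensorrt.backend as backend

model = onnx.load("model.onnx")                          # placeholder model path
engine = backend.prepare(model, device="CUDA:0")         # builds a TensorRT engine

x = np.random.rand(1, 3, 224, 224).astype(np.float32)    # placeholder input shape
output = engine.run(x)[0]
print(output.shape)
```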
  • 25
    Thrust

    The C++ parallel algorithms library

    Thrust is the C++ parallel algorithms library which inspired the introduction of parallel algorithms to the C++ Standard Library. Thrust's high-level interface greatly enhances programmer productivity while enabling performance portability between GPUs and multicore CPUs. It builds on top of established parallel programming frameworks (such as CUDA, TBB, and OpenMP). It also provides a number of general-purpose facilities similar to those found in the C++ Standard Library. The NVIDIA C...
    Downloads: 0 This Week