cushaw2-gpu free download

Showing 211 open source projects for "cushaw2-gpu"

1

NVIDIA GPU Operator

NVIDIA GPU Operator creates/configures/manages GPUs atop Kubernetes

...However, configuring and managing nodes with these hardware resources requires setting up multiple software components, such as drivers, container runtimes, and other libraries, a process that is difficult and error-prone. The NVIDIA GPU Operator uses the operator framework within Kubernetes to automate the management of all NVIDIA software components needed to provision GPUs. These components include the NVIDIA drivers (to enable CUDA), Kubernetes device plugin for GPUs, the NVIDIA Container Runtime, automatic node labeling, DCGM-based monitoring, and others.

Downloads: 2 This Week

Last Update: 2026-03-19
See Project
2

nviwatch

A blazingly fast Rust-based TUI for managing and monitoring NVIDIA GPUs

NviWatch is an interactive terminal user interface (TUI) application for monitoring NVIDIA GPU devices and processes. Built with Rust, it provides real-time insights into GPU performance metrics, including temperature, utilization, memory usage, and power consumption.

Downloads: 0 This Week

Last Update: 2025-08-21
See Project
3

ComputeSharp

.NET library to run C# code in parallel on the GPU through DX12

ComputeSharp is a .NET library to run C# code in parallel on the GPU through DX12 and dynamically generated HLSL compute shaders. The available APIs let you access GPU devices, allocate GPU buffers and textures, move data between them and the RAM, write compute shaders entirely in C# and have them run on the GPU. The goal of this project is to make GPU computing easy to use for all .NET developers!

Downloads: 1 This Week

Last Update: 2025-04-13
See Project
4

Numba CUDA Target

The CUDA target for Numba

Numba CUDA Target is NVIDIA’s maintained CUDA backend for the Numba JIT compiler, enabling developers to write GPU-accelerated code directly in Python. It allows users to define CUDA kernels using Python syntax, which are then compiled into efficient GPU code at runtime using LLVM-based toolchains. This approach significantly lowers the barrier to entry for GPU programming by eliminating the need to write CUDA C++ while still delivering high performance.

Downloads: 0 This Week

Last Update: 2026-03-18
See Project
5

Numbast

Build an automated pipeline that converts CUDA APIs into Numba

Numbast is an automated toolchain that bridges CUDA C++ and Python by generating Numba-compatible bindings directly from CUDA header files. Its primary goal is to eliminate the manual effort required to expose CUDA libraries to Python, enabling developers to use GPU-accelerated functionality in Python environments more easily. The system parses CUDA C++ declarations and converts them into Python bindings that can be used within Numba, allowing seamless integration with Python-based GPU workflows. This approach significantly improves developer productivity by reducing boilerplate code and ensuring consistency between C++ and Python interfaces. ...

Downloads: 0 This Week

Last Update: 2026-03-18
See Project
6

CuPy

A NumPy-compatible array library accelerated by CUDA

CuPy is an open source implementation of NumPy-compatible multi-dimensional array accelerated with NVIDIA CUDA. It consists of cupy.ndarray, a core multi-dimensional array class and many functions on it. CuPy offers GPU accelerated computing with Python, using CUDA-related libraries to fully utilize the GPU architecture. According to benchmarks, it can even speed up some operations by more than 100X. CuPy is highly compatible with NumPy, serving as a drop-in replacement in most cases. CuPy is very easy to install through pip or through precompiled binary packages called wheels for recommended environments. ...

Downloads: 3 This Week

Last Update: 2026-02-20
See Project
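
The "drop-in replacement" claim in the CuPy entry above can be illustrated with a minimal sketch. It assumes that code written against the shared NumPy-style interface can take either module; the helper names `load_array_module`, `normalize_rows`, and `to_host` are hypothetical and chosen for illustration only. The sketch falls back to NumPy when CuPy is not installed or no CUDA device is usable.

```python
import numpy as np


def load_array_module():
    """Return CuPy if it imports and a CUDA device is usable, otherwise NumPy."""
    try:
        import cupy as cp
        cp.zeros(1).sum()  # forces a device allocation; raises without a usable GPU
        return cp
    except Exception:
        return np


xp = load_array_module()


def normalize_rows(a):
    """Scale each row to unit Euclidean length using only the shared array API."""
    a = xp.asarray(a, dtype=xp.float64)
    norms = xp.linalg.norm(a, axis=1, keepdims=True)
    return a / norms


def to_host(a):
    """Copy a result back to a NumPy array (cupy.ndarray exposes .get())."""
    return a.get() if hasattr(a, "get") else np.asarray(a)


result = to_host(normalize_rows([[3.0, 4.0], [0.0, 2.0]]))
print(result)
```

The same `normalize_rows` body runs on either backend; only the module bound to `xp` changes.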
7

waifu2x ncnn Vulkan

waifu2x converter, ncnn version; runs fast on the GPU with Vulkan

ncnn implementation of waifu2x converter. Runs fast on Intel/AMD/Nvidia/Apple-Silicon with Vulkan API. waifu2x-ncnn-vulkan uses ncnn project as the universal neural network inference framework.

Downloads: 13 This Week

Last Update: 2025-09-15
See Project
8

Triton

Development repository for the Triton language and compiler

...The project leverages LLVM and MLIR to compile code into efficient GPU instructions, supporting both NVIDIA and AMD hardware. It is widely used in research and production environments where custom tensor operations are required, offering both high performance and developer-friendly syntax.

Downloads: 3 This Week

Last Update: 2026-03-20
See Project
9

Khronos KTX

KTX (Khronos Texture) Library and Tools

KTX-Software is a suite of tools and libraries for working with Khronos Texture (KTX) files, designed and maintained by the Khronos Group. KTX is a container format for storing textures that are optimized for GPU upload, supporting modern formats like Basis Universal and ASTC. This repository includes tools for creating, validating, inspecting, and converting KTX and KTX2 files, making it essential for developers working in 3D engines, games, and visualization tools where texture streaming and compression are key.

Downloads: 31 This Week

Last Update: 2025-10-04
See Project
10

CUDA Python

Performance meets Productivity

CUDA Python is a unified Python interface for accessing and working with the NVIDIA CUDA platform, enabling developers to build GPU-accelerated applications entirely in Python. It acts as a metapackage composed of multiple submodules that provide both high-level and low-level access to CUDA functionality, including runtime APIs, driver APIs, and JIT compilation tools. The project is designed to simplify GPU programming by offering Pythonic abstractions while still exposing the full power of CUDA for advanced users. ...

Downloads: 2 This Week

Last Update: 2026-03-24
See Project
11

CubeCL

Multi-platform high-performance compute language extension for Rust

CubeCL is a low-level compute language and compiler framework designed to simplify and optimize GPU programming for high-performance workloads, particularly in machine learning and numerical computing. It provides an abstraction layer that allows developers to write portable, hardware-efficient compute kernels without directly dealing with complex GPU APIs such as CUDA or OpenCL. CubeCL focuses on delivering predictable performance and composability by exposing explicit control over memory layouts, parallelism, and execution patterns while still maintaining a developer-friendly syntax. ...

Downloads: 0 This Week

Last Update: 2026-03-18
See Project
12

CUDA Core Compute Libraries (CCCL)

CUDA Core Compute Libraries

...By unifying these components, CCCL reduces duplication and improves developer productivity while maintaining performance across different GPU architectures.

Downloads: 0 This Week

Last Update: 2026-03-18
See Project
13

Lapce

Lightning-fast and Powerful Code Editor written in Rust

Lapce is a GUI-based, next‑generation code editor written in Rust, using native GPU-accelerated rendering (via Floem and wgpu). It aims to deliver VS Code–level productivity with minimal latency, built-in LSP support, modal editing, remote development capabilities, and WASI‑based plugin extensibility.

Downloads: 6 This Week

Last Update: 2026-01-21
See Project
14

Starling Framework

2D GPU-accelerated framework for ActionScript developers

Starling is an open-source 2D framework for ActionScript developers that leverages GPU acceleration via Adobe's Stage3D API to create smooth, high-performance games and applications across desktop and mobile platforms. It mimics the traditional Flash display list while dramatically improving performance, making it a popular choice for Flash developers transitioning into more efficient, hardware-accelerated environments.

Downloads: 0 This Week

Last Update: 2026-01-02
See Project
15

SwissGL

SwissGL is a minimalistic wrapper on top of WebGL2 JS API

SwissGL is a compact JavaScript library that provides a streamlined abstraction layer over the WebGL2 API, designed to minimize boilerplate when building GPU-accelerated graphics, simulations, and procedural visualizations. Acting as a "Swiss Army knife" for WebGL2, it simplifies shader, texture, and framebuffer management into a single, expressive interface that enables developers to write complex GPU workflows in just a few lines of code. The library centers around one main function that unifies rendering and compute operations, allowing the creation of particle systems, GPGPU effects, and real-time simulations entirely on the GPU. ...

Downloads: 0 This Week

Last Update: 5 days ago
See Project
16

Video-subtitle-extractor

A GUI tool for extracting hard-coded subtitle (hardsub) from videos

...Uses local OCR recognition: there is no need to set up or call any API, or to access online OCR services such as Baidu and Ali, because text recognition completes locally. Supports GPU acceleration, which yields higher accuracy and faster extraction. (CLI version) Users do not need to set the subtitle area manually; the project detects it automatically through the text detection model. Filters out text in non-subtitle areas and removes watermark (station logo) text.

1 Review

Downloads: 59 This Week

Last Update: 2025-06-23
See Project
17

DeepEP

DeepEP: an efficient expert-parallel communication library

DeepEP is a communication library designed specifically to support Mixture-of-Experts (MoE) and expert parallelism (EP) deployments. Its core role is to implement high-throughput, low-latency all-to-all GPU communication kernels, which handle the dispatching of tokens to different experts (or shards) and then combining expert outputs back into the main data flow. Because MoE architectures require routing inputs to different experts, communication overhead can become a bottleneck — DeepEP addresses that by providing optimized GPU kernels and efficient dispatch/combining logic. ...

Downloads: 2 This Week

Last Update: 2025-10-03
See Project
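
The dispatch and combine steps described in the DeepEP entry above can be sketched on the CPU. This is not DeepEP's API and involves no GPU communication; it is a single-process illustration that assumes top-1 routing (each token goes to exactly one expert), and the names `dispatch`, `combine`, and the toy `experts` are hypothetical.

```python
from collections import defaultdict


def dispatch(tokens, expert_ids):
    """Group tokens by destination expert, remembering each token's original position."""
    buckets = defaultdict(list)
    for position, (token, expert) in enumerate(zip(tokens, expert_ids)):
        buckets[expert].append((position, token))
    return buckets


def combine(buckets, experts, num_tokens):
    """Run each expert on its bucket and scatter the outputs back to the original order."""
    output = [None] * num_tokens
    for expert, items in buckets.items():
        for position, token in items:
            output[position] = experts[expert](token)
    return output


# Toy experts standing in for expert networks (assumption for illustration).
experts = {0: lambda x: x * 2, 1: lambda x: x + 100}
tokens = [1, 2, 3, 4]
expert_ids = [0, 1, 1, 0]  # routing decision per token

buckets = dispatch(tokens, expert_ids)
combined = combine(buckets, experts, len(tokens))
print(combined)  # [2, 102, 103, 8]
```

In a real expert-parallel deployment the buckets live on different devices, so both steps become all-to-all exchanges; that communication is the part the entry says DeepEP optimizes.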
18

DGL

Python package built to ease deep learning on graphs

...We also want to make the combination of graph-based modules and tensor-based modules (PyTorch or MXNet) as smooth as possible. DGL provides a powerful graph object that can reside on either CPU or GPU. It bundles structural data as well as features for better control. We provide a variety of functions for computing with graph objects, including efficient and customizable message-passing primitives for Graph Neural Networks.

Downloads: 2 This Week

Last Update: 2024-08-29
See Project
19

PortableGL

An implementation of OpenGL 3.x-ish in clean C

PortableGL is a single-header, software-only implementation of a subset of OpenGL (specifically the GL 2.1 pipeline), designed to run entirely on the CPU. This lightweight graphics library allows OpenGL-style rendering without GPU acceleration, making it ideal for educational use, debugging, embedded systems, and retro-style software rendering. Because it mirrors OpenGL syntax and design, it can act as a drop-in CPU renderer for testing or deploying 3D graphics on platforms without GPU support.

Downloads: 4 This Week

Last Update: 2026-03-05
See Project
20

NVIDIA AI Cluster Runtime (AICR)

Tooling for optimized and reproducible GPU-accelerated AI runtime

...Based on its positioning within NVIDIA’s repositories, it is designed to support scalable AI runtime environments, potentially addressing challenges related to orchestration, resource management, or reproducible AI execution. The project likely aligns with NVIDIA’s broader strategy of building modular infrastructure layers that integrate with GPU-accelerated workloads and cloud-native systems. It appears to emphasize automation, consistency, and performance optimization across AI pipelines, potentially targeting enterprise and research use cases. Given NVIDIA’s ecosystem, it may also integrate with containerized environments, Kubernetes, or other orchestration frameworks.

Downloads: 0 This Week

Last Update: 2026-03-21
See Project
21

noUiSlider

JavaScript range slider with multi-touch and keyboard support

...It can be used for free and without any attribution, in any personal or commercial project. Extensive documentation, including examples, options and configuration details, is available on the website. GPU animated with no reflows, so it is fast even on older devices. All modern browsers and IE > 9 are supported.

Downloads: 0 This Week

Last Update: 2024-06-21
See Project
22

DeepSpeed

Deep learning optimization library making distributed training easy

...DeepSpeed delivers extreme-scale model training for everyone, from data scientists training on massive supercomputers to those training on low-end clusters or even on a single GPU. Using the current generation of GPU clusters with hundreds of devices, the 3D parallelism of DeepSpeed can efficiently train deep learning models with trillions of parameters. With just a single GPU, ZeRO-Offload of DeepSpeed can train models with over 10B parameters, 10x bigger than the state of the art, democratizing multi-billion-parameter model training so that many deep learning scientists can explore bigger and better models. ...

Downloads: 0 This Week

Last Update: 4 days ago
See Project
23

MuJoCo Playground

An open source library for GPU-accelerated robot learning

MuJoCo Playground, developed by Google DeepMind, is a GPU-accelerated suite of simulation environments for robot learning and sim-to-real research, built on top of MuJoCo MJX. It unifies a range of control, locomotion, and manipulation tasks into a consistent and scalable framework optimized for JAX and Warp backends. The project includes classic control benchmarks from dm_control, advanced quadruped and bipedal locomotion systems, and dexterous as well as non-prehensile manipulation setups. ...

Downloads: 0 This Week

Last Update: 2026-03-17
See Project
24

DALI

A GPU-accelerated library containing highly optimized building blocks

...These data processing pipelines, which are currently executed on the CPU, have become a bottleneck, limiting the performance and scalability of training and inference. DALI addresses the problem of the CPU bottleneck by offloading data preprocessing to the GPU. Additionally, DALI relies on its own execution engine, built to maximize the throughput of the input pipeline.

Downloads: 0 This Week

Last Update: 2026-02-19
See Project
25

Zed

High-performance, multiplayer code editor from the creators of Atom

Zed is a next-generation code editor designed for high-performance collaboration with humans and AI. Written from scratch in Rust to efficiently leverage multiple CPU cores and your GPU. Integrate upcoming LLMs into your workflow to generate, transform, and analyze code. Chat with teammates, write notes together, and share your screen and project. Multibuffers compose excerpts from across the codebase in one editable surface. Evaluate code inline via Jupyter runtimes and collaboratively edit notebooks. Support for many languages via Tree-sitter, WebAssembly, and the Language Server Protocol. ...

Downloads: 18 This Week

Last Update: 18 hours ago
See Project