gpu hardware free download

Showing 159 open source projects for "gpu hardware"

1

GPU Puzzles

Solve puzzles. Learn CUDA

GPU Puzzles is an educational project designed to teach GPU programming concepts through interactive coding exercises and puzzles. Instead of presenting traditional lecture-style explanations, the project immerses learners directly in hands-on programming tasks that demonstrate how GPU computation works. The exercises are implemented using Python with the Numba CUDA interface, which allows Python code to compile into GPU kernels that run on CUDA-enabled hardware.

Downloads: 0 This Week

Last Update: 2026-03-10
See Project
2

GPU Hot

Real-time NVIDIA GPU dashboard

GPU Hot is an open-source, lightweight monitoring dashboard designed to provide real-time visibility into NVIDIA GPU performance across single machines or entire clusters. The project offers a self-hosted web interface that streams hardware metrics directly from GPU servers, enabling developers, ML engineers, and system administrators to observe GPU utilization and system behavior in real time through a browser.

Downloads: 4 This Week

Last Update: 5 days ago
See Project
3

NVIDIA GPU Operator

NVIDIA GPU Operator creates/configures/manages GPUs atop Kubernetes

Kubernetes provides access to special hardware resources such as NVIDIA GPUs, NICs, Infiniband adapters and other devices through the device plugin framework. However, configuring and managing nodes with these hardware resources requires setting up multiple software components such as drivers, container runtimes, and other libraries, a process that is difficult and error-prone.

Downloads: 3 This Week

Last Update: 2026-03-19
See Project
4

ChefKiss Inferno

Emulating Apple Silicon devices

Inferno by ChefKissInc is a low-level systems project focused on enabling hardware acceleration and advanced graphics compatibility on Apple Silicon devices, particularly within unsupported or experimental environments. It is designed to bridge gaps between macOS hardware capabilities and software ecosystems that traditionally rely on different GPU architectures, such as those found in Linux or Windows environments.

Downloads: 2 This Week

Last Update: 2 days ago
See Project
5

CubeCL

Multi-platform high-performance compute language extension for Rust

CubeCL is a low-level compute language and compiler framework designed to simplify and optimize GPU programming for high-performance workloads, particularly in machine learning and numerical computing. It provides an abstraction layer that allows developers to write portable, hardware-efficient compute kernels without directly dealing with complex GPU APIs such as CUDA or OpenCL. CubeCL focuses on delivering predictable performance and composability by exposing explicit control over memory layouts, parallelism, and execution patterns while still maintaining a developer-friendly syntax. ...

Downloads: 8 This Week

Last Update: 2026-03-18
See Project
6

llmfit

157 models, 30 providers, one command to find what runs on hardware

llmfit is a terminal-based utility that helps developers determine which large language models can realistically run on their local hardware by analyzing system resources and model requirements. The tool automatically detects CPU, RAM, GPU, and VRAM specifications, then ranks available models based on performance factors such as speed, quality, and memory fit. It provides both an interactive terminal user interface and a traditional CLI mode, enabling flexible workflows for different user preferences. llmfit also supports advanced configurations including multi-GPU setups, mixture-of-experts architectures, and dynamic quantization recommendations. ...

Downloads: 53 This Week

Last Update: 2 days ago
See Project
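
As an illustration of the kind of "memory fit" check the llmfit description mentions, here is a minimal sketch. It is not llmfit's actual method: the `Model` class, the `weight_memory_gb` and `models_that_fit` helpers, and the 1.2 overhead factor are all hypothetical, and the estimate covers weights only (no KV cache or activations).

```python
from dataclasses import dataclass


@dataclass
class Model:
    name: str
    params_billion: float  # parameter count, in billions
    bits_per_weight: int   # e.g. 16 for fp16, 4 for 4-bit quantization


def weight_memory_gb(model: Model) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return model.params_billion * model.bits_per_weight / 8


def models_that_fit(models, vram_gb, overhead=1.2):
    """Return models whose padded weight size fits in vram_gb, smallest first.

    The overhead factor is an assumed safety margin, not a measured value.
    """
    fits = [m for m in models if weight_memory_gb(m) * overhead <= vram_gb]
    return sorted(fits, key=weight_memory_gb)


candidates = [
    Model("7b-fp16", 7, 16),
    Model("7b-q4", 7, 4),
    Model("70b-q4", 70, 4),
]
fit_names = [m.name for m in models_that_fit(candidates, vram_gb=24)]
```

With 24 GB of VRAM, only the two 7B variants pass this check; a real tool would also need to account for context length and runtime buffers.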
7

CUDA-Q

C++ and Python support for the CUDA Quantum programming model

...It provides a full toolchain that includes compilers, runtimes, and libraries for writing quantum programs in both C++ and Python. The platform is designed to be hardware-agnostic, allowing developers to run applications on different quantum backends or simulate them efficiently using GPU acceleration when physical quantum hardware is unavailable. It enables complex workflows where classical and quantum computations are tightly integrated, supporting advanced research and real-world applications in quantum computing. ...

Downloads: 10 This Week

Last Update: 2026-03-18
See Project
8

SwiftShader

SwiftShader is a high-performance CPU-based implementation

SwiftShader is Google’s high-performance CPU-based implementation of the Vulkan 1.3 graphics API, designed to provide a hardware-independent rendering solution for 3D graphics. Unlike traditional GPU drivers, SwiftShader executes graphics commands entirely on the CPU, making it ideal for environments where dedicated graphics hardware is unavailable or unsuitable. It acts as a drop-in replacement for Vulkan drivers, allowing existing applications to run seamlessly by redirecting API calls through its software-based rendering engine. ...

Downloads: 165 This Week

Last Update: 4 days ago
See Project
9

GPUStack

Performance-optimized AI inference on your GPUs

GPUStack is an open-source GPU cluster management platform designed to simplify the deployment and operation of artificial intelligence models across heterogeneous hardware environments. The system aggregates GPU resources from multiple machines into a unified cluster so developers and administrators can run large language models and other AI workloads efficiently across distributed infrastructure.

Downloads: 11 This Week

Last Update: 2026-03-26
See Project
10

Stats

macOS system monitor in your menu bar

Stats is currently supported on macOS 10.13 (High Sierra) and higher. It is an application that lets you monitor your macOS system: CPU utilization, GPU utilization, memory usage, disk utilization, sensor information (temperature, voltage, power), battery level, network usage, fan speed, fan control, and Bluetooth devices. It supports many languages, such as English, Polski, Українська, Русский, and many more. You can help by adding a new language or improving an existing translation.

Downloads: 9 This Week

Last Update: 4 days ago
See Project
11

LibreHardwareMonitor

Monitor temperature sensors, fan speed, voltage, load & clock speeds

...LibreHardwareMonitor supports modern Intel and AMD CPUs, major GPU vendors, storage devices, and network adapters. Built on modern .NET versions, it continues to evolve with frequent updates and broad community contributions. Licensed under MPL 2.0, it offers a transparent and extensible alternative to proprietary hardware monitoring tools.

Downloads: 177 This Week

Last Update: 2026-02-14
See Project
12

FlexLLMGen

Running large language models on a single GPU

FlexLLMGen is an open-source inference engine designed to run large language models efficiently on limited hardware resources such as a single GPU. The system focuses on high-throughput generation workloads where large batches of text must be processed quickly, such as large-scale data extraction or document analysis tasks. Instead of requiring expensive multi-GPU systems, the framework uses techniques such as memory offloading, compression, and optimized batching to run large models on commodity hardware.

Downloads: 0 This Week

Last Update: 2026-03-10
See Project
13

XenosRecomp

A tool for converting Xbox 360 shaders to HLSL

XenosRecomp is a specialized project within the Hedge-dev ecosystem that focuses on recompiling and reconstructing the Xenos GPU pipeline used in the Xbox 360, enabling accurate rendering when porting games to modern platforms. It works alongside CPU recompilation tools by translating GPU-specific instructions and behaviors into equivalents that can be executed on modern graphics APIs such as DirectX or Vulkan. This allows recompiled games to maintain visual fidelity while benefiting from modern hardware acceleration. ...

Downloads: 0 This Week

Last Update: 2026-03-18
See Project
14

AirLLM

AirLLM 70B inference with single 4GB GPU

AirLLM is an open source Python library that enables extremely large language models to run on consumer hardware with very limited GPU memory. The project addresses one of the main barriers to local LLM experimentation by introducing a memory-efficient inference technique that loads model layers sequentially rather than storing the entire model in GPU memory. This layer-wise inference approach allows models with tens of billions of parameters to run on devices with only a few gigabytes of VRAM. ...

Downloads: 1 This Week

Last Update: 2026-03-10
See Project
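
The layer-wise idea in the AirLLM description (load one layer, apply it, release it, move to the next) can be sketched with a toy model. This is not AirLLM's API: the JSON files, the `save_layers` and `run_layerwise` helpers, and the scale-and-bias "layers" are stand-ins chosen so the example runs with the standard library alone.

```python
import json
import os
import tempfile


def save_layers(layers, directory):
    """Write each toy layer (a scale and a bias) to its own file."""
    paths = []
    for i, layer in enumerate(layers):
        path = os.path.join(directory, f"layer_{i}.json")
        with open(path, "w") as f:
            json.dump(layer, f)
        paths.append(path)
    return paths


def run_layerwise(paths, activations):
    """Apply layers in order, holding only one layer's weights at a time."""
    for path in paths:
        with open(path) as f:
            layer = json.load(f)  # load just this layer from disk
        # Toy layer: affine transform followed by ReLU.
        activations = [
            max(0.0, layer["scale"] * x + layer["bias"]) for x in activations
        ]
        del layer  # drop the weights before the next layer is loaded
    return activations


with tempfile.TemporaryDirectory() as tmp:
    toy_layers = [{"scale": 2.0, "bias": 1.0}, {"scale": 0.5, "bias": -1.0}]
    layer_paths = save_layers(toy_layers, tmp)
    result = run_layerwise(layer_paths, [1.0, -2.0, 3.0])
```

Peak memory here is bounded by the largest single layer plus the activations, at the cost of re-reading weights from disk on every pass.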
15

HeavyDB

HeavyDB (formerly MapD/OmniSciDB)

...HeavyDB was originally developed as part of the OmniSci platform (formerly MapD) and is commonly used for large-scale analytics and geospatial data processing. The database compiles queries into optimized machine code that executes efficiently on GPU hardware, significantly accelerating analytical workloads. It supports hybrid deployment environments where queries can run on both CPU and GPU architectures depending on the available resources.

Downloads: 0 This Week

Last Update: 2026-03-11
See Project
16

PowerInfer

High-speed Large Language Model Serving for Local Deployment

...PowerInfer incorporates specialized algorithms and sparse operators to manage neuron activation patterns and minimize data transfers between hardware components. As a result, it enables powerful language models to run on consumer hardware while achieving performance comparable to more expensive server-grade systems.

Downloads: 2 This Week

Last Update: 2026-03-04
See Project
17

LocalAI

The free, Open Source alternative to OpenAI, Claude and others

LocalAI is an open-source platform that allows users to run large language models and other AI systems locally on their own hardware. It acts as a drop-in replacement for APIs such as OpenAI, enabling developers to build AI-powered applications without relying on external cloud services. The platform supports a wide range of model types, including text generation, image creation, speech processing, and embeddings. LocalAI can run on consumer-grade hardware and does not necessarily require a GPU, making it accessible for local development and private deployments. ...

Downloads: 33 This Week

Last Update: 2026-04-07
See Project
18

how-to-optim-algorithm-in-cuda

How to optimize some algorithm in cuda

...Instead of presenting only theoretical explanations, the repository includes hand-written CUDA implementations of fundamental operations such as reductions, element-wise computations, softmax, and attention mechanisms. These examples show how different optimization techniques influence performance on modern GPU hardware and allow readers to experiment with real implementations. The repository also contains extensive learning notes that summarize CUDA programming concepts, GPU architecture details, and performance engineering strategies.

Downloads: 2 This Week

Last Update: 7 days ago
See Project
19

Parallax

Parallax is a distributed model serving framework

Parallax is a decentralized inference framework designed to run large language models across distributed computing resources. Instead of relying on centralized GPU clusters in data centers, the system allows multiple heterogeneous machines to collaborate in serving AI inference workloads. Parallax divides model layers across different nodes and dynamically coordinates them to form a complete inference pipeline. A two-stage scheduling architecture determines how model layers are allocated to available hardware and how requests are routed across nodes during execution. ...

Downloads: 4 This Week

Last Update: 2026-03-09
See Project
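
To make the layer-allocation step in the Parallax description concrete, the sketch below splits a model's layers into contiguous ranges sized in proportion to each node's memory. It is a simplified stand-in, not Parallax's scheduler: the `allocate_layers` function, the node names, and the memory-proportional rule are assumptions for illustration.

```python
def allocate_layers(num_layers, node_memory_gb):
    """Split layers 0..num_layers-1 into contiguous per-node ranges.

    Range sizes are proportional to each node's memory (an assumed heuristic).
    """
    total = sum(node_memory_gb.values())
    names = list(node_memory_gb)
    allocation = {}
    start = 0
    cumulative = 0.0
    for i, name in enumerate(names):
        cumulative += node_memory_gb[name]
        if i == len(names) - 1:
            # The last node takes the remainder so every layer is assigned once.
            end = num_layers
        else:
            end = round(num_layers * cumulative / total)
        allocation[name] = range(start, end)
        start = end
    return allocation


plan = allocate_layers(32, {"laptop": 8, "workstation": 16, "desktop": 8})
```

Requests would then flow through the nodes in range order to form the full pipeline; routing and dynamic rebalancing are outside the scope of this sketch.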
20

GPT4All

Run Local LLMs on Any Device. Open-source

GPT4All is an open-source project that allows users to run large language models (LLMs) locally on their desktops or laptops, eliminating the need for API calls or GPUs. The software provides a simple, user-friendly application that can be downloaded and run on various platforms, including Windows, macOS, and Ubuntu, without requiring specialized hardware. It integrates with the llama.cpp implementation and supports multiple LLMs, allowing users to interact with AI models privately. This...

1 Review

Downloads: 148 This Week

Last Update: 2025-03-17
See Project
21

ImplicitGlobalGrid.jl

Distributed parallelization of stencil-based GPU and CPU applications

...Samuel Omlin) with Stanford University (Dr. Ludovic Räss) and the Swiss Geocomputing Centre (Prof. Yuri Podladchikov). It renders the distributed parallelization of stencil-based GPU and CPU applications on a regular staggered grid almost trivial and enables close to ideal weak scaling of real-world applications on thousands of GPUs [1, 2, 3]. ImplicitGlobalGrid relies on the Julia MPI wrapper (MPI.jl) to perform halo updates close to the hardware limit and leverages CUDA-aware or ROCm-aware MPI for GPU applications. ...

Downloads: 0 This Week

Last Update: 2026-01-08
See Project
22

node-llama-cpp

Run AI models locally on your machine with node.js bindings for llama

...By using native bindings and optimized model execution, the framework allows developers to integrate advanced language model capabilities into desktop applications, server software, and command-line tools. The system automatically detects the available hardware on a machine and selects the most appropriate compute backend, including CPU or GPU acceleration. Developers can use the library to perform tasks such as text generation, conversational chat, embedding generation, and structured output generation. Because it runs models locally, the platform is particularly useful for privacy-sensitive environments or offline AI deployments.

Downloads: 18 This Week

Last Update: 2026-03-17
See Project
23

Scalene

High-performance CPU, GPU, and memory profiler for Python

Scalene is a high-performance CPU, GPU and memory profiler for Python that does a number of things that other Python profilers do not and cannot do. It runs orders of magnitude faster than other profilers while delivering far more detailed information. Once Scalene has profiled your program, it will launch a web browser with an interactive user interface (all processing is done locally). Hover over bars to see breakdowns of CPU and memory consumption, and click on underlined column headers...

Downloads: 1 This Week

Last Update: 2026-03-22
See Project
24

Starling Framework

2D GPU-accelerated framework for ActionScript developers

Starling is an open-source 2D framework for ActionScript developers that leverages GPU acceleration via Adobe's Stage3D API to create smooth, high-performance games and applications across desktop and mobile platforms. It mimics the traditional Flash display list while dramatically improving performance, making it a popular choice for Flash developers transitioning into more efficient, hardware-accelerated environments.

Downloads: 0 This Week

Last Update: 2026-01-02
See Project
25

SkyPilot

SkyPilot: Run AI and batch jobs on any infra

SkyPilot is a framework for running AI and batch workloads on any infrastructure (Kubernetes or 12+ clouds). It offers unified execution, cost savings, and high GPU availability through a simple interface.

Downloads: 0 This Week

Last Update: 2026-03-24
See Project