Showing 240 open source projects for "gpu hardware"

  • 1
    GPT4All

    Run Local LLMs on Any Device. Open-source

    GPT4All is an open-source project that allows users to run large language models (LLMs) locally on their desktops or laptops, eliminating the need for API calls or GPUs. The software provides a simple, user-friendly application that can be downloaded and run on various platforms, including Windows, macOS, and Ubuntu, without requiring specialized hardware. It integrates with the llama.cpp implementation and supports multiple LLMs, allowing users to interact with AI models privately. This...
    Downloads: 123 This Week
  • 2
    Xenia

    Xbox 360 Emulator Research Project

    Xenia is an open-source experimental emulator for the Xbox 360 that aims to let users run Xbox 360 games on Windows and other platforms by reverse-engineering the console’s hardware and firmware behavior in software. It implements the 360’s CPU (Xenon), GPU (including Direct3D shader logic), and system libraries to translate Xbox instructions into equivalent host machine operations, enabling many titles to launch and in some cases play at improved frame rates compared with the original hardware. Because Xbox 360 games use custom hardware features and proprietary APIs, Xenia developers have progressively mapped and translated these into PC-friendly code while balancing performance and accuracy, and the project includes compatibility tracking so users can see what games work and how well. ...
    Downloads: 33 This Week
  • 3
    Triton

    Development repository for the Triton language and compiler

    ...The project leverages LLVM and MLIR to compile code into efficient GPU instructions, supporting both NVIDIA and AMD hardware. It is widely used in research and production environments where custom tensor operations are required, offering both high performance and developer-friendly syntax.
    Downloads: 6 This Week
  • 4
    Scalene

    High-performance CPU, GPU, and memory profiler for Python

    Scalene is a high-performance CPU, GPU and memory profiler for Python that does a number of things that other Python profilers do not and cannot do. It runs orders of magnitude faster than other profilers while delivering far more detailed information. Once Scalene has profiled your program, it will launch a web browser with an interactive user interface (all processing is done locally). Hover over bars to see breakdowns of CPU and memory consumption, and click on underlined column headers...
    Downloads: 1 This Week
  • 5
    ImplicitGlobalGrid.jl

    Distributed parallelization of stencil-based GPU and CPU applications

    ...Samuel Omlin) with Stanford University (Dr. Ludovic Räss) and the Swiss Geocomputing Centre (Prof. Yuri Podladchikov). It renders the distributed parallelization of stencil-based GPU and CPU applications on a regular staggered grid almost trivial and enables close to ideal weak scaling of real-world applications on thousands of GPUs [1, 2, 3]. ImplicitGlobalGrid relies on the Julia MPI wrapper (MPI.jl) to perform halo updates close to hardware limit and leverages CUDA-aware or ROCm-aware MPI for GPU-applications. ...
    Downloads: 0 This Week
  • 6
    tt-metal

    TT-NN operator library, and TT-Metalium low level kernel programming

    tt-metal, also referred to in its documentation as TT-Metalium, is Tenstorrent’s low-level software development kit for programming applications on Tenstorrent AI accelerators. The project is designed for developers who need direct access to the company’s Tensix processor architecture, exposing a programming model that is closer to hardware control than high-level inference frameworks. Instead of following a traditional GPU model centered on massive thread parallelism, the platform is built around a grid of specialized compute nodes called Tensix cores, each with local SRAM, dedicated compute units, and multiple RISC-V control processors. The SDK provides the abstractions and APIs needed to manage data movement, compute kernels, memory coordination, and execution flow across this architecture.
    Downloads: 49 This Week
  • 7
    Starling Framework

    2D GPU-accelerated framework for ActionScript developers

    Starling is an open-source 2D framework for ActionScript developers that leverages GPU acceleration via Adobe's Stage3D API to create smooth, high-performance games and applications across desktop and mobile platforms. It mimics the traditional Flash display list while dramatically improving performance, making it a popular choice for Flash developers transitioning into more efficient, hardware-accelerated environments.
    Downloads: 0 This Week
  • 8
    SkyPilot

    SkyPilot: Run AI and batch jobs on any infra

    SkyPilot is a framework for running AI and batch workloads on any infrastructure (Kubernetes or 12+ clouds). It offers unified execution, cost savings, and high GPU availability through a simple interface.
    Downloads: 0 This Week
  • 9
    node-llama-cpp

    Run AI models locally on your machine with node.js bindings for llama

    ...By using native bindings and optimized model execution, the framework allows developers to integrate advanced language model capabilities into desktop applications, server software, and command-line tools. The system automatically detects the available hardware on a machine and selects the most appropriate compute backend, including CPU or GPU acceleration. Developers can use the library to perform tasks such as text generation, conversational chat, embedding generation, and structured output generation. Because it runs models locally, the platform is particularly useful for privacy-sensitive environments or offline AI deployments.
    Downloads: 4 This Week
  • 10
    autoresearch-win-rtx

    AI agents running research on single-GPU nanochat training

    ...Experiments are executed within a fixed time budget, ensuring consistent benchmarking across iterations and allowing the agent to focus on incremental improvements. The framework is designed to be lightweight and accessible, making it suitable for developers and researchers working on desktop hardware. It also supports modern GPU acceleration features through PyTorch, enabling efficient experimentation even on limited resources.
    Downloads: 0 This Week
  • 11
    Humanoid-Gym

    Reinforcement Learning for Humanoid Robot with Zero-Shot Sim2Real

    Humanoid-Gym is a reinforcement learning framework designed to train locomotion and control policies for humanoid robots using high-performance simulation environments. The system is built on top of NVIDIA Isaac Gym, which allows large-scale parallel simulation of robotic environments directly on GPU hardware. Its primary goal is to enable efficient training of humanoid robots in simulation while enabling policies to transfer effectively to real-world hardware without additional training. The framework emphasizes the concept of zero-shot sim-to-real transfer, meaning that behaviors learned in simulation can be deployed directly on physical robots with minimal adjustment. ...
    Downloads: 1 This Week
  • 12
    ChatGLM.cpp

    C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)

    ChatGLM.cpp is a C++ implementation of the ChatGLM-6B model, enabling efficient local inference without requiring a Python environment. It is optimized for running on consumer hardware.
    Downloads: 4 This Week
  • 13
    clone-voice

    A sound cloning tool with a web interface, using your voice

    Clone-voice is a local voice-cloning tool that lets you synthesize speech in any target voice or convert one recording into another voice using the same timbre. It is built around Coqui’s XTTS-v2 model, so it inherits multilingual support and modern neural TTS quality while wrapping it in a user-friendly desktop workflow. The app is designed to be very easy to use: you download a precompiled package, double-click app.exe, and it launches a browser-based web interface where you control...
    Downloads: 18 This Week
  • 14
    mpv

    Command line video player

    mpv is a free (as in freedom) media player for the command line. It supports a wide variety of media file formats, audio and video codecs, and subtitle types. Powerful scripting capabilities can make the player do almost anything. There is a large selection of user scripts on the wiki. While mpv strives for minimalism and provides no real GUI, it has a small controller on top of the video for basic control. mpv has an OpenGL, Vulkan, and D3D11 based video output that is capable of many...
    Downloads: 96 This Week
  • 15
    OptiScaler

    OptiScaler bridges upscaling/frame gen across GPUs

    ...This makes it possible to swap technologies such as NVIDIA DLSS, AMD FSR, or Intel XeSS even if the game only supports one of them by default. The tool effectively acts as a compatibility layer between the game engine and multiple upscaling frameworks, enabling cross-GPU access to features that might otherwise be restricted to specific hardware ecosystems. In addition to replacing upscalers, OptiScaler can enable frame generation features in titles that do not officially support them, improving frame rates and perceived smoothness during gameplay.
    Downloads: 138 This Week
  • 16
    Megatron-LM

    Ongoing research training transformer models at scale

    Megatron-LM is a GPU-optimized deep learning framework from NVIDIA designed to train extremely large transformer-based language models efficiently at scale. The repository provides both a reference training implementation and Megatron Core, a composable library of high-performance building blocks for custom large-model pipelines. It supports advanced parallelism strategies including tensor, pipeline, data, expert, and context parallelism, enabling training across massive multi-GPU and multi-node clusters. ...
    Downloads: 2 This Week
  • 17
    NVIDIA Isaac Sim

    NVIDIA Isaac Sim is an open-source application on NVIDIA Omniverse

    NVIDIA Isaac Sim is a high-fidelity robotics simulation platform built on NVIDIA Omniverse to develop, test, and validate AI-driven robots in physically accurate virtual environments. It supports a wide array of robotics formats (URDF, MJCF, CAD) and provides GPU-accelerated physics, RTX ray-traced rendering, and multi-sensor simulation (RGB-D cameras, lidar, radar, IMU, contact sensors)....
    Downloads: 2 This Week
  • 18
    Strato

    Run Nintendo Switch homebrew & games on your Android device

    Strato is an experimental Nintendo Switch emulator designed specifically for ARMv8 Android devices, aiming to bring Switch gaming and homebrew applications to mobile platforms. It builds upon earlier emulator efforts such as Skyline while incorporating improvements and optimizations tailored for mobile hardware constraints. The emulator focuses heavily on high-level emulation of Switch subsystems, including kernel services and GPU behavior, to achieve usable performance on smartphones. It leverages components inspired by established projects like Ryujinx and Yuzu, particularly in areas such as shader compilation and system emulation. Strato emphasizes accessibility by targeting Android devices directly, removing the need for desktop-class hardware for Switch emulation. ...
    Downloads: 46 This Week
  • 19
    LuxTTS

    A high-quality rapid TTS voice cloning model

    LuxTTS is an open-source text-to-speech (TTS) system focused on delivering high-quality, rapid voice synthesis and voice cloning that runs extremely fast and efficiently on consumer hardware. It implements a lightweight architecture based on ZipVoice and optimized sampling techniques so that it can generate speech at speeds up to roughly 150 times real-time on a single GPU and faster than real-time on CPU, all while producing audio at high fidelity with 48 kHz quality. The project supports zero-shot voice cloning, meaning it can adapt to a reference speaker’s voice with minimal example data, enabling realistic and personalized synthetic speech. ...
    Downloads: 4 This Week
  • 20
    CapFrameX

    Frametime capture and analysis tool

    ...It uses backend tools like PresentMon to log data and then exposes a comprehensive UI for analyzing the results: you can view charts of frametimes, historic graphing, stuttering analysis, L-shape graphs, input-lag overlays, and compare multiple capture runs side by side. Importantly, the tool also integrates with sensor inputs (CPU, GPU, VRAM, temps, etc.) and overlays statistics in-game via Rivatuner Statistics Server, so you get in-situ feedback while you run. For benchmarking, it supports aggregation, filtering, outlier detection, and export of records to CSV/Excel for further analysis or reporting. The project is suited for reviewers, hardware testers, and power users who want to dig deeper than simple FPS numbers and want to diagnose performance issues.
    Downloads: 4 This Week
  • 21
    Paint.NET

    Downloads for Paint.NET, such as installer EXEs and portable ZIPs

    ...The use of DXGI Flip Model ensures low input latency and reduced power consumption. Whether you have a power-conscious laptop or a monstrous desktop with a gigantic GPU, you can expect it to start up immediately, respond quickly to every mouse click, and take full advantage of all of your hardware.
    Downloads: 145 This Week
  • 22
    FanCtrl

    FanCtrl allows you to automatically control the fan speed

    FanCtrl is a Windows desktop utility focused on automatically controlling PC fan speeds using temperature sensors and customizable fan curves, so your cooling behavior matches how you actually use your system. It combines monitoring and control in one place, letting you view temperatures, fan RPM, and control percentages while you tune how aggressively each fan responds. The project supports a range of control backends and integrations, including motherboard fan headers and several popular...
    Downloads: 21 This Week
  • 23
    Insanely Fast Whisper

    An opinionated CLI to transcribe Audio files w/ Whisper on-device

    Insanely Fast Whisper is a high-performance command-line tool designed to dramatically accelerate speech-to-text transcription using OpenAI’s Whisper models on local hardware. It leverages modern optimizations such as batch processing, mixed precision, and advanced attention mechanisms like Flash Attention to significantly reduce inference time while maintaining high transcription accuracy. The project is built on top of the Transformers ecosystem and integrates with libraries such as Optimum to maximize GPU efficiency. ...
    Downloads: 1 This Week
  • 24
    PEFT

    State-of-the-art Parameter-Efficient Fine-Tuning

    Parameter-Efficient Fine-Tuning (PEFT) methods enable efficient adaptation of pre-trained language models (PLMs) to various downstream applications without fine-tuning all the model's parameters. Fine-tuning large-scale PLMs is often prohibitively costly. In this regard, PEFT methods only fine-tune a small number of (extra) model parameters, thereby greatly decreasing the computational and storage costs. Recent State-of-the-Art PEFT techniques achieve performance comparable to that of full...
    Downloads: 4 This Week
  • 25
    Apollo

    The easiest way to stream with the native resolution of your client

    Apollo is a self-hosted desktop streaming host designed to enable low-latency game streaming from a personal computer to remote clients using protocols compatible with Moonlight and Artemis. It acts as a server that captures, encodes, and streams desktop or game sessions while supporting hardware acceleration across AMD, Intel, and NVIDIA GPUs. The project includes a web-based interface that allows users to configure streaming settings, manage connected clients, and control application...
    Downloads: 18 This Week