Showing 48 open source projects for "gpu hardware"

  • 1
    SwiftShader

    SwiftShader is a high-performance CPU-based implementation of the Vulkan graphics API

    SwiftShader is Google’s high-performance CPU-based implementation of the Vulkan 1.3 graphics API, designed to provide a hardware-independent rendering solution for 3D graphics. Unlike traditional GPU drivers, SwiftShader executes graphics commands entirely on the CPU, making it ideal for environments where dedicated graphics hardware is unavailable or unsuitable. It acts as a drop-in replacement for Vulkan drivers, allowing existing applications to run seamlessly by redirecting API calls through its software-based rendering engine. ...
    Downloads: 183 This Week
  • 2
    TrafficMonitor

    Floating window used to display current network speed, CPU & memory

    ...There are two versions of TrafficMonitor, the standard version and the Lite version. The standard version includes all the functions, while the Lite version does not include hardware monitoring functions such as temperature monitoring, GPU usage, and hard disk usage. The standard version requires administrator privilege to run, while the Lite version does not.
    Downloads: 155 This Week
  • 3
    CUDA-Q

    C++ and Python support for the CUDA Quantum programming model

    ...It provides a full toolchain that includes compilers, runtimes, and libraries for writing quantum programs in both C++ and Python. The platform is designed to be hardware-agnostic, allowing developers to run applications on different quantum backends or simulate them efficiently using GPU acceleration when physical quantum hardware is unavailable. It enables complex workflows where classical and quantum computations are tightly integrated, supporting advanced research and real-world applications in quantum computing. ...
    Downloads: 2 This Week
  • 4
    XenosRecomp

    A tool for converting Xbox 360 shaders to HLSL

    XenosRecomp is a specialized project within the Hedge-dev ecosystem that focuses on recompiling and reconstructing the Xenos GPU pipeline used in the Xbox 360, enabling accurate rendering when porting games to modern platforms. It works alongside CPU recompilation tools by translating GPU-specific instructions and behaviors into equivalents that can be executed on modern graphics APIs such as DirectX or Vulkan. This allows recompiled games to maintain visual fidelity while benefiting from modern hardware acceleration. ...
    Downloads: 0 This Week
  • 5
    HeavyDB

    HeavyDB (formerly MapD/OmniSciDB)

    ...HeavyDB was originally developed as part of the OmniSci platform (formerly MapD) and is commonly used for large-scale analytics and geospatial data processing. The database compiles queries into optimized machine code that executes efficiently on GPU hardware, significantly accelerating analytical workloads. It supports hybrid deployment environments where queries can run on both CPU and GPU architectures depending on the available resources.
    Downloads: 0 This Week
  • 6

    Hardware Serial Spoofer

    An open-source tool for managing and modifying hardware IDs

    Hardware Identity Manager (HIM) is a powerful utility designed for advanced users, developers, and security researchers to manage the hardware identification values of their Windows-based systems.
    Downloads: 67 This Week
  • 7
    PowerInfer

    High-speed Large Language Model Serving for Local Deployment

    ...PowerInfer incorporates specialized algorithms and sparse operators to manage neuron activation patterns and minimize data transfers between hardware components. As a result, it enables powerful language models to run on consumer hardware while achieving performance comparable to more expensive server-grade systems.
    Downloads: 0 This Week
  • 8
    GPT4All

    Run Local LLMs on Any Device. Open-source

    GPT4All is an open-source project that allows users to run large language models (LLMs) locally on their desktops or laptops, eliminating the need for API calls or GPUs. The software provides a simple, user-friendly application that can be downloaded and run on various platforms, including Windows, macOS, and Ubuntu, without requiring specialized hardware. It integrates with the llama.cpp implementation and supports multiple LLMs, allowing users to interact with AI models privately. This...
    Downloads: 134 This Week
  • 9
    Xenia

    Xbox 360 Emulator Research Project

    Xenia is an open-source experimental emulator for the Xbox 360 that aims to let users run Xbox 360 games on Windows and other platforms by reverse-engineering the console’s hardware and firmware behavior in software. It implements the 360’s CPU (Xenon), GPU (including Direct3D shader logic), and system libraries to translate Xbox instructions into equivalent host machine operations, enabling many titles to launch and in some cases play at improved frame rates compared with the original hardware. Because Xbox 360 games use custom hardware features and proprietary APIs, Xenia developers have progressively mapped and translated these into PC-friendly code while balancing performance and accuracy, and the project includes compatibility tracking so users can see what games work and how well. ...
    Downloads: 42 This Week
  • 10
    tt-metal

    TT-NN operator library, and TT-Metalium low level kernel programming

    tt-metal, also referred to in its documentation as TT-Metalium, is Tenstorrent’s low-level software development kit for programming applications on Tenstorrent AI accelerators. The project is designed for developers who need direct access to the company’s Tensix processor architecture, exposing a programming model that is closer to hardware control than high-level inference frameworks. Instead of following a traditional GPU model centered on massive thread parallelism, the platform is built around a grid of specialized compute nodes called Tensix cores, each with local SRAM, dedicated compute units, and multiple RISC-V control processors. The SDK provides the abstractions and APIs needed to manage data movement, compute kernels, memory coordination, and execution flow across this architecture.
    Downloads: 61 This Week
  • 11
    ChatGLM.cpp

    C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)

    ChatGLM.cpp is a C++ implementation of the ChatGLM-6B model, enabling efficient local inference without requiring a Python environment. It is optimized for running on consumer hardware.
    Downloads: 8 This Week
  • 12
    Strato

    Run Nintendo Switch homebrew & games on your Android device

    Strato is an experimental Nintendo Switch emulator designed specifically for ARMv8 Android devices, aiming to bring Switch gaming and homebrew applications to mobile platforms. It builds upon earlier emulator efforts such as Skyline while incorporating improvements and optimizations tailored for mobile hardware constraints. The emulator focuses heavily on high-level emulation of Switch subsystems, including kernel services and GPU behavior, to achieve usable performance on smartphones. It leverages components inspired by established projects like Ryujinx and Yuzu, particularly in areas such as shader compilation and system emulation. Strato emphasizes accessibility by targeting Android devices directly, removing the need for desktop-class hardware for Switch emulation. ...
    Downloads: 50 This Week
  • 13
    OptiScaler

    OptiScaler bridges upscaling/frame gen across GPUs

    ...This makes it possible to swap technologies such as NVIDIA DLSS, AMD FSR, or Intel XeSS even if the game only supports one of them by default. The tool effectively acts as a compatibility layer between the game engine and multiple upscaling frameworks, enabling cross-GPU access to features that might otherwise be restricted to specific hardware ecosystems. In addition to replacing upscalers, OptiScaler can enable frame generation features in titles that do not officially support them, improving frame rates and perceived smoothness during gameplay.
    Downloads: 121 This Week
  • 14
    Apollo

    The easiest way to stream with the native resolution of your client

    Apollo is a self-hosted desktop streaming host designed to enable low-latency game streaming from a personal computer to remote clients using protocols compatible with Moonlight and Artemis. It acts as a server that captures, encodes, and streams desktop or game sessions while supporting hardware acceleration across AMD, Intel, and NVIDIA GPUs. The project includes a web-based interface that allows users to configure streaming settings, manage connected clients, and control application...
    Downloads: 22 This Week
  • 15
    Skiko

    Kotlin Multiplatform bindings to Skia

    Skiko is an open-source graphics library from JetBrains that provides lightweight, cross-platform bindings for the Skia graphics engine tailored specifically for Kotlin Multiplatform and Compose applications. It serves as the low-level rendering backbone for Kotlin UI frameworks like Compose for Desktop and Compose for Web, enabling smooth, GPU-accelerated 2D graphics across Windows, macOS, Linux, and other supported targets without writing native code. Skiko abstracts away platform-specific rendering details while exposing Skia’s powerful features such as high-quality text shaping, image filters, path operations, and hardware accelerated canvases, making it ideal for building rich UI components, animations, games, or custom drawing surfaces. ...
    Downloads: 14 This Week
  • 16
    HLSL++

    Math library using HLSL syntax with multiplatform SIMD support

    HLSL++ is a header-only C++ math library designed to replicate the syntax and functionality of the HLSL shading language, making it easier for developers to write CPU-side code that mirrors GPU shader logic. It provides vector, matrix, and math operations with a syntax identical or very similar to HLSL, allowing seamless transition between shader code and application code. The library is optimized for performance and supports SIMD instructions across multiple architectures, including SSE, AVX, AVX2, AVX512, and ARM NEON, ensuring high efficiency on modern hardware.
    Downloads: 4 This Week
  • 17
    UCCL

    UCCL is an efficient communication library for GPUs

    UCCL is a high-performance GPU communication library designed to support distributed machine learning workloads and large-scale AI systems. The library focuses on enabling efficient data transfer and collective communication between GPUs during training and inference processes. It supports a variety of communication patterns including collective operations such as all-reduce as well as peer-to-peer transfers that are commonly used in modern machine learning architectures.
    Downloads: 0 This Week
  • 18
    gemma.cpp

    Lightweight, standalone C++ inference engine for Google's Gemma models

    Gemma.cpp is a C++ implementation for running inference with Gemma models efficiently on CPUs and GPUs. Developed by Google, it allows running large language models (LLMs) like Gemma with minimal hardware, focusing on optimized performance and low latency. Gemma.cpp is intended for developers seeking to deploy LLMs in production environments without needing massive computational resources.
    Downloads: 4 This Week
  • 19
    bitnet.cpp

    Official inference framework for 1-bit LLMs

    bitnet.cpp is the official open-source inference framework and ecosystem designed to enable ultra-efficient execution of 1-bit large language models (LLMs), which quantize most model parameters to ternary values (-1, 0, +1) while maintaining competitive performance with full-precision counterparts. At its core is bitnet.cpp, a highly optimized C++ backend that supports fast, low-memory inference on both CPUs and GPUs, enabling models such as BitNet b1.58 to run without requiring enormous...
    Downloads: 6 This Week
  • 20
    Codon

    A high-performance, zero-overhead, extensible Python compiler

    Codon is a high-performance Python compiler that compiles Python code to native machine code without any runtime overhead. Typical speedups over Python are on the order of 100x or more, on a single thread. Codon supports native multithreading which can lead to speedups many times higher still. The Codon framework is fully modular and extensible, allowing for the seamless integration of new modules, compiler optimizations, domain-specific languages and so on. We actively develop Codon...
    Downloads: 9 This Week
  • 21
    QuickViewer

    An image/comic viewer application for Windows, Mac and Linux

    QuickViewer is a fast and lightweight image viewer designed to handle large image collections and archive formats with blazing performance. Built using C++ and Qt, QuickViewer is optimized for viewing manga, comics, and large photo folders with instant loading and minimal lag. It includes a streamlined interface, hardware-accelerated rendering, and features tailored for browsing through image series efficiently. With support for compressed formats like ZIP and RAR, QuickViewer is especially...
    Downloads: 20 This Week
  • 22
    ArrayFire

    ArrayFire, a general purpose GPU library

    ArrayFire is a general-purpose tensor library that simplifies the process of software development for the parallel architectures found in CPUs, GPUs, and other hardware acceleration devices. The library serves users in every technical computing market. Data structures in ArrayFire are smartly managed to avoid costly memory transfers and to take advantage of each performance feature provided by the underlying hardware. The community of ArrayFire developers invites you to build with us if...
    Downloads: 2 This Week
  • 23
    qvac-fabric-llm.cpp

    QVAC Fabric: cross-platform LLM inference and fine-tuning

    qvac-fabric-llm.cpp is a cross-platform large language model inference and fine-tuning engine built as an advanced fork of llama.cpp, designed to run efficiently across desktops, mobile devices, and heterogeneous GPU environments. The project focuses on removing hardware limitations traditionally associated with LLM deployment by enabling support for a wide range of backends, including Vulkan, Metal, CUDA, and CPU, making it accessible on devices ranging from smartphones to enterprise servers. It introduces native LoRA fine-tuning capabilities that can be executed directly on consumer hardware, allowing developers to train and adapt models locally without relying on cloud infrastructure. ...
    Downloads: 0 This Week
  • 24
    CUDA-QX

    Accelerated libraries for quantum-classical computing built on CUDA-Q

    CUDA-QX is a collection of accelerated libraries built on top of the CUDA-Q platform, designed to enable rapid development of hybrid quantum-classical applications. It extends the CUDA-Q programming model by providing optimized implementations of domain-specific quantum computing primitives and workflows. The libraries are intended to help researchers and developers leverage GPUs, CPUs, and quantum processing units together in a unified computational model. CUDA-QX focuses on key areas such...
    Downloads: 13 This Week
  • 25
    FlashMLA

    FlashMLA: Efficient Multi-head Latent Attention Kernels

    ...In heavily compute-bound settings, it can reach up to ~660 TFLOPS on H800 SXM5 hardware, while in memory-bound configurations it can push memory throughput to ~3000 GB/s. The team regularly updates it with performance improvements; for example, a 2025 update claims 5% to 15% gains on compute-bound workloads while maintaining API compatibility.
    Downloads: 0 This Week