Showing 204 open source projects for "gpu hardware"

  • 1
    FanCtrl

    FanCtrl allows you to automatically control the fan speed

    FanCtrl is a Windows desktop utility focused on automatically controlling PC fan speeds using temperature sensors and customizable fan curves, so your cooling behavior matches how you actually use your system. It combines monitoring and control in one place, letting you view temperatures, fan RPM, and control percentages while you tune how aggressively each fan responds. The project supports a range of control backends and integrations, including motherboard fan headers and several popular...
    Downloads: 13 This Week
    Last Update:
    See Project
  • 2
    Bend

    A massively parallel, high-level programming language

Bend is a massively parallel, high-level programming language designed to run ordinary-looking code on multi-core CPUs and GPUs. It pairs Python-like syntax and the expressiveness of a modern functional language with a runtime (HVM2) that automatically parallelizes everything that can run in parallel, so no explicit thread management, locks, or atomics are required. Recursive and divide-and-conquer workloads in particular can scale across thousands of threads, making Bend a practical way to target GPU hardware without writing low-level kernel code.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    HWID Spoofer - Hardware ID Changer

    The best HWID Spoofer in 2025.

...Ideal for gamers facing hardware bans or trying to stay undetected. Compatible with Windows 10/11. BE CAREFUL, THERE ARE COPIES OF TRACEX ON SOURCEFORGE WHICH ARE MALWARE. IF YOU WANT FULL INSTRUCTIONS ON HOW TO USE TRACEX, CHECK OUT OUR OFFICIAL PAGE: https://slothytech.com/tracex/
    Downloads: 1,183 This Week
    Last Update:
    See Project
  • 4
    stt

    Voice Recognition to Text Tool

    ...The project is designed to be easy to deploy: you can run a local Python server that exposes an HTTP API for uploading audio/video files and retrieving transcriptions in different formats. It supports GPU acceleration when available, enabling faster processing on compatible hardware, while still offering reliable performance on CPUs alone. A minimal client sketch follows this entry.
    Downloads: 2 This Week
    Last Update:
    See Project
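    The stt entry above describes a local Python server with an HTTP upload API. Below is a minimal sketch of what a client might look like; the host, port, endpoint path, and form-field names are illustrative assumptions, not the project's documented interface.

    ```python
    # Hypothetical client for a locally running stt server.
    # The URL, route, and field names below are assumptions for illustration only.
    import requests

    STT_URL = "http://localhost:8000/transcribe"  # assumed address and route

    with open("meeting.mp3", "rb") as audio:
        response = requests.post(
            STT_URL,
            files={"file": ("meeting.mp3", audio, "audio/mpeg")},
            data={"format": "txt"},  # assumed parameter selecting the output format
            timeout=600,
        )

    response.raise_for_status()
    print(response.text)  # plain-text transcription under the assumed format
    ```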
  • 5
    ChatGLM-6B

    ChatGLM-6B: An Open Bilingual Dialogue Language Model

    ...The project provides inference code, demos (command line, web, API), quantization support for lower-memory deployment, and tools for finetuning (e.g., via P-Tuning v2). It is optimized for dialogue and question answering, balancing performance against deployability on consumer hardware. Quantized inference (INT4, INT8) is supported to reduce GPU memory requirements, with automatic switching between full-precision and quantized modes depending on the precision/memory tradeoff. A short loading sketch follows this entry.
    Downloads: 5 This Week
    Last Update:
    See Project
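    The sketch below follows the usage pattern documented in the ChatGLM-6B repository (Hugging Face transformers with trust_remote_code, optional quantization); exact method names and model revisions vary between releases, so treat it as illustrative rather than definitive.

    ```python
    # Illustrative ChatGLM-6B loading and chat call via Hugging Face transformers.
    # Mirrors the pattern documented upstream; details may differ by version.
    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
    model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()
    # For smaller GPUs the repo documents quantized loading along these lines:
    #   model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).quantize(4).half().cuda()
    model = model.eval()

    response, history = model.chat(tokenizer, "你好", history=[])
    print(response)
    ```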
  • 6
    UCCL

    UCCL is an efficient communication library for GPUs

    UCCL is a high-performance GPU communication library designed to support distributed machine learning workloads and large-scale AI systems. The library focuses on enabling efficient data transfer and collective communication between GPUs during training and inference processes. It supports a variety of communication patterns including collective operations such as all-reduce as well as peer-to-peer transfers that are commonly used in modern machine learning architectures.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Transcoder

    Hardware-accelerated video transcoding using Android MediaCodec APIs

    Transcoder by Deepmedia is an Android library for transcoding and compressing video and audio files on-device using the platform's hardware-accelerated MediaCodec APIs, with no FFmpeg dependency. It handles encoder and decoder selection and smooths over device-specific quirks while exposing options such as output resolution, bitrate, rotation, trimming, and concatenation of multiple sources. Because it stays on the hardware codec path rather than falling back to software encoding, transcoding remains fast and battery-friendly, which makes it well suited for apps that need to shrink recorded video before upload.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 8
    tt-metal

    TT-NN operator library and TT-Metalium low-level kernel programming

    tt-metal, also referred to in its documentation as TT-Metalium, is Tenstorrent’s low-level software development kit for programming applications on Tenstorrent AI accelerators. The project is designed for developers who need direct access to the company’s Tensix processor architecture, exposing a programming model that is closer to hardware control than high-level inference frameworks. Instead of following a traditional GPU model centered on massive thread parallelism, the platform is built around a grid of specialized compute nodes called Tensix cores, each with local SRAM, dedicated compute units, and multiple RISC-V control processors. The SDK provides the abstractions and APIs needed to manage data movement, compute kernels, memory coordination, and execution flow across this architecture.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 9
    tvm

    Open deep learning compiler stack for CPU, GPU, etc.

    Apache TVM is an open source machine learning compiler framework for CPUs, GPUs, and machine learning accelerators. It aims to enable machine learning engineers to optimize and run computations efficiently on any hardware backend. The vision of the Apache TVM Project is to host a diverse community of experts and practitioners in machine learning, compilers, and systems architecture to build an accessible, extensible, and automated open-source framework that optimizes current and emerging... A small compilation sketch follows this entry.
    Downloads: 1 This Week
    Last Update:
    See Project
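    To make "optimize and run on any hardware backend" concrete, here is a small vector-add sketch using TVM's classic tensor-expression (te) API with a CPU target; newer releases emphasize other front ends (e.g. Relax), so the exact API surface depends on your TVM version.

    ```python
    # Vector addition with TVM's tensor-expression API (assumes a release that ships te).
    import numpy as np
    import tvm
    from tvm import te

    n = te.var("n")
    A = te.placeholder((n,), name="A")
    B = te.placeholder((n,), name="B")
    C = te.compute(A.shape, lambda i: A[i] + B[i], name="C")

    # Build for CPU; swapping the target string (e.g. "cuda") retargets the same computation.
    s = te.create_schedule(C.op)
    fadd = tvm.build(s, [A, B, C], target="llvm")

    dev = tvm.device("llvm", 0)
    a = tvm.nd.array(np.random.rand(1024).astype("float32"), dev)
    b = tvm.nd.array(np.random.rand(1024).astype("float32"), dev)
    c = tvm.nd.empty((1024,), "float32", dev)
    fadd(a, b, c)
    np.testing.assert_allclose(c.numpy(), a.numpy() + b.numpy(), rtol=1e-5)
    ```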
  • 10
    Ollama Telegram Bot

    Ollama Telegram bot, with advanced configuration

    ...It includes access control features such as user whitelists and admin roles, allowing fine-grained control over who can interact with the bot and manage its behavior. The bot connects to a local or remote Ollama server, enabling users to run models on their own hardware while maintaining full privacy. It supports Docker-based deployment, making it easy to set up alongside an Ollama instance with optional GPU acceleration. Configuration is handled through environment variables, allowing customization of models, timeouts, and interaction rules. Overall, ollama-telegram provides a lightweight and extensible solution for deploying personal or team-based AI assistants.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 11
    HLSL++

    Math library using HLSL syntax with multiplatform SIMD support

    HLSL++ is a header-only C++ math library designed to replicate the syntax and functionality of the HLSL shading language, making it easier for developers to write CPU-side code that mirrors GPU shader logic. It provides vector, matrix, and math operations with a syntax identical or very similar to HLSL, allowing seamless transition between shader code and application code. The library is optimized for performance and supports SIMD instructions across multiple architectures, including SSE, AVX, AVX2, AVX512, and ARM NEON, ensuring high efficiency on modern hardware.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 12
    WanGP

    AI video generator optimized for low VRAM and older GPUs

    Wan2GP is an open source AI video generation toolkit designed to make modern generative models accessible on consumer-grade hardware with limited GPU memory. It acts as a unified interface for running multiple video, image, and audio generation models, including Wan-based models as well as other systems like Hunyuan Video, Flux, and Qwen. A key focus of the project is reducing VRAM requirements, enabling some workflows to run on as little as 6 GB while still supporting older Nvidia and certain AMD GPUs. ...
    Downloads: 49 This Week
    Last Update:
    See Project
  • 13
    QuickViewer

    An image/comic viewer application for Windows, Mac, and Linux

    QuickViewer is a fast and lightweight image viewer designed to handle large image collections and archive formats with blazing performance. Built using C++ and Qt, QuickViewer is optimized for viewing manga, comics, and large photo folders with instant loading and minimal lag. It includes a streamlined interface, hardware-accelerated rendering, and features tailored for browsing through image series efficiently. With support for compressed formats like ZIP and RAR, QuickViewer is especially...
    Downloads: 19 This Week
    Last Update:
    See Project
  • 14
    hashcat

    World's fastest and most advanced password recovery utility

    hashcat is the world's fastest and most advanced password recovery utility, supporting five unique modes of attack for over 300 highly-optimized hashing algorithms. hashcat currently supports CPUs, GPUs, and other hardware accelerators on Linux, Windows, and macOS, and has facilities to help enable distributed password cracking. Download the latest release and unpack it in the desired location. Please remember to use 7z x when unpacking the archive from the command line to ensure full file...
    Downloads: 113 This Week
    Last Update:
    See Project
  • 15
    ort

    Fast ML inference & training for ONNX models in Rust

    ort is a high-performance Rust library that provides bindings to ONNX Runtime, enabling developers to run machine learning inference and training workflows directly within Rust applications using the standardized ONNX model format. It is designed to bridge the gap between modern machine learning frameworks and systems programming by offering a safe, ergonomic API for executing models originally built in ecosystems like PyTorch, TensorFlow, or scikit-learn. The library emphasizes speed and...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    shimmy

    Python-free Rust inference server

    ...It supports modern model formats such as GGUF and SafeTensors and can automatically discover models stored locally or in common directories used by other AI tools. Advanced capabilities include CPU offloading for Mixture-of-Experts models and GPU acceleration, enabling large models to run on consumer hardware with limited VRAM. A minimal client sketch follows this entry.
    Downloads: 1 This Week
    Last Update:
    See Project
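    Assuming a locally running shimmy instance exposes an OpenAI-compatible chat completions route (as its documentation suggests), the sketch below shows a minimal client; the port, path, and model id are placeholders, so adjust all three to whatever your own server reports.

    ```python
    # Illustrative client for a local shimmy server.
    # Port, route, and model id are placeholders, not documented defaults.
    import requests

    resp = requests.post(
        "http://localhost:11435/v1/chat/completions",  # assumed OpenAI-compatible route
        json={
            "model": "my-local-model",  # placeholder; use the id your server lists
            "messages": [{"role": "user", "content": "Summarize GGUF in one sentence."}],
        },
        timeout=120,
    )
    resp.raise_for_status()
    print(resp.json()["choices"][0]["message"]["content"])
    ```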
  • 17
    TensorRT Node for ComfyUI

    Enables the best performance on NVIDIA RTX Graphics Cards

    ...The repo typically includes instructions for converting models to TensorRT engines and for wiring those engines into ComfyUI nodes. This is particularly attractive for power users who run many generations or who host ComfyUI on dedicated hardware and want to squeeze out every bit of GPU performance. In short, it’s about taking ComfyUI from “it runs” to “it runs fast” on NVIDIA GPUs.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    RamaLama

    Simplifies the local serving of AI models from any source

    RamaLama is an open-source developer tool that simplifies working with and serving AI models locally or in production by leveraging container technologies like Docker, Podman, and OCI registries, allowing AI inference workflows to be treated like standard container deployments. It abstracts away much of the complexity of configuring AI runtimes, dependencies, and hardware optimizations by detecting available GPUs (or falling back to CPU) and automatically pulling a container image...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 19
    ArrayFire

    ArrayFire, a general purpose GPU library

    ArrayFire is a general-purpose tensor library that simplifies the process of software development for the parallel architectures found in CPUs, GPUs, and other hardware acceleration devices. The library serves users in every technical computing market. Data structures in ArrayFire are smartly managed to avoid costly memory transfers and to take advantage of each performance feature provided by the underlying hardware. The community of ArrayFire developers invites you to build with us if... A short sketch using the Python bindings follows this entry.
    Downloads: 1 This Week
    Last Update:
    See Project
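    As a small illustration of the array-programming model described above, here is a sketch using ArrayFire's Python bindings (the C++ API follows the same pattern with af::array); it assumes the arrayfire package and at least one working backend (CUDA, OpenCL, or CPU) are installed.

    ```python
    # Minimal ArrayFire sketch via the Python bindings; backend selection is automatic.
    import arrayfire as af

    af.info()                 # report the backend and device that were selected

    a = af.randu(1024, 1024)  # random matrices allocated on the active device
    b = af.randu(1024, 1024)

    c = af.matmul(a, b)       # executes on GPU or CPU, whichever backend is active
    total = af.sum(c)         # full reduction; the scalar result comes back to the host

    print("sum of all elements:", total)
    ```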
  • 20
    Codon

    A high-performance, zero-overhead, extensible Python compiler

    Codon is a high-performance Python compiler that compiles Python code to native machine code without any runtime overhead. Typical speedups over Python are on the order of 100x or more on a single thread, and Codon's native multithreading can push that many times higher still. The Codon framework is fully modular and extensible, allowing for the seamless integration of new modules, compiler optimizations, domain-specific languages and so on. We actively develop Codon... A small example follows this entry.
    Downloads: 4 This Week
    Last Update:
    See Project
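    Because Codon compiles ordinary Python syntax, a source file looks like plain Python; the sketch below shows one, with an assumed build invocation noted in comments (verify the exact flags against the Codon documentation for your release).

    ```python
    # fib.py - plain Python-syntax code that Codon compiles ahead-of-time to native code.
    # Assumed invocations (check the Codon docs for your version's flags):
    #   codon build -release fib.py   # produce a native executable
    #   codon run -release fib.py     # compile and run in one step
    def fib(n: int) -> int:
        return n if n < 2 else fib(n - 1) + fib(n - 2)

    print(fib(40))  # runs orders of magnitude faster than CPython once compiled
    ```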
  • 21
    gemma.cpp

    lightweight, standalone C++ inference engine for Google's Gemma models

    Gemma.cpp is a C++ implementation for running inference with Gemma models efficiently on CPUs and GPUs. Developed by Google, it allows running large language models (LLMs) like Gemma with minimal hardware, focusing on optimized performance and low latency. Gemma.cpp is intended for developers seeking to deploy LLMs in production environments without needing massive computational resources.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 22
    qvac-fabric-llm.cpp

    QVAC Fabric: cross-platform LLM inference and fine-tuning

    qvac-fabric-llm.cpp is a cross-platform large language model inference and fine-tuning engine built as an advanced fork of llama.cpp, designed to run efficiently across desktops, mobile devices, and heterogeneous GPU environments. The project focuses on removing hardware limitations traditionally associated with LLM deployment by enabling support for a wide range of backends, including Vulkan, Metal, CUDA, and CPU, making it accessible on devices ranging from smartphones to enterprise servers. It introduces native LoRA fine-tuning capabilities that can be executed directly on consumer hardware, allowing developers to train and adapt models locally without relying on cloud infrastructure. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Lenovo Legion Linux Support

    Driver and tools for controlling Lenovo Legion laptops in Linux

    Lenovo Legion Linux (LLL) brings additional drivers and tools for Lenovo Legion series laptops to Linux. It is the alternative to Lenovo Vantage or Legion Zone (both Windows only). It allows you to control features like the fan curve, power mode, power limits, rapid charging, and more. This has been achieved through reverse engineering and disassembling the ACPI firmware, as well as the firmware and memory of the embedded controller (EC).
    Downloads: 32 This Week
    Last Update:
    See Project
  • 24
    Skiko

    Kotlin Multiplatform bindings to Skia

    Skiko is an open-source graphics library from JetBrains that provides lightweight, cross-platform bindings for the Skia graphics engine tailored specifically for Kotlin Multiplatform and Compose applications. It serves as the low-level rendering backbone for Kotlin UI frameworks like Compose for Desktop and Compose for Web, enabling smooth, GPU-accelerated 2D graphics across Windows, macOS, Linux, and other supported targets without writing native code. Skiko abstracts away platform-specific rendering details while exposing Skia’s powerful features such as high-quality text shaping, image filters, path operations, and hardware accelerated canvases, making it ideal for building rich UI components, animations, games, or custom drawing surfaces. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    bitnet.cpp

    Official inference framework for 1-bit LLMs

    bitnet.cpp is the official open-source inference framework and ecosystem designed to enable ultra-efficient execution of 1-bit large language models (LLMs), which quantize most model parameters to ternary values (-1, 0, +1) while maintaining competitive performance with full-precision counterparts. At its core is bitnet.cpp, a highly optimized C++ backend that supports fast, low-memory inference on both CPUs and GPUs, enabling models such as BitNet b1.58 to run without requiring enormous...
    Downloads: 1 This Week
    Last Update:
    See Project