hardware free download

Showing 62 open source projects for "hardware"

Filters: Artificial Intelligence, C++

1

XiaoZhi AI Chatbot

Build your own AI friend

xiaozhi-esp32 is an open-source project that guides users in building their own AI-powered conversational companion using the ESP32 microcontroller. The project provides detailed instructions on assembling the hardware, setting up the software, and integrating AI models to enable natural language interactions. This DIY approach offers an accessible entry point into AI and hardware development.

Downloads: 140 This Week

Last Update: 2026-04-19
2

GPT4All

Run Local LLMs on Any Device. Open-source

...The software provides a simple, user-friendly application that can be downloaded and run on various platforms, including Windows, macOS, and Ubuntu, without requiring specialized hardware. It integrates with the llama.cpp implementation and supports multiple LLMs, allowing users to interact with AI models privately. This project also supports Python integrations for easy automation and customization. GPT4All is ideal for individuals and businesses seeking private, offline access to powerful LLMs.

1 Review

Downloads: 105 This Week

Last Update: 2025-03-17
3

ONNX Runtime

ONNX Runtime: cross-platform, high performance ML inferencing

...Support for a variety of frameworks, operating systems and hardware platforms. Built-in optimizations that deliver up to 17X faster inferencing and up to 1.4X faster training.

Downloads: 29 This Week

Last Update: 2026-06-22
4

Lucebox

Fast LLM speculative inference server for consumer hardware

...The repository also includes harnesses for testing compatibility with clients such as Claude Code, Codex, OpenCode, Hermes, Pi, OpenClaw, and Open WebUI. It is most useful for developers and AI enthusiasts who want to run optimized local models with lower latency, faster token generation, and hardware-aware inference behavior.

Downloads: 5 This Week

Last Update: 5 days ago
5

whisper.cpp

Port of OpenAI's Whisper model in C/C++

...The command downloads the base.en model converted to a custom ggml format and runs inference on all .wav samples in the samples folder. whisper.cpp supports integer quantization of the Whisper ggml models. Quantized models require less memory and disk space and, depending on the hardware, can be processed more efficiently.

Downloads: 537 This Week

Last Update: 2026-06-19
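
As a rough illustration of why integer quantization saves memory, here is a minimal Python sketch of symmetric 8-bit quantization. It is a generic textbook scheme, not the ggml formats that whisper.cpp actually uses; the function names and sample weights are hypothetical.

```python
def quantize_int8(values):
    """Symmetric quantization: map floats to integers in [-127, 127]."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # One shared scale for the whole block; fall back to 1.0 for all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate floats from the stored integers and the scale."""
    return [q * scale for q in quantized]


# Hypothetical weights, for illustration only.
weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
```

Each weight is stored as one small integer plus a shared scale instead of a 32-bit float, at the cost of a rounding error of at most half the scale per value.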
6

tt-metal

TT-NN operator library, and TT-Metalium low level kernel programming

tt-metal, also referred to in its documentation as TT-Metalium, is Tenstorrent’s low-level software development kit for programming applications on Tenstorrent AI accelerators. The project is designed for developers who need direct access to the company’s Tensix processor architecture, exposing a programming model that is closer to hardware control than high-level inference frameworks. Instead of following a traditional GPU model centered on massive thread parallelism, the platform is built around a grid of specialized compute nodes called Tensix cores, each with local SRAM, dedicated compute units, and multiple RISC-V control processors. The SDK provides the abstractions and APIs needed to manage data movement, compute kernels, memory coordination, and execution flow across this architecture.

Downloads: 3 This Week

Last Update: 2026-06-26
7

mllm

Fast Multimodal LLM on Mobile Devices

...It also provides tools to convert models from popular formats like PyTorch checkpoints into optimized runtime formats that can be executed on supported hardware platforms.

Downloads: 3 This Week

Last Update: 2026-03-09
8

LiteRT

LiteRT, successor to TensorFlow Lite

...With broad hardware compatibility and advanced performance optimizations, LiteRT enables developers to build fast, scalable, and efficient AI applications that run directly on user devices.

Downloads: 6 This Week

Last Update: 3 days ago
9

nndeploy

An Easy-to-Use and High-Performance AI Deployment Framework

...The system supports multiple inference engines and hardware accelerators, allowing the same AI workflow to run on different platforms without significant modifications. nndeploy also includes performance optimization techniques such as parallel execution, memory reuse, and hardware-accelerated operations to improve inference speed.

Downloads: 0 This Week

Last Update: 2026-04-04
10

LiteRT-LM

LiteRT-LM is Google's production-ready inference framework

LiteRT-LM is Google’s open-source inference framework for deploying large language models on edge devices. It is built for production-oriented local LLM execution across Android, iOS, desktop, web, embedded, and IoT environments. The framework focuses on performance, hardware acceleration, and efficient model serving close to the user instead of relying only on remote cloud inference. It supports CPU execution across major platforms and adds GPU or NPU acceleration where available. LiteRT-LM is especially relevant for developers building private, low-latency AI features on phones, laptops, Raspberry Pi-style devices, and other edge hardware. ...

Downloads: 4 This Week

Last Update: 2 days ago
11

bitnet.cpp

Official inference framework for 1-bit LLMs

...The project’s focus on extreme quantization dramatically reduces memory footprint and energy consumption compared with traditional 16-bit or 32-bit LLMs, making it practical to deploy advanced language understanding and generation models on everyday machines. BitNet is built to scale across architectures, with configurable kernels and tiling strategies that adapt to different hardware, and it supports large models with impressive throughput even on modest resources.

Downloads: 5 This Week

Last Update: 2026-03-10
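
The memory claim comes down to simple arithmetic on weight storage. The sketch below is a back-of-the-envelope estimate only: the 7-billion-parameter count is a hypothetical example, and it ignores activations, caches, and any per-block metadata a real format would add.

```python
def weight_bytes(num_params, bits_per_weight):
    """Approximate storage for model weights alone, in bytes."""
    return num_params * bits_per_weight / 8


# Hypothetical model size, chosen only to illustrate the ratio.
params = 7_000_000_000

for bits in (32, 16, 1):
    gib = weight_bytes(params, bits) / 2**30
    print(f"{bits:>2}-bit weights: {gib:.2f} GiB")
```

Under these assumptions, weight storage shrinks in direct proportion to the bit width, so 1-bit weights take one sixteenth of the space of 16-bit weights.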
12

ggml

Tensor library for machine learning

...Written primarily in C and C++, the library provides low-level tensor operations and automatic differentiation that allow developers to implement machine learning algorithms and neural networks efficiently. The project emphasizes portability and performance, enabling machine learning inference across a wide range of hardware environments including CPUs and specialized accelerators. It is widely used as a foundational component in projects that run large language models locally, including tools that perform inference for transformer-based models. The library also implements optimization algorithms and computation graph functionality so developers can build training and inference workflows directly on top of its tensor operations.

Downloads: 2 This Week

Last Update: 2026-06-26
13

ChatGLM.cpp

C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)

ChatGLM.cpp is a C++ implementation of the ChatGLM model family (ChatGLM-6B, ChatGLM2-6B, ChatGLM3, and GLM4(V)), enabling efficient local inference without requiring a Python environment. It is optimized for running on consumer hardware.

Downloads: 0 This Week

Last Update: 2025-01-21
14

qvac-fabric-llm.cpp

QVAC Fabric: cross-platform LLM inference and fine-tuning

qvac-fabric-llm.cpp is a cross-platform large language model inference and fine-tuning engine built as an advanced fork of llama.cpp, designed to run efficiently across desktops, mobile devices, and heterogeneous GPU environments. The project focuses on removing hardware limitations traditionally associated with LLM deployment by enabling support for a wide range of backends, including Vulkan, Metal, CUDA, and CPU, making it accessible on devices ranging from smartphones to enterprise servers. It introduces native LoRA fine-tuning capabilities that can be executed directly on consumer hardware, allowing developers to train and adapt models locally without relying on cloud infrastructure. ...

Downloads: 0 This Week

Last Update: 2026-03-31
15

MIVisionX

Set of comprehensive computer vision & machine intelligence libraries

...AMD MIVisionX delivers a highly optimized open-source implementation of the Khronos OpenVX™ and OpenVX™ Extensions along with a Convolution Neural Net Model Compiler & Optimizer supporting the ONNX and Khronos NNEF™ exchange formats. The toolkit allows for rapid prototyping and deployment of optimized computer vision and machine learning inference workloads on a wide range of computer hardware, including small embedded x86 CPUs, APUs, discrete GPUs, and heterogeneous servers. AMD OpenVX is a highly optimized open-source implementation of the Khronos OpenVX™ 1.3 computer vision specification. It allows for rapid prototyping as well as fast execution on a wide range of computer hardware, including small embedded x86 CPUs and large workstation discrete GPUs.

Downloads: 1 This Week

Last Update: 2026-06-27
16

llama.cpp

LLM inference in C/C++

llama.cpp is a high-performance C and C++ project for running large language models locally and in the cloud with minimal setup. It is built around efficient inference, broad hardware support, and the GGUF model format. The project supports many model families and has become a major foundation for local AI tools, model serving, and embedded inference workflows. It provides command-line tools, a server mode with an OpenAI-compatible API style, model conversion utilities, and extensive backend acceleration options. llama.cpp runs on CPUs and GPUs, with support for Apple silicon, x86, RISC-V, CUDA, HIP, Vulkan, SYCL, Metal, and hybrid CPU-GPU execution. ...

Downloads: 29 This Week

Last Update: 9 hours ago
17

ArrayFire

ArrayFire, a general purpose GPU library

ArrayFire is a general-purpose tensor library that simplifies the process of software development for the parallel architectures found in CPUs, GPUs, and other hardware acceleration devices. The library serves users in every technical computing market. Data structures in ArrayFire are smartly managed to avoid costly memory transfers and to take advantage of each performance feature provided by the underlying hardware. The community of ArrayFire developers invites you to build with us if you're interested and able to write top-performing tensor functions. ...

Downloads: 1 This Week

Last Update: 2025-09-05
18

HeavyDB

HeavyDB (formerly MapD/OmniSciDB)

...The database compiles queries into optimized machine code that executes efficiently on GPU hardware, significantly accelerating analytical workloads. It supports hybrid deployment environments where queries can run on both CPU and GPU architectures depending on the available resources.

Downloads: 1 This Week

Last Update: 2026-03-11
19

Bolt NLP

Bolt is a deep learning library with high performance

Bolt is a high-performance, lightweight deep learning inference framework developed by Huawei Noah's Ark Lab. It is designed to optimize and accelerate the deployment of deep learning models across various hardware platforms. As a universal deployment tool for all kinds of neural networks, Bolt aims to automate the deployment pipeline and achieve extreme acceleration. Bolt has been widely deployed in many departments of Huawei, such as 2012 Laboratory, CBG, and HUAWEI Product Lines. ...

Downloads: 0 This Week

Last Update: 2025-01-30
20

Speech Note

Speech Note Linux app. Note taking, reading and translating

...The application supports multiple STT engines such as Coqui STT (DeepSpeech fork), Vosk, whisper.cpp, Faster Whisper, and april-asr, giving users flexibility in accuracy, speed, and hardware requirements. For text-to-speech, it can plug into a wide range of engines including espeak-ng, MBROLA, Piper, RHVoice, Coqui TTS, Mimic 3, WhisperSpeech, Kokoro, Parler-TTS, F5-TTS, and even classic S.A.M., making it highly customizable in terms of voices and languages.

Downloads: 19 This Week

Last Update: 6 days ago
21

Cactus

Low-latency AI inference engine optimized for mobile devices

Cactus is a low-latency, energy-efficient AI inference framework designed specifically for mobile devices and wearables, enabling advanced machine learning capabilities directly on-device. It provides a full-stack architecture composed of an inference engine, a computation graph system, and highly optimized hardware kernels tailored for ARM-based processors. Cactus emphasizes efficient memory usage through techniques such as zero-copy computation graphs and quantized model formats, allowing large models to run within the constraints of mobile hardware. It supports a wide range of AI tasks including text generation, speech-to-text, vision processing, and retrieval-augmented workflows through a unified API interface. ...

Downloads: 0 This Week

Last Update: 2026-04-18
22

PowerInfer

High-speed Large Language Model Serving for Local Deployment

...This hybrid execution strategy significantly reduces memory bottlenecks and improves overall inference speed. PowerInfer incorporates specialized algorithms and sparse operators to manage neuron activation patterns and minimize data transfers between hardware components. As a result, it enables powerful language models to run on consumer hardware while achieving performance comparable to more expensive server-grade systems.

Downloads: 0 This Week

Last Update: 2026-05-11
23

ONNX

Open standard for machine learning interoperability

...It defines an extensible computation graph model, as well as definitions of built-in operators and standard data types. Currently we focus on the capabilities needed for inferencing (scoring). ONNX is widely supported and can be found in many frameworks, tools, and hardware. Enabling interoperability between different frameworks and streamlining the path from research to production helps increase the speed of innovation in the AI community.

Downloads: 9 This Week

Last Update: 2026-06-15
24

OpenVINO

OpenVINO™ Toolkit repository

OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference. Boost deep learning performance in computer vision, automatic speech recognition, natural language processing, and other common tasks. Use models trained with popular frameworks like TensorFlow, PyTorch, and more. Reduce resource demands and efficiently deploy on a range of Intel® platforms from edge to cloud. This open-source version includes several components, namely Model Optimizer, OpenVINO™ Runtime,...

Downloads: 15 This Week

Last Update: 2026-06-09
25

ExecuTorch

On-device AI across mobile, embedded and edge for PyTorch

ExecuTorch is an end-to-end solution for enabling on-device inference capabilities across mobile and edge devices including wearables, embedded devices and microcontrollers. It is part of the PyTorch Edge ecosystem and enables efficient deployment of PyTorch models to edge devices.

Downloads: 1 This Week

Last Update: 2026-05-28