Ultra-Efficient AI Assistant in Go
FlashInfer: Kernel Library for LLM Serving
AI coding agent optimized for small LLMs. 87% benchmark
Official inference repo for FLUX.1 models
Vector Database for the next generation of AI applications
Open-source, code-first Python toolkit for building, evaluating, etc.
The most powerful local music generation model
LiteRT-LM is Google's production-ready inference framework
Any model. Any hardware. Zero compromise
Parallax is a distributed model serving framework
Official inference framework for 1-bit LLMs
Generate audiobooks from e-books, voice cloning & 1107+ languages
AI video generator optimized for low VRAM and older GPUs
Performance-optimized AI inference on your GPUs
LLM.swift is a simple and readable library
Fast ML inference & training for ONNX models in Rust
High-performance Inference and Deployment Toolkit for LLMs and VLMs
Accessible large language models via k-bit quantization for PyTorch
Accelerate local LLM inference and finetuning
Open deep learning compiler stack for CPU, GPU, etc.
High-Resolution Image Synthesis with Latent Diffusion Models
Phi-3.5 for Mac: Locally-run Vision and Language Models
Tensor library for machine learning
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
Java interface to OpenCV, FFmpeg, and more