Towards Human-Sounding Speech
Any model. Any hardware. Zero compromise.
WebAssembly binding for llama.cpp - Enabling on-browser LLM inference
Parallax is a distributed model serving framework
Official Python inference and LoRA trainer package
Ling is a MoE LLM provided and open-sourced by InclusionAI
Bayesian Modeling and Probabilistic Programming in Python
Personal AI, On Personal Devices
Taming Stable Diffusion for Lip Sync
LiteRT is the new name for TensorFlow Lite (TFLite)
Operating LLMs in production
Run Local LLMs on Any Device. Open-source
Minimal Python framework for building scalable AI inference servers, fast
Python-free Rust inference server
Tensor library for machine learning
Alibaba's high-performance LLM inference engine for diverse apps
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework
Performance-optimized AI inference on your GPUs
Accelerate local LLM inference and finetuning
A library for accelerating Transformer models on NVIDIA GPUs
OpenShell is the safe, private runtime for autonomous AI agents.
Open standard for machine learning interoperability
Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
Powering Amazon custom machine learning chips