A high-throughput and memory-efficient inference and serving engine
A lightweight vLLM implementation built from scratch
Visual Causal Flow
A unified library of SOTA model optimization techniques
Personal AI, On Personal Devices
Towards Human-Sounding Speech
GLM-4.5: Open-source LLM for intelligent agents by Z.ai
Run a full local LLM stack with one command using Docker
Accelerate local LLM inference and finetuning
Interface for OuteTTS models
Qwen3 is the large language model series developed by the Qwen team
Advanced language and coding AI model
Open-source large language model family from Tencent Hunyuan
Multilingual Document Layout Parsing in a Single Vision-Language Model
Accurate × Fast × Comprehensive
Use PEFT or full-parameter training to CPT/SFT/DPO/GRPO 600+ LLMs
A course on LLM inference serving on Apple Silicon
Renderer for the harmony response format to be used with gpt-oss
Qwen2.5-VL is the multimodal large language model series
High-performance Inference and Deployment Toolkit for LLMs and VLMs
LightLLM is a Python-based LLM (Large Language Model) inference
Performance-optimized AI inference on your GPUs
FAIR Sequence Modeling Toolkit 2
Agent framework and applications built upon Qwen>=3.0
New family of code large language models (LLMs)