A high-throughput and memory-efficient inference and serving engine
A lightweight vLLM implementation built from scratch
GLM-4.5: Open-source LLM for intelligent agents by Z.ai
Accelerate local LLM inference and finetuning
Qwen3 is the large language model series developed by the Qwen team
Advanced language and coding AI model
Use PEFT or full-parameter training for CPT/SFT/DPO/GRPO on 600+ LLMs
Open-source large language model family from Tencent Hunyuan
A course on LLM inference serving on Apple Silicon
Performance-optimized AI inference on your GPUs
Qwen3-Omni is a natively end-to-end, omni-modal LLM
High-performance Inference and Deployment Toolkit for LLMs and VLMs
LightLLM is a Python-based LLM (Large Language Model) inference
Qwen2.5-VL is the multimodal large language model series
GLM-4 series: Open Multilingual Multimodal Chat LMs