A high-throughput and memory-efficient inference and serving engine
Document (PDF, Word, PPTX ...) extraction and parsing API
Interact with your documents using the power of GPT
Agentic, Reasoning, and Coding (ARC) foundation models
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Open-source observability for your LLM application
Operating LLMs in production
Advanced language and coding AI model
Universal LLM Deployment Engine with ML Compilation
GLM-4.5: Open-source LLM for intelligent agents by Z.ai
Access large language models from the command-line
Easy-to-use LLM fine-tuning framework (LLaMA-2, BLOOM, Falcon
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
All-in-one WebUI for AI generative image and video creation
Open-source AI hackers to find and fix your app’s vulnerabilities
CNCF Sandbox Project
State-of-the-art Parameter-Efficient Fine-Tuning
High-performance inference framework for large language models
Qwen2.5-VL is the multimodal large language model series
ChatGLM-6B: An Open Bilingual Dialogue Language Model
The official repo of Qwen chat & pretrained large language model
Collect, organize, use, and share, all in OmniBox
Adding guardrails to large language models
An orchestration framework for agentic AI and LLM applications
Performance-optimized AI inference on your GPUs