A high-throughput and memory-efficient inference and serving engine
Multilingual sentence & image embeddings with BERT
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
State-of-the-art Parameter-Efficient Fine-Tuning
Modular AI runtime for robots
MobileLLM: Optimizing Sub-billion Parameter Language Models
Operating LLMs in production
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
Qwen3-Coder is the code version of Qwen3
⚡ Building applications with LLMs through composability ⚡
Gemma open-weight LLM library, from Google DeepMind
Replace OpenAI GPT with another LLM in your app
Text-space optimizer that trains reusable natural-language skills
Math-specific large language models from our Qwen2 series
Toolkit for conversational AI
A frontier, first-principles handbook
SimpleMem: Efficient Lifelong Memory for LLM Agents
Low-code framework for building custom LLMs, neural networks
Framework and no-code GUI for fine-tuning LLMs
Unified KV Cache Compression Methods for Auto-Regressive Models
Benchmark LLMs by fighting in Street Fighter 3
Research code artifacts for Code World Model (CWM)
Qwen3-Omni is a natively end-to-end, omni-modal LLM
AI agent designed to interact with ROS1- and ROS2-based robotics systems
Designed for text embedding and ranking tasks