Open-source, high-performance AI model with advanced reasoning
Advanced language and coding AI model
Powerful AI language model (MoE) optimized for efficiency and performance
GLM-4.5: Open-source LLM for intelligent agents by Z.ai
Agentic, Reasoning, and Coding (ARC) foundation models
TokenSpeed is a speed-of-light LLM inference engine
Universal LLM Deployment Engine with ML Compilation
Open-source large language model family from Tencent Hunyuan
How to optimize some algorithms in CUDA
Diversity-driven optimization and large-model reasoning ability
Run Local LLMs on Any Device. Open-source
The official repo of the Qwen chat & pretrained large language models
Redundancy-aware KV Cache Compression for Reasoning Models
High-performance inference framework for large language models
Designed for text embedding and ranking tasks
Unified KV Cache Compression Methods for Auto-Regressive Models
On the Structural Pruning of Large Language Models
High-performance Inference and Deployment Toolkit for LLMs and VLMs
Retrieval and Retrieval-augmented LLMs
Capable of understanding text, audio, vision, video
Bringing BERT into modernity via both architecture changes and scaling
Make your agents learn from experience
An open-source, modern-design AI training tracking and visualization
Qwen3-Coder is the code version of Qwen3
A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)