Redundancy-aware KV Cache Compression for Reasoning Models
The official implementation of RAPTOR
AI-driven multi-agent research assistant automating hypothesis
Specify a GitHub or local repo, GitHub pull request
From zero to large language model (LLM) hero
MoBA: Mixture of Block Attention for Long-Context LLMs
The first AI agent that builds permissionless integrations
Llama Chinese community, real-time aggregation
Large Language Model Principles and Practice Tutorial from Scratch
RAG Search API
Public opinion analysis system
Full-stack AI Red Teaming platform
Qwen3-ASR is an open-source series of ASR models
Run LLM prompts from your shell
Spark-TTS Inference Code
Foundation model for image generation
Analyzing Hacker News discussions from a decade ago in hindsight
A Pragmatic VLA Foundation Model
End-to-end pipeline converting generative videos
OpenTinker is an RL-as-a-Service infrastructure for foundation models
Motion-controllable Video Generation via Latent Trajectory Guidance
A tool to use the Ai2 Open Coding Agents Soft-Verified Agents
Block Diffusion for Ultra-Fast Speculative Decoding
Multimodal embedding and reranking models built on Qwen3-VL
A New Axis of Sparsity for Large Language Models