Sharp Monocular Metric Depth in Less Than a Second
Generate audiobooks from e-books, voice cloning & 1107+ languages
High-resolution models for human tasks
Extensible AGI Framework
Generate audiobooks from EPUBs, PDFs and text with captions
The official repo of Qwen chat & pretrained large language model
Toolkit for audio, music, and speech generation
We write your reusable computer vision tools
Machine Learning Pipelines for Kubeflow
A program that can do anything to earn money without human operators
Qwen2.5-VL is the multimodal large language model series
Qwen3-Coder is the code version of Qwen3
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
One-click deployment (including offline integration package)
CodeGeeX4-ALL-9B, a versatile model for all AI software development
GUI Exploration Lab. One of the best GUI agent solutions
Document Image Parsing via Heterogeneous Anchor Prompting
A nearly-live implementation of OpenAI's Whisper
Qwen-Image is a powerful image generation foundation model
Evaluation and Tracking for LLM Experiments
Aider is AI pair programming in your terminal
Industrial-level controllable zero-shot text-to-speech system
Use Microsoft Edge's online text-to-speech service from Python
"Big Model" trains a visual multimodal model (VLM) with 26M parameters
Tensor Learning in Python