Accurate × Fast × Comprehensive
A SOTA open-source image editing model
Code for running inference with the SAM 3D Body Model 3DB
Project Lyra: Open Generative 3D World Models
Multi-modal large language model designed for audio understanding
OCR expert VLM powered by Hunyuan's native multimodal architecture
Python bindings for llama.cpp
Contexts Optical Compression
Sharp Monocular Metric Depth in Less Than a Second
A state-of-the-art open visual language model
LLM-based Reinforcement Learning audio edit model
An Efficient Agentic Model for Computer Use
An AI-powered security review GitHub Action using Claude
A Customizable Image-to-Video Model based on HunyuanVideo
State-of-the-art TTS model under 25MB
Open Source Speech Language Model
A Unified Framework for Text-to-3D and Image-to-3D Generation
Multimodal Diffusion with Representation Alignment
Qwen3-omni is a natively end-to-end, omni-modal LLM
HY-Motion model for 3D character animation generation
PyTorch code and models for the DINOv2 self-supervised learning
A series of math-specific large language models of our Qwen2 series
Pretrained time-series foundation model developed by Google Research
Audio foundation model excelling in audio understanding
Open-source deep-learning framework