Stable Diffusion with Core ML on Apple Silicon
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
Repo of Qwen2-Audio chat & pretrained large audio language model
Qwen2.5-VL is a multimodal large language model series
High-Resolution 3D Assets Generation with Large Scale Diffusion Models
ChatGLM-6B: An Open Bilingual Dialogue Language Model
A Powerful Native Multimodal Model for Image Generation
Qwen-Image is a powerful image generation foundation model
Block Diffusion for Ultra-Fast Speculative Decoding
Code for running inference with the SAM 3D Body Model 3DB
Unified Multimodal Understanding and Generation Models
Open-source large language model family from Tencent Hunyuan
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
A series of math-specific large language models of our Qwen2 series
Implementation of the Surya Foundation Model for Heliophysics
Capable of understanding text, audio, vision, video
Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
Di♪♪Rhythm: Blazingly Fast & Simple End-to-End Song Generation
Powerful open source image generation model
A Conversational Speech Generation Model
AI Suite for upscaling, interpolating & restoring images/videos
Implementation of model parallel autoregressive transformers on GPUs