Multimodal Diffusion with Representation Alignment
Accurate × Fast × Comprehensive
HY-Motion model for 3D character animation generation
PyTorch code and models for the DINOv2 self-supervised learning method
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
tiktoken is a fast BPE tokeniser for use with OpenAI's models
A Customizable Image-to-Video Model based on HunyuanVideo
A Systematic Framework for Interactive World Modeling
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
Unified Multimodal Understanding and Generation Models
Math-specific large language models in our Qwen2 series
A SOTA open-source image editing model
State-of-the-art (SoTA) text-to-video pre-trained model
An Efficient Agentic Model for Computer Use
Repo for SeedVR2 & SeedVR
Programmatic access to the AlphaGenome model
A 26M function-calling model that runs on incredibly small devices
Qwen3-ASR is an open-source series of ASR models
Implementation of "MobileCLIP" (CVPR 2024)
Video understanding codebase from FAIR for reproducing video models
Tool for exploring and debugging transformer model behaviors
Multimodal-Driven Architecture for Customized Video Generation
Project Lyra: Open Generative 3D World Models
Achieving over 3x generation speedup on reasoning tasks
Ultra-Efficient LLMs on End Device