HunyuanVideo-Avatar
Tencent-Hunyuan
|
ModelScope
Alibaba Cloud
|
|||||
About
HunyuanVideo-Avatar animates an input avatar image into a high-dynamic, emotion-controllable video driven by audio. It is a multimodal diffusion transformer (MM-DiT)-based model that can generate dynamic, emotion-controllable, multi-character dialogue videos. It accepts avatar inputs in multiple styles (photorealistic, cartoon, 3D-rendered, anthropomorphic) at arbitrary scales, from portrait to full body. It provides three components: a character image injection module that ensures strong character consistency while enabling dynamic motion; an Audio Emotion Module (AEM) that extracts emotional cues from a reference image to enable fine-grained emotion control over the generated video; and a Face-Aware Audio Adapter (FAA) that isolates audio influence to specific face regions via latent-level masking, supporting independent audio-driven animation in multi-character scenarios.
|
About
This model is a multi-stage text-to-video generation diffusion model: it takes a text description as input and returns a video that matches the description. Only English input is supported.
The text-to-video generation diffusion model consists of three sub-networks: text feature extraction, a diffusion model from text features to the video latent space, and a mapping from the video latent space to the video visual space. The model has about 1.7 billion parameters in total. The diffusion model adopts a Unet3D structure and generates video through an iterative denoising process that starts from pure Gaussian noise.
|
|||||
Platforms Supported
Windows
Mac
Linux
Cloud
On-Premises
iPhone
iPad
Android
Chromebook
|
Platforms Supported
Windows
Mac
Linux
Cloud
On-Premises
iPhone
iPad
Android
Chromebook
|
|||||
Audience
Researchers and developers in AI-driven animation looking for a tool to generate emotion‑aligned, multi-character audio‑driven avatar videos
|
Audience
Users interested in an open source text-to-video AI video generation model
|
|||||
Support
Phone Support
24/7 Live Support
Online
|
Support
Phone Support
24/7 Live Support
Online
|
|||||
API
Offers API
|
API
Offers API
|
|||||
Pricing
Free
Free Version
Free Trial
|
Pricing
Free
Free Version
Free Trial
|
|||||
Training
Documentation
Webinars
Live Online
In Person
|
Training
Documentation
Webinars
Live Online
In Person
|
|||||
Company Information
Tencent-Hunyuan
United States
github.com/Tencent-Hunyuan/HunyuanVideo-Avatar
|
Company Information
Alibaba Cloud
China
modelscope.cn/
|
|||||
Integrations
01.AI
CodeQwen
GLM-4.5
Gradio
Qwen
Qwen-7B
Qwen-Image
Qwen2
Qwen2-VL
Qwen2.5
|
Integrations
01.AI
CodeQwen
GLM-4.5
Gradio
Qwen
Qwen-7B
Qwen-Image
Qwen2
Qwen2-VL
Qwen2.5
|
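The ModelScope entry in the About row describes three stages: text feature extraction, diffusion in a video latent space, and a mapping from latent space to visual space, with video produced by iterative denoising from pure Gaussian noise. The toy sketch below only illustrates that data flow. Every function name, tensor shape, and step count here is a hypothetical stand-in chosen for illustration; none of it is ModelScope's code or API, and the "noise predictor" is a trivial formula rather than a trained Unet3D.

```python
import numpy as np


def encode_text(prompt, dim=16):
    """Stage 1 (hypothetical stub): turn a prompt into a fixed-size feature vector."""
    features = np.zeros(dim)
    for i, ch in enumerate(prompt):
        features[i % dim] += ord(ch) / 255.0
    return features


def predict_noise(latent, text_features):
    """Stand-in for the Unet3D noise predictor (assumption, not a trained model).

    It treats everything that differs from a text-dependent constant as noise.
    """
    target = np.full(latent.shape, text_features.mean())
    return latent - target


def denoise_latent(text_features, shape, steps=25, seed=0):
    """Stage 2: start from pure Gaussian noise and remove predicted noise step by step."""
    rng = np.random.default_rng(seed)
    latent = rng.standard_normal(shape)
    for step in range(steps):
        remaining = steps - step
        # Remove a growing fraction of the predicted noise; the last step removes all of it.
        latent = latent - predict_noise(latent, text_features) / remaining
    return latent


def decode_latent(latent):
    """Stage 3 (hypothetical stub): map latent frames to pixels by 2x upsampling and clipping."""
    frames = np.repeat(np.repeat(latent, 2, axis=1), 2, axis=2)
    return np.clip(frames, 0.0, 1.0)


def generate_video(prompt, frames=8, height=4, width=4, channels=3):
    text_features = encode_text(prompt)
    latent = denoise_latent(text_features, (frames, height, width, channels))
    return decode_latent(latent)


video = generate_video("a panda eating bamboo")
print(video.shape)  # (8, 8, 8, 3): frames, height, width, channels
```

In a real system each stub would be a learned network and the update rule would follow a proper noise schedule; the loop above only shows why the process is described as iterative denoising from Gaussian noise.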