LatentSync is an open-source framework from ByteDance that produces high-quality lip-synchronization for video using an audio-conditioned latent diffusion model, bypassing the intermediate motion representations used by traditional pipelines. Given a source video (with masked and reference frames) and an audio track, LatentSync directly generates frames whose lip motions and expressions align with the audio, producing convincing talking-head or animated lip-sync output. The model is built on a U-Net diffusion backbone that is conditioned on audio embeddings (from an audio encoder) and reference video frames via cross-attention, and it is trained with a set of losses (temporal, perceptual, and SyncNet-based) that enforce lip-sync accuracy, visual fidelity, and temporal consistency. Successive releases have improved temporal stability and lowered resource requirements, making inference more practical (roughly 8 GB of VRAM for earlier versions, somewhat more for the latest models).
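The conditioning idea can be summarized in a few lines of PyTorch: visual latent tokens act as queries that attend to audio embeddings through cross-attention, injecting the audio signal into the denoising U-Net. The module below is a minimal, illustrative sketch; the class, tensor shapes, and dimensions are assumptions for exposition, not LatentSync's actual implementation.

```python
# Minimal sketch (NOT the actual LatentSync code): audio-conditioning of visual
# latents via cross-attention, as used inside a diffusion U-Net block.
import torch
import torch.nn as nn


class CrossAttention(nn.Module):
    """Visual latent tokens (queries) attend to audio embeddings (keys/values)."""

    def __init__(self, dim: int, audio_dim: int, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(
            dim, heads, kdim=audio_dim, vdim=audio_dim, batch_first=True
        )

    def forward(self, visual_tokens: torch.Tensor, audio_tokens: torch.Tensor) -> torch.Tensor:
        out, _ = self.attn(visual_tokens, audio_tokens, audio_tokens)
        return visual_tokens + out  # residual conditioning on the audio signal


# Toy shapes (assumed): B clips, N flattened latent tokens of width 320,
# M audio tokens of width 384 from the audio encoder.
B, N, D, M, A = 2, 64, 320, 50, 384
visual_latents = torch.randn(B, N, D)     # noisy video latents for one denoising step
audio_embeddings = torch.randn(B, M, A)   # audio encoder output for the same window

conditioned = CrossAttention(D, A)(visual_latents, audio_embeddings)
print(conditioned.shape)  # torch.Size([2, 64, 320])
```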
Features
- End-to-end lip-sync generation: video frames updated to match input audio without explicit motion rigs
- Audio-conditioned latent diffusion model, integrating audio embeddings with visual latents for synchronized output
- Support for both real video and stylized/animated input — flexible for dubbing, avatars, animation, social-content creation
- Temporal-consistency optimization (via additional losses) to reduce jitter and flicker and keep motion smooth across frames (see the loss sketch after this list)
- Relatively modest inference requirements (roughly 8–20 GB of VRAM depending on version) for high-quality output
- Fully open-source: code, pretrained weights, and inference/training pipeline available for research or integration
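To make the temporal-consistency item concrete, the sketch below shows one generic form such a penalty can take: comparing frame-to-frame differences of the generated clip against those of the reference clip, which discourages flicker that per-frame losses alone would miss. The function name, loss form, and shapes are illustrative assumptions; LatentSync's published training objective differs in its exact formulation.

```python
# Illustrative temporal-consistency penalty (generic idea, not LatentSync's exact loss).
import torch
import torch.nn.functional as F


def temporal_consistency_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """pred, target: (batch, frames, channels, height, width) video tensors."""
    pred_delta = pred[:, 1:] - pred[:, :-1]        # motion between consecutive generated frames
    target_delta = target[:, 1:] - target[:, :-1]  # motion between consecutive reference frames
    return F.l1_loss(pred_delta, target_delta)


# Toy usage with random 16-frame clips.
pred = torch.randn(1, 16, 3, 64, 64)
target = torch.randn(1, 16, 3, 64, 64)
print(temporal_consistency_loss(pred, target).item())
```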