Flexible Photo Recrafting While Preserving Your Identity
Multimodal-Driven Architecture for Customized Video Generation
Instant voice cloning by MIT and MyShell. Audio foundation model
Qwen-Image is a powerful image generation foundation model
Official inference repo for FLUX.2 models
Open-source, code-first Python toolkit for building, evaluating, etc.
A Unified Framework for Image Customization
Bindu: Turn any AI agent into a living microservice
One-stop solution for creating your digital avatar from chat history
Personalize Any Characters with a Scalable Diffusion Transformer
The common language for platforms, agents and businesses.
A Customizable Image-to-Video Model based on HunyuanVideo
A Universal Customization Method for Single and Multi Conditioning
Pushing the Frontier of Long Audio-Visual Generation
AI agent microservice
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
Interface for OuteTTS models
MARS5 speech model (TTS) from CAMB.AI
ComfyUI integration for Microsoft's VibeVoice text-to-speech model
Implementation of Make-A-Video, a new SOTA text-to-video generator
FaceOnLive Open KYC: Streamlining Identity Verification with AI
CoTracker is a model for tracking any point (pixel) in a video
Elegant, modern and asynchronous Telegram MTProto API framework
A framework for autonomous economic agent (AEA) development
Demo for "Talking Head Anime from a Single Image"