A fast TTS architecture with conditional flow matching
A Powerful Native Multimodal Model for Image Generation
Virtual AI anchor that combines state-of-the-art technology
A Universal Customization Method for Single and Multi Conditioning
Flexible Photo Recrafting While Preserving Your Identity
An open-source text-to-speech system built by inverting Whisper
C++ inference library for multiple SVC/TTS models
Consistency Distilled Diff VAE
Run the Stable Diffusion releases in a Docker container
Di♪♪Rhythm: Blazingly Fast & Simple End-to-End Song Generation
Generate 3D objects conditioned on text or images
Plug-n-play module turning text-to-image models into animation
Chat-based assistant that understands tasks
Overcoming Data Limitations for High-Quality Video Diffusion Models
View, extract & remove AI generation metadata with a right click
Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis
Let us control diffusion models
Run GGUF models easily with a UI or API. One File. Zero Install.
Multimodal AI Story Teller, built with Stable Diffusion, GPT, etc.
Official repo for consistency models
Official PyTorch Implementation of "Scalable Diffusion Models"
Basaran, an open-source alternative to the OpenAI text completion API
A converter for seamless transformation of files, data, and media ...
Supercharged experience for multiple models such as ChatGPT
Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion