An open-source text-to-speech system built by inverting Whisper
Implementation of Recurrent Interface Network (RIN)
A fast TTS architecture with conditional flow matching
Virtual AI anchor that combines state-of-the-art technology
Global weather forecasting model using graph neural networks and JAX
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model
MII makes low-latency and high-throughput inference possible
Qwen3-Omni is a natively end-to-end, omni-modal LLM
A Universal Customization Method for Single and Multi Conditioning
Flexible Photo Recrafting While Preserving Your Identity
Consistency Distilled Diff VAE
Run the Stable Diffusion releases in a Docker container
Plug-n-play module turning text-to-image models into animation
Di♪♪Rhythm: Blazingly Fast & Simple End-to-End Song Generation
Generate 3D objects conditioned on text or images
Chat-based assistant that understands tasks
Overcoming Data Limitations for High-Quality Video Diffusion Models
Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis
Let us control diffusion models
Run GGUF models easily with a UI or API. One File. Zero Install.
Numerical simulation code for solving transport equations in 1D/2D/3D
Official repo for consistency models
Multimodal AI Story Teller, built with Stable Diffusion, GPT, etc.
Official PyTorch Implementation of "Scalable Diffusion Models"
Basaran, an open-source alternative to the OpenAI text completion API