Focus on prompting and generating
ChemCrow
As little as 1 minute of voice data can be used to train a good TTS model
The most powerful local music generation model
Long-form streaming TTS system for multi-speaker dialogue generation
High-Resolution 3D Assets Generation with Large Scale Diffusion Models
Wan2.1: Open and Advanced Large-Scale Video Generative Model
A Systematic Framework for Interactive World Modeling
From Images to High-Fidelity 3D Assets
AI-Researcher: Autonomous Scientific Innovation
Inference code for CodeLlama models
Offline text-to-speech synthesis for Python
Industrial-level controllable zero-shot text-to-speech system
High-quality multi-lingual text-to-speech library by MyShell.ai
Controllable & emotion-expressive zero-shot TTS
Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles
Stable Virtual Camera: Generative View Synthesis with Diffusion Models
TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
A generative speech model for daily dialogue
The behavior guidance framework for customer-facing LLM agents
Synchronized Translation for Videos
State-of-the-art TTS model under 25MB
TTS with Kokoro and ONNX Runtime
PyTorch3D is FAIR's library of reusable components for deep learning with 3D data
Qwen3-TTS is an open-source series of TTS models