Offline text-to-speech synthesis for Python
Video player for improving quality of hand-drawn images
Framework for building real-time voice and multimodal AI agents
Automatic Speech Recognition with Word-level Timestamps
Converts text to speech in real time
Use Microsoft Edge's online text-to-speech service from Python
Generate blog articles from video or audio
Sample code and notebooks for Generative AI on Google Cloud
Free, high-quality text-to-speech API endpoint to replace OpenAI
Multimodal-Driven Architecture for Customized Video Generation
Voice Recognition to Text Tool
MARS5 speech model (TTS) from CAMB.AI
Label Studio is a multi-type data labeling and annotation tool
Unified web UI for training and running open models locally
Build multimodal language agents for fast prototyping and production
A high-quality rapid TTS voice cloning model
A fast TTS architecture with conditional flow matching
Controllable & emotion-expressive zero-shot TTS
The most powerful and modular diffusion model GUI, API and backend
An open-source implementation of NotebookLM with more flexibility
A Python tool that uses GPT-4, FFmpeg, and OpenCV
Open source AI wearable platform for recording and summarizing speech
GenAI Processors is a lightweight Python library
Automatically translates the text of a video based on a subtitle file
A TTS model capable of generating ultra-realistic dialogue