Lightning-fast, on-device TTS, running natively via ONNX
Speech-AI-Forge is a project developed around TTS generation models
TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
Generate blog articles from video or audio
A fast TTS architecture with conditional flow matching
tiktoken is a fast BPE tokeniser for use with OpenAI's models
Synchronized Translation for Videos
Extract audio and video content and organize it into a Markdown note
StreamSpeech is a seamless model for offline speech recognition
Easy-to-use Python library for creating 2D arcade games
An Open Source text-to-speech system built by inverting Whisper
go1pylib is a Python library designed to control the Go1 robot
Towards Human-Sounding Speech
Interface for OuteTTS models
A Unified Framework for Text-to-3D and Image-to-3D Generation
Multimodal-Driven Architecture for Customized Video Generation
A Python library for extracting structured information
GenAI Processors is a lightweight Python library
Open-Sora: Democratizing Efficient Video Production for All
End-to-end speech processing toolkit
Implementation of the AudioLM audio generation model in PyTorch
TextWorld is a sandbox learning environment for the training
Python CLI utility and library for manipulating SQLite databases
Python Terminal Toolkit - a Spiced Up TUI Library
A TTS model capable of generating ultra-realistic dialogue