A high-quality rapid TTS voice cloning model
A simple, high-quality voice conversion tool focused on ease of use
Persian NLP Toolkit
A library for converting HTML into PDFs using ReportLab
Framework for building realtime multimodal voice AI agent apps
A Model Context Protocol (MCP) server
A general purpose syntax highlighter in pure Go
Free, high-quality text-to-speech API endpoint to replace OpenAI
Mozc - a Japanese Input Method Editor designed for multiple platforms
A Powerful Native Multimodal Model for Image Generation
Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model
Label Studio is a multi-type data labeling and annotation tool
Compute distance between sequences
Framework for building real-time voice and multimodal AI agents
Snippet solution for Vim
Industrial-level controllable zero-shot text-to-speech system
TextWorld is a sandbox learning environment for the training
Spark-TTS Inference Code
A Unified Framework for Text-to-3D and Image-to-3D Generation
Implementation of Phenaki Video, which uses MaskGIT
Speech-AI-Forge is a project developed around TTS generation models
Tools for manipulating datasets
Easily compute CLIP embeddings and build a CLIP retrieval system
Handwritten Text Recognition (HTR) system implemented with TensorFlow
A simple native web interface that uses ChatTTS to synthesize text