A simple, high-quality voice conversion tool focused on ease of use
A high-quality rapid TTS voice cloning model
Framework for building real-time multimodal voice AI agent apps
Persian NLP Toolkit
A library for converting HTML into PDFs using ReportLab
A general purpose syntax highlighter in pure Go
Free, high-quality text-to-speech API endpoint to replace OpenAI
A Model Context Protocol (MCP) server
Label Studio is a multi-type data labeling and annotation tool
Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model
Mozc - a Japanese Input Method Editor designed for multi-platform
A Powerful Native Multimodal Model for Image Generation
Framework for building real-time voice and multimodal AI agents
Compute distance between sequences
Industrial-level controllable zero-shot text-to-speech system
Spark-TTS Inference Code
A Unified Framework for Text-to-3D and Image-to-3D Generation
TextWorld is a sandbox learning environment for the training
Snippet solution for Vim
Speech-AI-Forge is a project developed around TTS generation models
Implementation of Phenaki Video, which uses MaskGIT
Easily compute CLIP embeddings and build a CLIP retrieval system
Tools for manipulating datasets
Handwritten Text Recognition (HTR) system implemented with TensorFlow
A simple native web interface that uses ChatTTS to synthesize text