Speech recognition module for Python
Nexa SDK is a comprehensive toolkit supporting ONNX and GGML models
Generate audiobooks from EPUBs, PDFs and text with captions
Official inference repo for FLUX.2 models
Official MiniMax Model Context Protocol (MCP) server
Wan2.2: Open and Advanced Large-Scale Video Generative Model
Web interface for generating images using Stable Diffusion models
The Python library for real-time communication
State-of-the-art TTS model under 25MB
Offline inference engine for art, real-time voice conversations
A generative speech model for daily dialogue
Converts text to speech in real time
State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX
Wan2.1: Open and Advanced Large-Scale Video Generative Model
Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles
MTEB: Massive Text Embedding Benchmark
Clone a voice in 5 seconds to generate arbitrary speech in real-time
Generating Immersive, Explorable, and Interactive 3D Worlds
The most accurate natural language detection library for Python
Audiocraft is a library for audio processing and generation
Persian NLP Toolkit
TTS with Kokoro and ONNX Runtime
A Powerful Native Multimodal Model for Image Generation
The behavior guidance framework for customer-facing LLM agents
Stanford NLP Python library for many human languages