High-performance inference server for text embedding models (API layer)
Module for automatic summarization of text documents and HTML pages
Large Language Model Text Generation Inference
Hypernetworks that adapt LLMs for specific benchmark tasks
AI tool that removes hardcoded subtitles and text from videos locally
A GUI tool for extracting hard-coded subtitles (hardsubs) from videos
Awesome multilingual OCR toolkits based on PaddlePaddle
TTS with Kokoro and ONNX Runtime
Wan2.1: Open and Advanced Large-Scale Video Generative Model
Official MiniMax Model Context Protocol (MCP) server
Offline inference engine for art, real-time voice conversations
A simple, high-quality voice conversion tool focused on ease of use
A robust, efficient, low-latency speech-to-text library
EPUB to audiobook converter, optimized for Audiobookshelf
Wan2.2: Open and Advanced Large-Scale Video Generative Model
Speech-AI-Forge is a project developed around TTS generation models
Official inference repo for FLUX.1 models
High-Quality Voice Cloning TTS for 600+ Languages
Tokenizer-Free TTS for Multilingual Speech Generation
A simple native web interface that uses ChatTTS to synthesize speech from text
Multimodal-Driven Architecture for Customized Video Generation
ComfyUI wrapper nodes for HunyuanVideo
Generating Immersive, Explorable, and Interactive 3D Worlds
Handwritten Text Recognition (HTR) system implemented with TensorFlow
State-of-the-art (SoTA) text-to-video pre-trained model