Screenshots, word marking, OCR, AI, translation software
Canvas-based WYSIWYG rich text editor with advanced layout tools
Multimodal-Driven Architecture for Customized Video Generation
ComfyUI wrapper nodes for HunyuanVideo
The pluggable natural language linter for text and markdown
Unsupervised text tokenizer for Neural Network-based text generation
Generating Immersive, Explorable, and Interactive 3D Worlds
Tokenizer-Free TTS for Multilingual Speech Generation
A robust, efficient, low-latency speech-to-text library
A Family of Open Sourced Music Foundation Models
Handwritten Text Recognition (HTR) system implemented with TensorFlow
Lightning-fast, on-device TTS, running natively via ONNX
JavaScript OCR and text extraction for images and PDFs
State-of-the-art (SoTA) text-to-video pre-trained model
Cross-platform AI language practice app
Generate audiobooks from e-books
Official MiniMax Model Context Protocol (MCP) server
Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model
Unified web UI for training and running open models locally
A persistent, network-resilient, full-text search library
Super expressive prompting model based on ltx2.3
A multimodal model for brain response prediction
TextWorld is a sandbox learning environment for the training
Diffusion Bee is the easiest way to run Stable Diffusion locally
Faster Whisper transcription with CTranslate2