3D reconstruction software
Comprehensive Gradio WebUI for audio processing
OCRmyPDF adds an OCR text layer to scanned PDF files
Awesome multilingual OCR toolkits based on PaddlePaddle
Focus on prompting and generating
Generate audiobooks from EPUBs, PDFs and text with captions
TTS with Kokoro and ONNX Runtime
A lightweight approach to removing Google web service dependency
Flet enables developers to easily build realtime web and mobile apps
Open-Source Python3 tool for recognizing layouts, tables, and math
Rich is a Python library for rich text and beautiful formatting
The scientific Python development environment
Wan2.2: Open and Advanced Large-Scale Video Generative Model
SOTA Open Source TTS
Tokenizer-Free TTS for Multilingual Speech Generation
Official inference repo for FLUX.1 models
A simple native web interface that uses ChatTTS to synthesize text
Code for running inference and finetuning with SAM 3 model
OCR software, free and offline
AI bridge enabling assistants to control and automate the Unity Editor
Qwen3-TTS is an open-source series of TTS models
A robust, efficient, low-latency speech-to-text library
Wan2.1: Open and Advanced Large-Scale Video Generative Model
Robust Speech Recognition via Large-Scale Weak Supervision
Contexts Optical Compression