Recognition and resolution of numbers, units, date/time, etc.
Text generator is a handy plugin for Obsidian
Provides line-oriented text file editing capabilities
Oobabooga - The definitive Web UI for local AI, with powerful features
Large Language Model Text Generation Inference
Open Source OCR Engine
Speech-to-text, text-to-speech, and speaker recognition
OCRmyPDF adds an OCR text layer to scanned PDF files
Wan2.2: Open and Advanced Large-Scale Video Generative Model
A free, open source, and extensible speech-to-text application
Code for running inference and finetuning with SAM 3 model
Cross-platform software for text translation and recognition
Readest is a modern, feature-rich ebook reader
Generate audiobooks from EPUBs, PDFs and text with captions
Awesome multilingual OCR toolkits based on PaddlePaddle
Speech Note Linux app. Note taking, reading and translating
Focus on prompting and generating
Code for openai.fm, a demo for the OpenAI Speech API
Contexts Optical Compression
High-quality multi-lingual text-to-speech library by MyShell.ai
Wan2.1: Open and Advanced Large-Scale Video Generative Model
Comprehensive Gradio WebUI for audio processing
Qwen3-TTS is an open-source series of TTS models
A Family of Open-Source Music Foundation Models
A text-to-speech, speech-to-text and speech-to-speech library