Recognition and resolution of numbers, units, date/time, etc.
Text generator is a handy plugin for Obsidian
Provides line-oriented text file editing capabilities
Large Language Model Text Generation Inference
Oobabooga - The definitive Web UI for local AI, with powerful features
Open Source OCR Engine
Speech-to-text, text-to-speech, and speaker recognition
A GUI tool for extracting hard-coded subtitles (hardsubs) from videos
OCRmyPDF adds an OCR text layer to scanned PDF files
Generate audiobooks from EPUBs, PDFs and text with captions
Wan2.2: Open and Advanced Large-Scale Video Generative Model
A free, open source, and extensible speech-to-text application
Cross-platform software for text translation and recognition
Code for running inference and finetuning with the SAM 3 model
Readest is a modern, feature-rich ebook reader
Awesome multilingual OCR toolkits based on PaddlePaddle
Comprehensive Gradio WebUI for audio processing
Focus on prompting and generating
Speech Note Linux app. Note taking, reading and translating
Wan2.1: Open and Advanced Large-Scale Video Generative Model
Contexts Optical Compression
High-quality multi-lingual text-to-speech library by MyShell.ai
A text-to-speech, speech-to-text and speech-to-speech library
A Family of Open Sourced Music Foundation Models
Code for openai.fm, a demo for the OpenAI Speech API