Python binding to the Apache Tika™ REST services
The official Python SDK for the ElevenLabs API
Recognition and resolution of numbers, units, date/time, etc.
Provides line-oriented text file editing capabilities
Large Language Model Text Generation Inference
A Gradio web UI for running Large Language Models like LLaMA
NLP Cloud serves high-performance pre-trained or custom models for NER
Parse files for optimal RAG
Contexts Optical Compression
High-quality multi-lingual text-to-speech library by MyShell.ai
OCRmyPDF adds an OCR text layer to scanned PDF files
A GUI tool for extracting hard-coded subtitles (hardsubs) from videos
File Parser optimised for LLM Ingestion with no loss
Code for running inference and finetuning with the SAM 3 model
A robust, efficient, low-latency speech-to-text library
Awesome multilingual OCR toolkits based on PaddlePaddle
Speech-to-text, text-to-speech, and speaker recognition
Comprehensive Gradio WebUI for audio processing
Python library and CLI tool to interface with Google Translate
Focus on prompting and generating
Python implementation of TextRank algorithms
Offline text-to-speech synthesis for Python
A text-to-speech, speech-to-text, and speech-to-speech library
Robust Speech Recognition via Large-Scale Weak Supervision
Use Microsoft Edge's online text-to-speech service from Python