A Gradio web UI for running Large Language Models like LLaMA
Python binding to the Apache Tika™ REST services
Provides line-oriented text file editing capabilities
Large Language Model Text Generation Inference
Recognition and resolution of numbers, units, date/time, etc.
NLP Cloud serves high-performance pre-trained or custom models for NER
Focus on prompting and generating
Robust Speech Recognition via Large-Scale Weak Supervision
A GUI tool for extracting hard-coded subtitles (hardsubs) from videos
OCRmyPDF adds an OCR text layer to scanned PDF files
Comprehensive Gradio WebUI for audio processing
Stable Diffusion web UI
Speech-to-text, text-to-speech, and speaker recognition
Ready-to-use OCR with 80+ supported languages
A deep learning toolkit for Text-to-Speech, battle-tested in research
Open-Sora: Democratizing Efficient Video Production for All
Web interface for generating images using Stable Diffusion models
Nexa SDK is a comprehensive toolkit for supporting ONNX and GGML
InvokeAI is a leading creative engine for Stable Diffusion models
Library for OCR-related tasks powered by Deep Learning
Open Source Document Management System for Digital Archives
Awesome multilingual OCR toolkits based on PaddlePaddle
Label Studio is a multi-type data labeling and annotation tool
Speech recognition module for Python
Models for the spaCy Natural Language Processing (NLP) library