A gradio web UI for running Large Language Models like LLaMA
Python binding to the Apache Tika™ REST services
Provides line-oriented text file editing capabilities
Large Language Model Text Generation Inference
Recognition and resolution of numbers, units, date/time, etc.
NLP Cloud serves high-performance pre-trained or custom models for NER
Wan2.2: Open and Advanced Large-Scale Video Generative Model
Robust Speech Recognition via Large-Scale Weak Supervision
A GUI tool for extracting hard-coded subtitles (hardsubs) from videos
OCRmyPDF adds an OCR text layer to scanned PDF files
Focus on prompting and generating
Comprehensive Gradio WebUI for audio processing
Speech-to-text, text-to-speech, and speaker recognition
Stable Diffusion web UI
A deep learning toolkit for Text-to-Speech, battle-tested in research
Open-Sora: Democratizing Efficient Video Production for All
Ready-to-use OCR with 80+ supported languages
Qwen3 is the large language model series developed by the Qwen team
A Powerful Native Multimodal Model for Image Generation
Awesome multilingual OCR toolkits based on PaddlePaddle
Qwen3-Coder is the code version of Qwen3
Parse files for optimal RAG
Generating Immersive, Explorable, and Interactive 3D Worlds
Label Studio is a multi-type data labeling and annotation tool
Qwen-Image is a powerful image generation foundation model