Powerful AI language model (MoE) optimized for efficiency/performance
Image inpainting tool powered by SOTA AI models
Wan2.1: Open and Advanced Large-Scale Video Generative Model
Robust Speech Recognition via Large-Scale Weak Supervision
Awesome multilingual OCR toolkits based on PaddlePaddle
Qwen3 is the large language model series developed by the Qwen team
OCR software, free and offline
As little as 1 minute of voice data can be used to train a good TTS model
Web interface for generating images using Stable Diffusion models
YOLOv5 is the world's most loved vision AI
A high-throughput and memory-efficient inference and serving engine
A GUI tool for extracting hard-coded subtitles (hardsubs) from videos
Comprehensive Gradio WebUI for audio processing
Open source personal AI Assistant for Linux, Windows and Mac
NVR with realtime local object detection for IP cameras
Generate short videos with one click using an AI LLM
Synchronized Translation for Videos
Lightweight face recognition and facial attribute analysis
A Gradio web UI for running Large Language Models like LLaMA
A Python wrapper you can't refuse
Use Microsoft Edge's online text-to-speech service from Python
Ready-to-use OCR with 80+ supported languages
Chemcrow
EPUB to audiobook converter, optimized for Audiobookshelf
A simple, high-quality voice conversion tool focused on ease of use