Real-time face swap and one-click video deepfake
GUI for a Vocal Remover that uses Deep Neural Networks
State-of-the-art 2D and 3D Face Analysis Project
The most powerful and modular diffusion model GUI, API and backend
Focus on prompting and generating
Stable Diffusion web UI
Advanced language and coding AI model
Wan2.2: Open and Advanced Large-Scale Video Generative Model
Agentic, Reasoning, and Coding (ARC) foundation models
OCRmyPDF adds an OCR text layer to scanned PDF files
GLM-4.5: Open-source LLM for intelligent agents by Z.ai
Code for running inference and fine-tuning with the SAM 3 model
Run Local LLMs on Any Device. Open-source
Robust Speech Recognition via Large-Scale Weak Supervision
Qwen3 is the large language model series developed by the Qwen team
Powerful AI language model (MoE) optimized for efficiency/performance
Awesome multilingual OCR toolkits based on PaddlePaddle
Lightweight Face Recognition and Facial Attribute Analysis
Wan2.1: Open and Advanced Large-Scale Video Generative Model
Comprehensive Gradio WebUI for audio processing
One minute of voice data can also be used to train a good TTS model
Web interface for generating images using Stable Diffusion models
Open-source, high-performance AI model with advanced reasoning
A high-throughput and memory-efficient inference and serving engine
Official inference repo for FLUX.2 models