A GUI tool for extracting hard-coded subtitles (hardsubs) from videos
Implementation of Make-A-Video, a new SOTA text-to-video generator
A gradio web UI for running Large Language Models like LLaMA
Implementation of Video Diffusion Models
A minimal implementation of diffusion models for text generation
Real-time face swap and one-click video deepfake
Real-time face swap for PC streaming or video calls
Image polygonal annotation with Python
Robust Speech Recognition via Large-Scale Weak Supervision
OCRmyPDF adds an OCR text layer to scanned PDF files
NVR with real-time local object detection for IP cameras
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)
Speech recognition module for Python
A deep learning toolkit for Text-to-Speech, battle-tested in research
Multimodal AI Story Teller, built with Stable Diffusion, GPT, etc.
Ready-to-use OCR with 80+ supported languages
Awesome multilingual OCR toolkits based on PaddlePaddle
GFPGAN aims at developing practical algorithms for real-world face restoration
Open source machine learning framework to automate text conversations
Open Source Document Management System for Digital Archives
Open source personal AI Assistant for Linux, Windows and Mac
Label Studio is a multi-type data labeling and annotation tool
One-click face swap
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline
Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis