Real-time face swap and one-click video deepfake
State-of-the-art 2D and 3D Face Analysis Project
A Family of Open-Source Music Foundation Models
A simple, high-quality voice conversion tool focused on ease of use
Instant voice cloning by MIT and MyShell. Audio foundation model
Code for running inference and finetuning with SAM 3 model
Core ML tools contain supporting tools for Core ML model conversion
Use Microsoft Edge's online text-to-speech service from Python
A high-performance ML model serving framework that offers dynamic batching
Synchronized Translation for Videos
Awesome multilingual OCR toolkits based on PaddlePaddle
Deep learning library
Lazy Predict helps build a lot of basic models without much code
Let's make video diffusion practical
Ready-to-use OCR with 80+ supported languages
Generate audiobooks from e-books, voice cloning & 1107+ languages
A command-line productivity tool powered by AI large language models
A community-supported supercharged version of paperless
Toloka-Kit is a Python library for working with Toloka API
Official inference repo for FLUX.2 models
Datasets, transforms and models specific to Computer Vision
EPUB to audiobook converter, optimized for Audiobookshelf
The official repo of Qwen chat & pretrained large language model
An open-source toolkit for monitoring Large Language Models (LLMs)
⚡ Building applications with LLMs through composability ⚡