The most powerful and modular diffusion model GUI, API and backend
3D reconstruction software
Awesome multilingual OCR toolkits based on PaddlePaddle
Industrial-level controllable zero-shot text-to-speech system
Even 1 minute of voice data can be used to train a good TTS model
Comprehensive Gradio WebUI for audio processing
Open Source Document Management System for Digital Archives
Synchronized Translation for Videos
AI-powered video clipping and highlight generation
GPU environment management and cluster orchestration
Uncover insights, surface problems, monitor, and fine-tune your LLM
Game Boy emulator written in Python
A program that can do anything to earn money without human operators
AlphaFold 3 inference pipeline
Improve human sleep through scientifically
Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible
RGBD video generation model conditioned on camera input
An Open Source text-to-speech system built by inverting Whisper
A community-supported supercharged version of paperless
Code to accompany "A Method for Animating Children's Drawings"
Interact with your documents using the power of GPT
Models for the spaCy Natural Language Processing (NLP) library
Instant voice cloning by MIT and MyShell. Audio foundation model
From Images to High-Fidelity 3D Assets
Build applications that make decisions. Chatbots, agents, simulations