OCRmyPDF adds an OCR text layer to scanned PDF files
MOSS-TTS-Nano is an open-source multilingual tiny speech generation
A nearly-live implementation of OpenAI's Whisper
A high-quality rapid TTS voice cloning model
A high-performance ML model serving framework that offers dynamic batching
A Python toolbox for scalable outlier detection
Supercharge Your LLM with the Fastest KV Cache Layer
Large-Scale Agentic RL for High-Performance CUDA Kernel Generation
The most powerful and modular diffusion model GUI, API, and backend
Fast Python collaborative filtering for implicit feedback datasets
Easy Docker setup for Stable Diffusion with user-friendly UI
Collection of reference environments for offline reinforcement learning
A Modular Simulation Framework and Benchmark for Robot Learning
An unsupervised and free tool for image and video dataset analysis
Implement a CPU from scratch and play with large model deployments
Self-host the powerful Chatterbox TTS model
Faster Whisper transcription with CTranslate2
Advanced Privacy-Preserving Federated Learning framework
3D reconstruction software
Collections of robotics environments
SAPIEN Manipulation Skill Framework
Fast and accurate AI-powered file content type detection
MemU is an open-source memory framework for AI companions
Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles
ChatGLM3 series: Open Bilingual Chat LLMs