Port of OpenAI's Whisper model in C/C++
Speech-to-text, text-to-speech, and speaker recognition
LLM Frontend for Power Users
The most powerful and modular diffusion model GUI, API, and backend
3D reconstruction software
Open source machine learning framework
2^x Image Super-Resolution
Local-first AI Notepad for Private Meetings
Awesome multilingual OCR toolkits based on PaddlePaddle
One minute of voice data is enough to train a good TTS model
Industrial-level controllable zero-shot text-to-speech system
Synchronized Translation for Videos
Autonomous agents for everyone
An open-source, modern-design AI chat framework
Lightweight coding agent that runs in your terminal
Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible
Modular quant framework
RGBD video generation model conditioned on camera input
Reproduction of Poetiq's record-breaking submission to the ARC-AGI-1
A simple, high-quality voice conversion tool focused on ease of use
A free, open source, and extensible speech-to-text application
Open source personal AI Assistant for Linux, Windows and Mac
Asynchronous multi-platform robot framework written in Python
Generate short videos with one click using an LLM
Comprehensive Gradio WebUI for audio processing