The Multi-Agent Framework
1 minute of voice data can also be used to train a good TTS model
Wan2.2: Open and Advanced Large-Scale Video Generative Model
A lightweight audio-to-MIDI converter with pitch bend detection
A community-supported supercharged version of paperless
Lightweight package to simplify LLM API calls
Official inference framework for 1-bit LLMs
Deepfakes Software For All
Comprehensive Gradio WebUI for audio processing
OpenDAN is an open source Personal AI OS
An open source implementation of CLIP
ChatGPT extension for scientific research work
Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass
High-Quality Voice Cloning TTS for 600+ Languages
An open source implementation of NotebookLM with more flexibility
Stable Diffusion WebUI optimized for AMD GPUs with editing tools
Generate audiobooks from e-books, with voice cloning and support for 1107+ languages
Automatically translates the text of a video based on a subtitle file
Offline text-to-speech synthesis for Python
One-click deployment (including offline integration package)
Powerful tool that lets you create and run intelligent agents
Run a full local LLM stack with one command using Docker
A command-line productivity tool powered by AI large language models
Python library for defining and optimizing mathematical expressions
Modular quant framework