GUI for a Vocal Remover that uses Deep Neural Networks
The most powerful and modular diffusion model GUI, API and backend
GUI Exploration Lab. One of the best GUI agent solutions
Image polygonal annotation with Python
A state-of-the-art open visual language model
Clone a voice in 5 seconds to generate arbitrary speech in real-time
Witness the aha moment of VLM with less than $3
Framework and no-code GUI for fine-tuning LLMs
UI-TARS-desktop version that can operate on your local personal device
Generate audiobooks from e-books
GUI/CLI tool for downloading Xiaohongshu
Convert AI papers to GUI
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles
Agent framework and applications built upon Qwen>=3.0
The AI toolkit for the AI developer
Generate audiobooks from e-books, voice cloning & 1107+ languages
Real-time behaviour synthesis with MuJoCo, using Predictive Control
Enable AI to control your desktop, mobile and HMI devices
A single Gradio + React WebUI with extensions for ACE-Step
Meta Agents Research Environments is a comprehensive platform
StreamSpeech is a seamless model for offline speech recognition
AI-powered tool for developers, simplifying coding tasks
A graphical manager for Ollama that can manage your LLMs
Graphical User Interface Face Anonymization Tool