GUI for a Vocal Remover that uses Deep Neural Networks
Real-World Centric Foundation GUI Agents
The most powerful and modular diffusion model GUI, API and backend
GUI Exploration Lab. One of the best GUI agent solutions
A state-of-the-art open visual language model
Agent framework and applications built upon Qwen>=3.0
Framework and no-code GUI for fine-tuning LLMs
Convert AI papers to GUI
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
Clone a voice in 5 seconds to generate arbitrary speech in real-time
Witness the aha moment of VLM with less than $3
UI-TARS-desktop version that can operate on your local personal device
Generate audiobooks from e-books
Image polygonal annotation with Python
Generate audiobooks from e-books, voice cloning & 1107+ languages
AI-powered tool for developers, simplifying coding tasks
GUI/CLI tool for downloading Xiaohongshu
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Enable AI to control your desktop, mobile and HMI devices
The AI toolkit for the AI developer
StreamSpeech is a seamless model for offline speech recognition
Meta Agents Research Environments is a comprehensive platform
Real-time behaviour synthesis with MuJoCo, using Predictive Control