A GUI Agent app based on UI-TARS to control your computer using AI
UI-TARS-desktop version that can operate on your local personal device
Real-World Centric Foundation GUI Agents
Framework and no-code GUI for fine-tuning LLMs
Free, local, open-source Cowork for Gemini CLI, Claude Code, Codex
Agent framework and applications built upon Qwen>=3.0
Generate audiobooks from e-books, voice cloning & 1107+ languages
An open-source end-to-end VLM-based GUI Agent
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Qwen3-VL, the multimodal large language model series by Alibaba Cloud
Agent S: an open agentic framework that uses computers like a human
A state-of-the-art open visual language model
Real-time behaviour synthesis with MuJoCo, using Predictive Control
All-in-one web-based IDE specialized for machine learning
A forensic file identification tool using neural networks
A low-code unified framework for computer vision and deep learning