Automate browser-based workflows with LLMs and Computer Vision
Open-source MCP server that gives your coding agent
Open multimodal web agent built by Ai2
A sound cloning tool with a web interface, using your voice
Self-host the powerful Chatterbox TTS model
Tools like a web browser, computer access, and a code runner for LLMs
Library for OCR-related tasks powered by Deep Learning
Stable Diffusion WebUI optimized for AMD GPUs with editing tools
The most reliable AI agent framework that supports MCP
Automate native Android apps with AI using accessibility APIs
Qwen3-Coder is the code version of Qwen3
Fast-stable-diffusion + DreamBooth
Multi-user UI tool for managing and running Stable Diffusion workflows
LinkedIn Automation Tool
A fast TTS architecture with conditional flow matching
A meta-harness for all your AI agents
AI tool converting video/audio into structured documents instantly
Context-aware desktop AI assistant that understands screen content
Stable Diffusion web UI
AI tool for real-time monitoring and analysis of Goofish listings
Gracefully face hCaptcha challenges with multimodal LLMs
Use Claude Code's agent loop with DeepSeek V4 Pro, OpenRouter & more
Get started with building Fullstack Agents using Gemini 2.5 & LangGraph
Your Personal AI Assistant; easy to install, deploy locally or in the cloud
Python SDK for the Computer Use model Lux, developed by OpenAGI