A framework to enable multimodal models to operate a computer
AI-powered code generation tool for building web apps from scratch
Gracefully face hCaptcha challenges with multimodal LLMs
Generative AI reference workflows
Advancing Open-source World Models
GUI/CLI tool for downloading Xiaohongshu
Implement a CPU from scratch and play with large model deployments
A course on LLM inference serving on Apple Silicon
Free, high-quality text-to-speech API endpoint to replace OpenAI
Revolutionizes the way users interact with AutoGen
Intelligent automation and multi-agent orchestration for Claude Code
Open-source async coding agent that plans, codes, and opens PRs
Open platform for building, deploying, and managing LLM agents
Evaluate your LLM's responses with Prometheus and GPT-4
Extension of Google Research’s PaperBanana
Skills Catalog for Codex
A feature-rich Discord Modmail bot
UI-TARS-desktop version that can operate on your local personal device
Learn to build your Second Brain AI assistant with LLMs
One-click deployment (including offline integration package)
Build reliable Gen AI solutions without overhead
Memory Management Kit for Agents
"Big Model": train a visual multimodal VLM with 26M parameters
Build Vision Agents quickly with any model or video provider
AI agent designed to interact with ROS1- and ROS2-based robotics systems