Vision Agents

Vision Agents is an open-source Python framework for building real-time voice and video AI agents. It is designed for applications that need to watch, listen, understand, and respond with very low latency. The framework can combine vision models, speech models, LLMs, and real-time transport providers into one agent workflow. It supports use cases such as live coaching, telehealth, customer support, security monitoring, interactive video assistants, and voice-controlled tools. Vision Agents is model-agnostic, so developers can connect providers such as OpenAI, Gemini, Claude, Hugging Face, YOLO, and Roboflow. Its main value is that it gives developers a flexible foundation for multimodal agents that operate on live audio and video rather than on static prompts alone.

Features

Real-time voice and video AI agents
Python framework for multimodal workflows
Model-agnostic provider integration
WebRTC and edge-network support
Vision, speech, and LLM pipelines
Low-latency interactive agent design
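
To make "model-agnostic provider integration" concrete, here is a minimal Python sketch of the general pattern. It does not use the Vision Agents API. Every name in it (SpeechToText, VisionModel, LanguageModel, Agent, and the stub providers) is hypothetical and chosen only for illustration. The stubs stand in for real speech, vision, and LLM services so the example runs without network access.

```python
from dataclasses import dataclass
from typing import Protocol


# Hypothetical provider interfaces; these are NOT Vision Agents classes.
class SpeechToText(Protocol):
    def transcribe(self, audio: bytes) -> str: ...


class VisionModel(Protocol):
    def describe(self, frame: bytes) -> str: ...


class LanguageModel(Protocol):
    def reply(self, prompt: str) -> str: ...


@dataclass
class Agent:
    """Combines one provider of each kind into a single workflow step."""

    stt: SpeechToText
    vision: VisionModel
    llm: LanguageModel

    def step(self, audio: bytes, frame: bytes) -> str:
        heard = self.stt.transcribe(audio)    # speech -> text
        seen = self.vision.describe(frame)    # video frame -> description
        prompt = f"User said: {heard}\nCamera shows: {seen}"
        return self.llm.reply(prompt)         # text -> response


# Stub providers standing in for real services (assumptions for the demo).
class EchoSTT:
    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")


class FrameSizeVision:
    def describe(self, frame: bytes) -> str:
        return f"a frame of {len(frame)} bytes"


class TemplateLLM:
    def reply(self, prompt: str) -> str:
        return "Reply to -> " + prompt.replace("\n", " | ")


agent = Agent(stt=EchoSTT(), vision=FrameSizeVision(), llm=TemplateLLM())
result = agent.step(b"how is my form?", b"\x00" * 1024)
print(result)
```

Because Agent depends only on the three interfaces, any one provider can be swapped without touching the others, which is the property the "model-agnostic" feature describes. A real-time agent would also stream audio and video over a transport such as WebRTC; that part is left out of this sketch.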

License

Apache License V2.0

Additional Project Details

Programming Language

Python

Related Categories

Python Agentic AI Tool

Registered

2026-06-09

Open Vision Agents by Stream. Build voice and vision agents quickly.
