This repository demonstrates how to build low-latency, streaming “voice + chat” agents by combining OpenAI’s Realtime API with the OpenAI Agents SDK. The demo shows patterns for connecting a realtime voice stream (audio in and out) to agents that can use tools, maintain state, and orchestrate multi-agent workflows. The SDK provides abstractions for agent orchestration, event handling, handoffs, state management, and guardrails, tailored to realtime conversational systems. The demo includes a Next.js frontend for browser interaction and likely a backend component that orchestrates realtime sessions and agent logic. It also supports a “Chat-Supervisor” pattern, in which a lightweight realtime chat agent handles user interactions and delegates more complex reasoning or tool usage to a stronger text model (e.g. GPT-4). Because realtime agents are still a beta feature, the code and API surface are subject to change.
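The Chat-Supervisor delegation described above can be illustrated with a minimal sketch. It does not use the Agents SDK or the Realtime API: `Turn`, `Responder`, `chooseRoute`, `handleTurn`, and the 20-word threshold are hypothetical names and values chosen for illustration, and the delegation rule (delegate when a tool is needed or the request is long) is an assumed heuristic rather than the demo's actual logic. Real model calls would be asynchronous; the responders here are synchronous stand-ins to keep the example short.

```typescript
// Hypothetical sketch of the Chat-Supervisor pattern.
// None of these names come from the Agents SDK or the Realtime API.

type Route = "realtime" | "supervisor";

interface Turn {
  text: string;       // transcribed user utterance or typed message
  needsTool: boolean; // assumed flag: the turn requires a tool call
}

// Stand-in for a model call: takes the user text and returns a reply.
type Responder = (text: string) => string;

// Assumed heuristic: delegate when a tool is needed or the request is long.
function chooseRoute(turn: Turn, maxRealtimeWords: number = 20): Route {
  const words = turn.text.trim().split(/\s+/).filter(Boolean);
  if (turn.needsTool || words.length > maxRealtimeWords) {
    return "supervisor";
  }
  return "realtime";
}

// The lightweight agent answers directly unless the turn is delegated.
function handleTurn(
  turn: Turn,
  realtime: Responder,
  supervisor: Responder
): { route: Route; reply: string } {
  const route = chooseRoute(turn);
  const reply = route === "supervisor" ? supervisor(turn.text) : realtime(turn.text);
  return { route, reply };
}
```

Keeping the routing decision in a pure function such as `chooseRoute` makes it testable without a live audio session; the heuristic could be swapped for a decision made by the realtime model itself.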
Features
- Realtime voice + chat agent integration using the Realtime API
- Agent orchestration, handoff, and multi-agent coordination via the Agents SDK
- Chat-Supervisor pattern: lightweight realtime agent delegating to stronger models
- Frontend (Next.js) and backend setup for browser-based interaction
- State management, guardrails, and event handling built in
- Demo scaffolding for building voice-enabled, tool-augmented agents
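
The handoff and guardrail features listed above can be sketched in plain TypeScript. This is an illustration under stated assumptions, not the SDK's API: `AgentDef`, `SessionState`, `handoff`, and `passesGuardrail` are hypothetical names, and the banned-term check stands in for whatever guardrail logic the demo actually applies.

```typescript
// Hypothetical types and helpers; not the Agents SDK API.

interface AgentDef {
  name: string;
  handoffs: string[]; // names of agents this agent may transfer to
}

interface SessionState {
  activeAgent: string;
  history: string[]; // log of handoff events, preserved across handoffs
}

// Transfer control to `target` only if the active agent allows it.
function handoff(
  state: SessionState,
  agents: Record<string, AgentDef>,
  target: string
): SessionState {
  const current = agents[state.activeAgent];
  if (!current || !agents[target] || current.handoffs.indexOf(target) === -1) {
    throw new Error(`Handoff from ${state.activeAgent} to ${target} is not allowed`);
  }
  return {
    activeAgent: target,
    history: state.history.concat(`${state.activeAgent} -> ${target}`),
  };
}

// Assumed guardrail: reject replies containing any banned term (case-insensitive).
function passesGuardrail(reply: string, bannedTerms: string[]): boolean {
  const lower = reply.toLowerCase();
  return bannedTerms.every((term) => lower.indexOf(term.toLowerCase()) === -1);
}
```

Returning a new `SessionState` instead of mutating the old one keeps the handoff log intact, which is one simple way to carry state across agents.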