Showing 43 open source projects for "voice chat"

  • 1
    Fun Audio Chat

    Large Audio Language Model built for natural interactions

    Fun Audio Chat is an interactive voice-first conversational AI platform designed to let users engage in natural spoken dialogue with large language models in real time, turning speech into context-aware responses while maintaining a smooth back-and-forth experience. It combines speech recognition, audio processing, and AI generation so users can speak simply and receive spoken replies, enabling applications such as virtual assistants, voice bots, and hands-free chat interfaces. ...
    Downloads: 0 This Week
  • 2
    Gemini Next Chat

    Deploy your private Gemini application for free with one click

    Gemini Next Chat is an open-source web application that allows you to deploy your own private chat interface powered by Google’s Gemini models (e.g., Gemini 1.5, Gemini 2.0, etc.). It is built with Next.js/TypeScript and targets developers and hobbyists who want a self-hosted solution for interacting with advanced multimodal models (text, image, voice).
    Downloads: 4 This Week
  • 3
    Live helper chat

    Live support for your website. Featuring web and mobile apps

    Live helper chat is a mature, open-source customer support platform that enables real-time communication between businesses and website visitors through chat, messaging, and integrated communication channels. Designed to handle high volumes of interactions, it can support thousands of concurrent conversations and multiple operators, making it suitable for enterprise-level deployments. The platform includes a web-based interface as well as mobile applications, allowing support teams to manage...
    Downloads: 2 This Week
  • 4
    No Cost AI

    80+ free AI services for chat, image, video, voice & APIs

    No Cost AI is a curated directory of free AI services across chat, image generation, video, voice, music, APIs, and automation tools. It is designed for users who want to discover AI resources without immediately committing to paid subscriptions. The project gathers many external services in one place, making it easier to compare options for different creative, technical, and productivity needs.
    Downloads: 12 This Week
  • 5
    OpenAI Realtime Agents

    This is a simple demonstration of more advanced, agentic patterns

    This repository demonstrates how to build low-latency, streaming “voice + chat” agents using OpenAI’s Realtime API combined with the OpenAI Agents SDK. The demo shows patterns for connecting a realtime voice stream (audio in/out) with agents that can use tools, maintain state, and orchestrate multi-agent workflows. The SDK offers abstractions such as agent orchestration, event handling, handoffs, state management, and guardrails, tailored to support realtime, conversational systems. ...
    Downloads: 0 This Week
  • 6
    Rasa

    Open source machine learning framework to automate text conversations

    Rasa is an open source machine learning framework to automate text- and voice-based conversations. With Rasa, you can build contextual assistants on Facebook Messenger, Slack, Google Hangouts, Webex Teams, Microsoft Bot Framework, Rocket.Chat, Mattermost, Telegram, and Twilio or on your own custom conversational channels. Rasa helps you build contextual assistants capable of having layered conversations with lots of back-and-forths. In order for a human to have a meaningful exchange with a...
    Downloads: 5 This Week
  • 7
    OpenLess

    AI-polished text appears at your cursor in any app

    OpenLess is an open-source voice input application for macOS and Windows that turns spoken ideas into polished text at the current cursor position. Users press a global hotkey, speak naturally, and release the key to receive cleaned-up text inside apps such as ChatGPT, Claude, Cursor, Notion, email clients, or chat boxes. Unlike basic dictation tools, it is designed to restructure loose speech into more useful writing, especially AI prompts with clearer context and constraints. ...
    Downloads: 6 This Week
  • 8
    Qwen2-Audio

    Repo of Qwen2-Audio chat & pretrained large audio language model

    Qwen2-Audio is a large audio-language model by Alibaba Cloud, part of the Qwen series. It is trained to accept various audio signal inputs (including speech, sounds, etc.) and perform both voice chat and audio analysis, producing textual responses. It supports two major modes: Voice Chat (interactive voice only input) and Audio Analysis (audio + text instructions), with both base and instruction-tuned models. It is evaluated on many benchmarks (speech recognition, translation, sound classification, emotion, etc.), and offers pretrained models (e.g. 7B) released via ModelScope and Hugging Face. ...
    Downloads: 0 This Week
  • 9
    CosyVoice

    Multi-lingual large voice generation model, providing inference

    CosyVoice is a multilingual large voice generation model that offers a full-stack solution for training, inference, and deployment of high-quality TTS systems. The model supports multiple languages, including Chinese, English, Japanese, Korean, and a range of Chinese dialects such as Cantonese, Sichuanese, Shanghainese, Tianjinese, and Wuhanese. It is designed for zero-shot voice cloning and cross-lingual or mix-lingual scenarios, so a single reference voice can be used to synthesize speech...
    Downloads: 10 This Week
  • 10
    Open-LLM-VTuber

    Open source AI VTuber platform with voice chat and Live2D avatars

    Open-LLM-VTuber is an open source platform designed to create AI-powered VTuber characters that can interact with users through voice and animated avatars. It enables hands-free conversations with large language models by combining speech recognition, language processing, and text-to-speech synthesis into a single system. Users can speak directly to the AI character, and the system can respond with a generated voice while animating a Live2D avatar to simulate a talking virtual personality....
    Downloads: 21 This Week
  • 11
    OpenAI-Compatible Edge-TTS API

    Free, high-quality text-to-speech API endpoint to replace OpenAI

    ...The project emulates the /v1/audio/speech endpoint used by OpenAI, so any client that can talk to the OpenAI TTS API can be redirected to this service with minimal changes. It exposes parameters for input text, voice selection, audio format, and playback speed, mirroring the OpenAI interface while mapping popular OpenAI voice names to equivalent Edge voices. Because it relies on Edge’s TTS, the audio generation itself is free, and the project essentially acts as a smart proxy that handles formatting and streaming. The server supports Server-Sent Events (SSE) for streaming audio, enabling low-latency playback in chat UIs and other interactive tools. ...
    Downloads: 1 This Week
  • 12
    Project AIRI

    Self hosted, you-owned Grok Companion

    AIRI is a self-hosted AI companion platform designed to create interactive virtual characters capable of real-time conversation, gameplay interaction, and multimedia presence. The project aims to emulate advanced AI personalities similar to popular autonomous VTuber-style agents, combining voice interaction, animation, and behavioral logic into a unified system. It supports deployment across web, macOS, and Windows environments, making it accessible for hobbyists and developers building digital companions. AIRI integrates real-time voice chat capabilities and can interact with external applications such as games, enabling more immersive and dynamic experiences. ...
    Downloads: 34 This Week
  • 13
    DiscordGo

    (Golang) Go bindings for Discord

    DiscordGo is a Go package that provides low level bindings to the Discord chat client API. DiscordGo has nearly complete support for all of the Discord API endpoints, websocket interface, and voice interface. If you would like to help the DiscordGo package please use this link to add the official DiscordGo test bot dgo to your server. This provides indispensable help to this project. Construct a new Discord client which can be used to access the variety of Discord API functions and to set callback functions for Discord events. ...
    Downloads: 1 This Week
  • 14
    Jovo Framework

    The React for Voice and Chat, build apps for Alexa, Google Assistant

    The multimodal experience platform enables professional teams to build and run apps that work across smart speakers, the web, mobile, and more. Fully customizable and open source. The Jovo product ecosystem allows you to build, test, and run powerful experiences for voice, chat, and web platforms. From local development to production, Jovo allows you to build robust experiences, faster. Build across devices and platforms and use all supported modalities thanks to the Jovo output template engine. Our component and plugin architecture makes it possible to make Jovo work for your specific use case, across projects. ...
    Downloads: 1 This Week
  • 15
    FastRTC

    The Python library for real-time communication

    ...It abstracts away much of the complexity that typically comes with implementing WebRTC by providing a simple interface — e.g. a Stream class — that can be mounted within a web backend (for example a FastAPI application). This makes it particularly well suited for building real-time voice (or video) interfaces for applications such as AI assistants, live chat, or collaborative audio/video tools. FastRTC also integrates nicely with UI frameworks (e.g. via a web demo using Gradio), so developers can rapidly prototype and deploy real-time streaming applications without deep knowledge of low-level WebRTC internals. Because voice-enabled AI agents often involve many moving parts (speech-to-text, text processing, text-to-speech, streaming, session/chat management), FastRTC helps by handling the streaming aspect, leaving the rest to be plugged in modularly.
    Downloads: 0 This Week
  • 16
    Happy Coder

    Mobile and Web client for Codex and Claude Code, with realtime voice

    Happy is an open-source, cross-platform mobile and web client designed to bring powerful AI coding agents such as Claude Code and Codex to your fingertips no matter where you are. At its core, Happy wraps existing AI coding tools with a unified interface, providing real-time voice interactions, encrypted communication, and seamless device switching between desktop and mobile. You can start a coding session locally through the Happy CLI or connect from a phone or browser, allowing developers...
    Downloads: 44 This Week
  • 17
    Bailing

    Bailing is a voice dialogue robot similar to GPT-4o

    Bailing is an open-source voice-dialogue assistant designed to deliver natural voice-based conversations by combining automatic speech recognition (ASR), voice activity detection (VAD), a large language model (LLM), and text-to-speech (TTS) in a single pipeline. Its goal is to offer a “voice-first” chat experience similar to what one might expect from a system like GPT-4o, but fully open and deployable by users.
    Downloads: 0 This Week
  • 18
    ChatOllama

    ChatOllama is an open-source AI chatbot

    ChatOllama is an open-source chatbot platform built with Nuxt 3 and designed to provide a private, extensible interface for working with multiple modern language model providers. It goes beyond a basic chat UI by supporting a broad model ecosystem that includes OpenAI, Azure OpenAI, Anthropic, Google Gemini, Groq, Moonshot, Ollama, and other OpenAI-compatible services. The platform also includes higher-level capabilities such as AI agents, document-backed knowledge bases, real-time voice chat, and Model Context Protocol integration for external tools. ...
    Downloads: 0 This Week
  • 19
    PageLM

    PageLM is a community driven version of NotebookLM

    ...It is built to help students, educators, and researchers turn documents and topics into more engaging forms of study rather than leaving content in static notes or isolated files. The platform includes a broad set of learning tools such as contextual chat, Cornell-style note generation, flashcards, quizzes, AI podcasts, voice transcription, homework planning, exam simulation, debate practice, and a personalized study companion. It supports uploaded documents including PDF, DOCX, Markdown, and TXT, allowing users to ground questions and generated materials in source content. On the technical side, it supports multiple model providers, multiple embedding back ends, WebSocket streaming for real-time generation, persistent content storage, and structured markdown outputs.
    Downloads: 3 This Week
  • 20
    RunAnywhere

    Production ready toolkit to run AI locally

    RunAnywhere SDKs are a set of cross-platform development tools that enable applications to run artificial intelligence models directly on user devices instead of relying on cloud infrastructure. The toolkit allows developers to integrate language models, speech recognition, and voice synthesis capabilities into mobile or desktop applications while keeping all computation local. By running models entirely on device, the platform eliminates network latency and protects user data because information does not leave the device. The SDK supports popular open-source models such as Llama, Mistral, and Qwen, enabling developers to build AI-powered features such as chat interfaces and voice assistants with minimal external dependencies. ...
    Downloads: 0 This Week
  • 21
    Better Chatbot

    Just a Better Chatbot. Powered by MCP Client & Workflows

    Better-chatbot is an AI chatbot framework powered by the MCP protocol and workflows, allowing developers to deploy and integrate AI-powered chat systems with ease. It integrates all major LLM providers: OpenAI, Anthropic, Google, xAI, Ollama, and more. Features include MCP support, web search, JS/Python code execution, data visualization, custom agents, visual workflows, artifact generation, and realtime voice chat with full MCP tool integration.
    Downloads: 1 This Week
  • 22
    Harbor LLM

    Run a full local LLM stack with one command using Docker

    ...It combines a CLI and companion app to launch backends, frontends, and supporting services with minimal setup. With a single command, users can start preconfigured tools like Ollama and Open WebUI, enabling chat, workflows, and integrations immediately. Harbor supports multiple inference engines, including llama.cpp and vLLM, and connects them seamlessly to user interfaces. It also includes tools for web retrieval, image generation, voice interaction, and workflow automation. Built on Docker, Harbor allows services to run in isolated containers while communicating over a local network. ...
    Downloads: 1 This Week
  • 23
    SafeClaw

    Chat with it via text and voice

    SafeClaw is an open-source, entirely local alternative to cloud-based AI assistants like OpenClaw, enabling users to build a personal assistant that runs on their own machine without incurring API usage charges or exposing data to third-party services. It emphasizes privacy and predictability by using traditional programming, rule-based intent parsing, and established machine learning tools rather than large language models, meaning there are no per-token API costs and deterministic...
    Downloads: 0 This Week
  • 24
    Big-AGI

    AI suite powered by state-of-the-art models and providing advanced AI

    Big-AGI is a comprehensive, open-source AI workspace built to serve as a powerful multi-model interface for developers, researchers, and professionals who want deep control over generative AI workflows and outputs. It unifies access to multiple large language models (LLMs) and AI services through a modern web UI that emphasizes efficient interaction, flexibility, and extensibility, enabling users to conduct multi-model chats, execute code, generate images, and perform voice or text-based...
    Downloads: 3 This Week
  • 25
    Open Interpreter

    A natural language interface for computers

    Open Interpreter is an open-source tool that provides a natural-language interface for interacting with your computer. It lets large language models (LLMs) run code locally (Python, JavaScript, shell, etc.), enabling you to ask your computer to do tasks like data analysis, file manipulation, browsing, etc. in human terms (“chat with your computer”), with safeguards. Runs locally or via configured remote LLM servers/inference backends, giving flexibility to use models you trust or have...
    Downloads: 22 This Week