57 projects for "audio streaming server" with 2 filters applied:

  • 1
    Fun Audio Chat

    Large Audio Language Model built for natural interactions

    ...The system supports dynamic audio input and output, meaning it can handle different voices, tones, and conversational contexts without forcing users into typed interactions. With real-time streaming, it minimizes latency and delivers responses quickly, making it suitable for applications where responsiveness matters, such as interactive demos, accessibility tools, and conversational games.
    Downloads: 0 This Week
  • 2
    Markdownify MCP Server

    Convert files and web content into clean, usable Markdown easily

    Markdownify MCP is a Model Context Protocol server that converts many types of files and web content into clean Markdown. It supports formats such as PDFs, images, audio with transcription, DOCX, XLSX, and PPTX, along with web sources like YouTube transcripts, Bing results, and general webpages. Markdownify MCP is designed to simplify content extraction and make data easier to read, share, and reuse in structured workflows.
    Downloads: 0 This Week
  • 3
    FastRTC

    The Python library for real-time communication

    FastRTC is a Python library designed to simplify real-time communication (RTC), especially for audio and video streaming applications. It abstracts away much of the complexity that typically comes with implementing WebRTC by providing a simple interface — e.g. a Stream class — that can be mounted within a web backend (for example a FastAPI application). This makes it particularly well suited for building real-time voice (or video) interfaces for applications such as AI assistants, live chat, or collaborative audio/video tools. ...
    Downloads: 1 This Week
  • 4
    Sopro TTS

    A lightweight text-to-speech model with zero-shot voice cloning

    Sopro TTS is an open-source text-to-speech (TTS) project that implements a lightweight model capable of producing speech from text with zero-shot voice cloning, meaning it can mimic a speaker’s voice from only a few seconds of reference audio. Built with a 169 million-parameter architecture that uses dilated convolutions and cross-attention layers instead of large Transformer stacks, it achieves relatively fast real-time performance even on CPUs (about a 0.25 real-time factor measured on an M3 base). The model is designed to work with a small set of dependencies and to be accessible for developers who want offline TTS with customizable voice style, including options for streaming or non-streaming generation modes. ...
    Downloads: 0 This Week
  • 5
    Dolphin

    Document Image Parsing via Heterogeneous Anchor Prompting

    Dolphin, maintained by ByteDance, is a document image parsing model, as its title (Document Image Parsing via Heterogeneous Anchor Prompting) indicates. It follows an analyze-then-parse approach: it first identifies the layout elements of a page in reading order, then uses those elements as anchors, paired with task-specific prompts, to parse their content in parallel. ...
    Downloads: 0 This Week
  • 6
    MOSS-TTS Family

    MOSS‑TTS Family open‑source speech and sound generation model

    ...The broader family also includes dialogue generation, prompt-based voice creation, streaming voice-agent support, and a unified audio tokenizer. It is especially useful for developers building dubbing, podcasts, audiobooks, voice assistants, character voices, and creative audio tools.
    Downloads: 1 This Week
  • 7
    Live API Web Console

    A React-based starter app for using the Live API over WebSockets

    Live API Web Console is a React starter that demonstrates how to use Gemini’s Live API over WebSockets to build real-time, multimodal experiences. The app includes modules for streaming audio playback, recording user media from the microphone, webcam, or even screen capture, and it surfaces a unified event log so you can debug the session as it flows. Configuration lives in a simple .env file and the project boots with standard web tooling, letting you experiment quickly with models, system prompts, and tool declarations. ...
    Downloads: 0 This Week
  • 8
    Anthropic SDK TypeScript

    Access to Anthropic's safety-first language model APIs

    anthropic-sdk-typescript is the TypeScript / JavaScript client library for the Anthropic REST API, enabling backend or Node.js usage of models like Claude. It wraps API endpoints for creating messages, streaming responses, and managing parameters in a type-safe TS environment. The library is designed for server-side use, interfacing with REST, and is stable for integration in web services or backend agents. Example usage shows how to instantiate the Anthropic client, call client.messages.create(...), and obtain responses. It supports streaming endpoints as well. ...
    Downloads: 4 This Week
  • 9
    WhatsApp MCP Server

    WhatsApp MCP server enabling AI access to chats and messaging

    whatsapp-mcp is an open source Model Context Protocol (MCP) server that enables AI agents to interact directly with a user’s WhatsApp account through a structured interface. It acts as a bridge between WhatsApp and large language models, allowing controlled access to messages, chats, and contacts. whatsapp-mcp is composed of two main components: a Go-based bridge that connects to the WhatsApp Web API and stores data locally, and a Python-based MCP server that exposes tools for AI interaction. ...
    Downloads: 0 This Week
  • 10
    VoxCPM2

    Tokenizer-Free TTS for Multilingual Speech Generation

    VoxCPM2 is an advanced open-source text-to-speech system that redefines speech synthesis by eliminating traditional tokenization and instead generating continuous speech representations through a diffusion-based autoregressive architecture. Built on top of the MiniCPM model family, it enables highly natural, expressive, and context-aware speech generation that adapts tone, emotion, and pacing directly from input text. The system is trained on massive multilingual datasets, enabling support...
    Downloads: 25 This Week
  • 11
    StreamSpeech

    StreamSpeech is a seamless model for offline speech recognition

    StreamSpeech is an “all-in-one” speech model designed to perform offline and simultaneous speech recognition, speech translation, and speech synthesis within a single unified architecture. Developed as part of an ACL 2024 paper, it targets streaming and low-latency scenarios where intermediate results and final translations or synthetic speech must be produced continuously as audio is being received. The model supports eight tasks: offline ASR, speech-to-text translation, speech-to-speech translation, and TTS, as well as their streaming or simultaneous counterparts, all handled by the same underlying system. ...
    Downloads: 0 This Week
  • 12
    EasyVoice

    Open source text-to-speech tool, supports extra-long text

    ...It offers streaming playback so audio starts almost immediately, even for very long inputs, and automatically generates subtitle files suitable for video production or translation workflows. Under the hood, easyVoice uses a modern stack with Vue 3 and Element Plus on the front end, Node.js and Express on the back end, and TTS engines such as Microsoft Azure TTS and OpenAI-compatible APIs, orchestrated through ffmpeg.
    Downloads: 6 This Week
  • 13
    xgplayer

    An HTML5 video player with a parser that saves traffic

    xgplayer is a web-friendly, open-source media player library maintained by ByteDance, designed for playing audio/video streams in browsers or web applications with robust control, flexibility, and extensibility. It abstracts many of the lower-level complexities of HTML5 media, providing a consistent API for playback control, custom UI overlays, adaptive streaming, plugin hooks, and cross-browser compatibility. Because of its emphasis on modularity and extensibility, xgplayer can be embedded into modern web projects and customized — developers can add controls, custom buffering strategies, subtitle handling, adaptive bitrate streaming, or integrate with other web-based video infrastructures. ...
    Downloads: 0 This Week
  • 14
    WhisperLive

    A nearly-live implementation of OpenAI's Whisper

    WhisperLive is a “nearly live” implementation of OpenAI’s Whisper model focused on real-time transcription. It runs as a server–client system in which the server hosts a Whisper backend and clients stream audio to be transcribed with very low delay. The project supports multiple inference backends, including Faster-Whisper, NVIDIA TensorRT, and OpenVINO, allowing you to target GPUs and different CPU architectures efficiently. It can handle microphone input, pre-recorded audio files, and network streams such as RTSP and HLS, making it flexible for live events, monitoring, or accessibility workflows. ...
    Downloads: 12 This Week
  • 15
    Orpheus TTS

    Towards Human-Sounding Speech

    ...The project ships both pretrained and finetuned English models, as well as a family of multilingual models released as a research preview, and includes data-processing scripts so users can train or finetune their own variants. Inference is provided through a Python package that uses vLLM under the hood for high-throughput, low-latency generation, including streaming examples that show how to generate audio chunks in real time. The maintainers provide Colab notebooks, a standardized prompting format, and one-click deployment via Baseten for production-grade, FP8/FP16 optimized inference with ~200 ms streaming latency.
    Downloads: 0 This Week
  • 16
    MiniMind-O

    A 0.1B Omni model trained from scratch

    MiniMind-O is an educational open-source project for building a small end-to-end Omni model from scratch. It extends the MiniMind family by exploring a model that can handle text, audio, and image inputs while producing text and streaming speech outputs. The project is designed to make multimodal AI training more accessible by keeping the model size small enough for ordinary personal hardware. It includes both mini and full training data paths, allowing learners to run a complete workflow quickly or reproduce the released model setup more closely. ...
    Downloads: 3 This Week
  • 17
    ChatTTS_colab

    One-click deployment (including offline integration package)

    ...It provides an integrated offline bundle and scripts for Windows and macOS so users can run ChatTTS locally without wrestling with complex environment setup. The repository includes Colab notebooks that launch a Gradio-based web UI and expose streaming TTS, making it possible to listen to generated audio as it is produced. A distinctive feature is the “voice gacha” system, which batch-generates many distinct voice timbres and allows users to save the ones they like into a curated voice library. It has first-class support for long-form audio generation, making it suitable for audiobooks, podcasts, or long narration tasks. ...
    Downloads: 0 This Week
  • 18
    Desktop Commander MCP

    AI-powered MCP server for desktop file and terminal automation

    Desktop Commander MCP is an advanced Model Context Protocol server designed to extend AI assistants with direct control over a user’s local machine, including the file system and terminal. It integrates with clients like Claude Desktop to enable AI-driven workflows such as editing files, executing commands, and automating development tasks from a single conversational interface. Desktop Commander MCP builds on top of an MCP filesystem server and enhances it with powerful search, replace, and code editing capabilities tailored for real-world development environments. ...
    Downloads: 5 This Week
  • 19
    OpenAI Assistants Quickstart

    OpenAI Assistants API quickstart with Next.js

    openai-assistants-quickstart is a template for using the Assistants API in a Next.js app, demonstrating streaming, tool use, and function calling in one place. The repository includes multiple example pages that each showcase specific capabilities, while all examples share the same underlying assistant with all capabilities enabled. The primary chat logic lives in the Chat component at app/components/chat.tsx, which manages rendering, streaming, and forwarding function calls. ...
    Downloads: 3 This Week
  • 20
    MediaPipe Solutions

    Cross-platform, customizable ML solutions

    MediaPipe is an open-source framework developed by Google for building cross-platform machine learning pipelines that process audio, video, and other streaming data in real time. The system provides developers with tools and reusable components that allow them to combine multiple machine learning models with preprocessing and postprocessing logic into efficient perception pipelines. These pipelines can run on a wide variety of platforms including mobile devices, desktop systems, web browsers, and embedded edge devices. ...
    Downloads: 2 This Week
  • 21
    Google Antigravity SDK

    Python library for building agents that leverages Google Antigravity

    ...The SDK includes a high-level Agent class for quick setup, as well as lower-level conversation and connection abstractions for more controlled workflows. It supports streaming responses, stateful sessions, custom Python tools, MCP server integration, hooks, policies, and event-driven triggers. The package relies on a compiled runtime binary distributed through platform-specific PyPI wheels, so installation from PyPI is required for normal use. Its main value is giving developers a structured Python framework for creating local, tool-using, multimodal, policy-controlled AI agents.
    Downloads: 9 This Week
  • 22
    MOSS-TTS-Nano

    MOSS-TTS-Nano is an open-source multilingual tiny speech generation

    MOSS-TTS-Nano is a lightweight text-to-speech model designed for real-time voice generation in resource-constrained environments. It is part of the broader MOSS-TTS family and focuses on delivering high-quality speech synthesis with a compact architecture. The model operates efficiently on CPU-only systems, enabling deployment without specialized hardware. It supports multilingual voice cloning and produces high-fidelity audio with low latency. The system uses an autoregressive audio...
    Downloads: 1 This Week
  • 23
    Atmosphere

    Real-time transport layer for Java AI agents

    Atmosphere is a Java framework for building streaming AI agents on the JVM. It lets developers declare agent behavior with an @Agent annotation while the framework handles transport, streaming, tool calls, memory, reconnect behavior, authorization, and observability. A single agent can be exposed over WebSocket, Server-Sent Events, long polling, gRPC, and WebTransport over HTTP/3 depending on the modules included.
    Downloads: 1 This Week
  • 24
    Anthropic SDK Python

    Provides convenient access to the Anthropic REST API from any Python 3

    ...The library includes definitions for all request and response parameters using Python typed objects, automatically handles serialization and deserialization, and wraps HTTP logic (timeouts, retries, error mapping) so that developers can call the API in a clean, high-level way. The SDK supports both synchronous and asynchronous usage (via async/await) depending on context. Importantly, it also supports streaming responses via Server-Sent Events (SSE) so that large outputs can be consumed incrementally rather than waiting for the full response. The client offers helper abstractions for tools (function-style “tools”) and streaming utilities for building interactive agents.
    Downloads: 4 This Week
  • 25
    ds4.c

    DeepSeek 4 Flash local inference engine for Metal

    ...Unlike general-purpose inference runtimes, the project is intentionally optimized for a specific model family, enabling highly efficient execution and simplified architecture. The engine includes DS4-specific model loading, KV cache management, prompt rendering, and OpenAI-compatible server APIs for local deployment workflows. Built as a native low-level implementation, it focuses on performance, reduced abstraction overhead, and direct integration with Apple GPU acceleration through Metal compute graphs. The project also supports streaming inference behavior and local API serving for integration with external tools and AI applications. ...
    Downloads: 1 This Week