55 projects for "python-dpkt" with 2 filters applied:

  • Try Google Cloud Risk-Free With $300 in Credit Icon
    Try Google Cloud Risk-Free With $300 in Credit

    No hidden charges. No surprise bills. Cancel anytime.

    Use your credit across every product. Compute, storage, AI, analytics. When it runs out, 20+ products stay free. You only pay when you choose to.
    Start Free
  • AI-generated apps that pass security review Icon
    AI-generated apps that pass security review

    Stop waiting on engineering. Build production-ready internal tools with AI—on your company data, in your cloud.

    Retool lets you generate dashboards, admin panels, and workflows directly on your data. Type something like “Build me a revenue dashboard on my Stripe data” and get a working app with security, permissions, and compliance built in from day one. Whether on our cloud or self-hosted, create the internal software your team needs without compromising enterprise standards or control.
    Try Retool free
  • 1
    kokoro-onnx

    kokoro-onnx

    TTS with kokoro and onnx runtime

    kokoro-onnx is a text-to-speech toolkit that wraps the Kokoro neural TTS model in an easy-to-use ONNX Runtime interface, so you can generate speech from Python with minimal setup. It focuses on running efficiently on commodity hardware, including macOS with Apple Silicon, while still delivering near real-time performance for many use cases. The project ships prebuilt model files and a simple example script, so you can go from installation to producing an audio.wav file in just a few steps. ...
    Downloads: 41 This Week
    Last Update:
    See Project
  • 2
    edge-tts

    edge-tts

    Use Microsoft Edge's online text-to-speech service from Python

    edge-tts is a Python module and command-line tool that gives you direct access to Microsoft Edge’s online text-to-speech service without needing the Edge browser, Windows, or any API key. It wraps the same cloud voices used by Edge, exposing them through a simple CLI (edge-tts, edge-playback) and a Python API, so you can script high-quality speech generation in your own applications.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 3
    AI Runner

    AI Runner

    Offline inference engine for art, real-time voice conversations

    AI Runner is an offline inference engine designed to run a collection of AI workloads on your own machine, including image generation for art, real-time voice conversations, LLM-powered chatbots and automated workflows. It is implemented as a desktop-oriented Python application and emphasizes privacy and self-hosting, allowing users to work with text-to-speech, speech-to-text, text-to-image and multimodal models without sending data to external services. At the core of its LLM stack is a mode-based architecture with specialized “modes” such as Author, Code, Research, QA and General, and a workflow manager that automatically routes user requests to the right agent based on the task. ...
    Downloads: 11 This Week
    Last Update:
    See Project
  • 4
    SoniTranslate

    SoniTranslate

    Synchronized Translation for Videos

    SoniTranslate is a video translation and dubbing system that produces synchronized target-language audio tracks for existing video content. It provides a web UI built with Gradio, allowing users to upload a video, choose source and target languages, and then run a pipeline that handles transcription, translation and re-synthesis of speech. Under the hood, it uses advanced speech and diarization models to separate speakers, align audio with timecodes and respect subtitle timing, which lets...
    Downloads: 51 This Week
    Last Update:
    See Project
  • 99.99% Uptime for Your Most Critical Databases Icon
    99.99% Uptime for Your Most Critical Databases

    Sub-second maintenance. 2x read/write performance. Built-in vector search for AI apps.

    Cloud SQL Enterprise Plus delivers near-zero downtime with 35 days of point-in-time recovery. Supports MySQL, PostgreSQL, and SQL Server.
    Try Free
  • 5
    LuxTTS

    LuxTTS

    A high-quality rapid TTS voice cloning model

    ...The project supports zero-shot voice cloning, meaning it can adapt to a reference speaker’s voice with minimal example data, enabling realistic and personalized synthetic speech. Intended for developers, hobbyists, and creators, the repository includes installation instructions, usage examples, and Python APIs that make it feasible to integrate the model in local workflows, web demos, or production systems. Its design emphasizes efficiency and practicality, fitting within modest GPU memory footprints.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 6
    Qwen3-TTS

    Qwen3-TTS

    Qwen3-TTS is an open-source series of TTS models

    Qwen3-TTS is an open-source text-to-speech (TTS) project built around the Qwen3 large language model family, focused on generating high-quality, natural-sounding speech from plain text input. It provides researchers and developers with tools to transform text into expressive, intelligible audio, supporting multiple languages and voice characteristics tuned for clarity and fluidity. The project includes pre-trained models and inference scripts that let users synthesize speech locally or...
    Downloads: 27 This Week
    Last Update:
    See Project
  • 7
    WhisperLive

    WhisperLive

    A nearly-live implementation of OpenAI's Whisper

    WhisperLive is a “nearly live” implementation of OpenAI’s Whisper model focused on real-time transcription. It runs as a server–client system in which the server hosts a Whisper backend and clients stream audio to be transcribed with very low delay. The project supports multiple inference backends, including Faster-Whisper, NVIDIA TensorRT, and OpenVINO, allowing you to target GPUs and different CPU architectures efficiently. It can handle microphone input, pre-recorded audio files, and...
    Downloads: 9 This Week
    Last Update:
    See Project
  • 8
    Pocket TTS

    Pocket TTS

    A TTS that fits in your CPU (and pocket)

    Pocket TTS is a lightweight text-to-speech project designed to run efficiently on CPUs, targeting developers who want local speech generation without depending on GPUs or hosted web APIs. It is built to feel practical in everyday applications, where installation and usage should be as simple as adding a dependency and calling a function. The project focuses on keeping the runtime footprint manageable while still producing natural-sounding speech, which makes it attractive for offline tools,...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 9
    Sopro TTS

    Sopro TTS

    A lightweight text-to-speech model with zero-shot voice cloning

    ...The model is designed to work with a small set of dependencies and to be accessible for developers who want offline TTS with customizable voice style, including options for streaming or non-streaming generation modes. Users can install it with standard Python tools, run a demo server locally, and experiment with CLI or Python API usage for producing synthetic speech.
    Downloads: 1 This Week
    Last Update:
    See Project
  • Go From Idea to Deployed AI App Fast Icon
    Go From Idea to Deployed AI App Fast

    One platform to build, fine-tune, and deploy. No MLOps team required.

    Access Gemini 3 and 200+ models. Build chatbots, agents, or custom models with built-in monitoring and scaling.
    Try Free
  • 10
    gTTS

    gTTS

    Python library and CLI tool to interface with Google Translate

    gTTS (Google Text-to-Speech) is a Python library and command-line tool that wraps the speech functionality of Google Translate. It lets you send text to the Google Translate TTS endpoint and receive spoken audio back as MP3 data, either written to a file, a file-like object, or standard output. The library is designed to handle long texts, using a speech-specific sentence tokenizer that keeps intonation and punctuation natural while splitting requests into acceptable chunks.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 11
    IndexTTS2

    IndexTTS2

    Industrial-level controllable zero-shot text-to-speech system

    IndexTTS is a modern, zero-shot text-to-speech (TTS) system engineered to deliver high-quality, natural-sounding speech synthesis with few requirements and strong voice-cloning capabilities. It builds on state-of-the-art models such as XTTS and other modern neural TTS backbones, improving them with a conformer-based speech conditional encoder and upgrading the decoder to a high-quality vocoder (BigVGAN2), leading to clearer and more natural audio output. The system supports zero-shot voice...
    Downloads: 9 This Week
    Last Update:
    See Project
  • 12
    VoxCPM

    VoxCPM

    TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning

    VoxCPM is a tokenizer-free text-to-speech system that models speech in a continuous space, aiming for extremely realistic, context-aware synthesis and true-to-life zero-shot voice cloning. Instead of converting speech into discrete tokens, it uses an end-to-end diffusion-autoregressive architecture built on the MiniCPM-4 backbone, combining hierarchical language modeling, finite scalar quantization (FSQ), and local Diffusion Transformers. This design helps decouple semantic and acoustic...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 13
    Orpheus TTS

    Orpheus TTS

    Towards Human-Sounding Speech

    ...The project ships both pretrained and finetuned English models, as well as a family of multilingual models released as a research preview, and includes data-processing scripts so users can train or finetune their own variants. Inference is provided through a Python package that uses vLLM under the hood for high-throughput, low-latency generation, including streaming examples that show how to generate audio chunks in real time. The maintainers provide Colab notebooks, a standardized prompting format, and one-click deployment via Baseten for production-grade, FP8/FP16 optimized inference with ~200 ms streaming latency.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 14
    Open Vision Agents by Stream

    Open Vision Agents by Stream

    Build Vision Agents quickly with any model or video provider

    Open Vision Agents by Stream is an open source framework from Stream for building real time, multimodal AI agents that watch, listen, and respond to live video streams. It focuses on combining video understanding models, such as YOLO and Roboflow based detectors, with real time large language models like OpenAI Realtime and Gemini Live to create interactive experiences. The framework uses Stream’s ultra low latency edge network so agents can join sessions quickly and maintain very low audio...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 15
    Speech-AI-Forge

    Speech-AI-Forge

    Speech-AI-Forge is a project developed around TTS generation model

    Speech-AI-Forge is a full-stack project built around modern text-to-speech generation models, providing both an API server and a Gradio-based web UI for interactive use. At its core, it acts as a hub that wires together multiple speech-related capabilities, including TTS, speech-to-text and LLM-based control flows, behind a consistent interface. The system is designed to be deployed in several ways: you can try it online via hosted demos, spin it up in a one-click Colab environment, run it...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 16
    FastRTC

    FastRTC

    The python library for real-time communication

    FastRTC is a Python library designed to simplify real-time communication (RTC), especially for audio and video streaming applications. It abstracts away much of the complexity that typically comes with implementing WebRTC by providing a simple interface — e.g. a Stream class — that can be mounted within a web backend (for example a FastAPI application).
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    Luna AI

    Luna AI

    Virtual AI anchor that combines state-of-the-art technology

    Luna AI is a virtual AI streamer framework designed to power an interactive VTuber that can go live on major platforms and chat with viewers in real time. It is built around a core assistant persona called “Luna AI,” which can be driven by a wide range of large language models and platforms, including GPT-style APIs, Claude, LangChain-based backends, ChatGLM, Kimi, Ollama, and many others. The project supports multiple rendering backends for the avatar, such as Live2D, Unreal Engine (UE),...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 18
    StyleTTS 2

    StyleTTS 2

    Towards Human-Level Text-to-Speech through Style Diffusion

    StyleTTS2 is a state-of-the-art text-to-speech system that aims for human-level naturalness by combining style diffusion, adversarial training, and large speech language models. It extends the original StyleTTS idea by introducing a style diffusion model that can sample rich, realistic speaking styles conditioned on reference speech, allowing highly expressive and diverse prosody. The architecture uses a two-stage training process and leverages an auxiliary speech language model to guide...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 19
    openctp

    openctp

    Provides CTP stock options and Zhongtai Securities XTP

    ...The project offers a comprehensive simulation environment similar to SimNow that supports futures, options, A share stocks, funds, bonds, and stock options, and even extends to Hong Kong and US markets. In addition to the core library, openctp supplies Python bindings for CTPAPI and stock options APIs, making it easier to build strategies, tools, and analytics in Python. It also develops full featured and lightweight trading clients like TickTrader, TickTraderMini, and ViTrader, which support multiple desks and markets.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    WhisperSpeech

    WhisperSpeech

    An Open Source text-to-speech system built by inverting Whisper

    WhisperSpeech is an open-source text-to-speech system created by “inverting” OpenAI’s Whisper, reusing its strengths as a semantic audio model to generate speech instead of only transcribing it. The project aims to be for speech what Stable Diffusion is for images: powerful, hackable, and safe for commercial use, with code under Apache-2.0/MIT and models trained only on properly licensed data. Its architecture follows a token-based, multi-stage pipeline inspired by AudioLM and SPEAR-TTS:...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 21
    OuteTTS

    OuteTTS

    Interface for OuteTTS models

    ...It provides a high-level Interface API that wraps model configuration, speaker handling, and audio generation so you can focus on integrating speech into your application rather than wiring up low-level engines. The project supports multiple backends including llama.cpp (Python bindings and server), Hugging Face Transformers, ExLlamaV2, VLLM and a JavaScript interface via Transformers.js, allowing it to run on CPUs, NVIDIA CUDA GPUs, AMD ROCm, Vulkan-capable GPUs, and Apple Metal. It also includes a notion of speaker profiles: you can create a speaker from a short audio sample, save it as JSON, and reuse it for consistent voice identity across generations and sessions. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    GLM-TTS

    GLM-TTS

    Controllable & emotion-expressive zero-shot TTS

    GLM-TTS is an advanced text-to-speech synthesis system built on large language model technologies that focuses on producing high-quality, expressive, and controllable spoken output, including features like emotion modulation and zero-shot voice cloning. It uses a two-stage architecture where a generative LLM first converts text into intermediate speech token sequences and then a Flow-based neural model converts those tokens into natural audio waveforms, enabling rich prosody and voice...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 23
    VideoChat

    VideoChat

    Real-time voice interactive digital human

    ...It supports both pure end-to-end voice solutions based on multimodal large language models (GLM-4-Voice feeding directly into talking-head generation) and a more traditional cascaded pipeline using ASR → LLM → TTS → talking head. It is built as a Gradio Python demo, exposing a web interface where users can talk to an animated avatar that lip-syncs to synthesized speech while responding intelligently. The system is customizable: you can define your own avatar appearance and voice, and it supports voice cloning so you can generate a new voice from a short 3–10 second reference sample. The tech stack integrates FunASR for speech recognition, Qwen for language understanding, multiple TTS engines like GPT-SoVITS, CosyVoice, or edge-tts, and MuseTalk for talking-head generation.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    Supertonic

    Supertonic

    Lightning-fast, on-device TTS, running natively via ONNX

    ...Supertonic is designed to handle real-world text gracefully, including numbers, dates, currency symbols, abbreviations, and technical units, without requiring heavy pre-processing or custom text normalization. The repository provides complete reference implementations across many programming ecosystems—Python, Node.js, browser (WebGPU/WASM), Java, C++, C#, Go, Swift, iOS, Rust, and Flutter.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 25
    FireRedTTS-2

    FireRedTTS-2

    Long-form streaming TTS system for multi-speaker dialogue generation

    FireRedTTS2 is a next-generation open-source text-to-speech (TTS) system focused on long-form, streaming speech synthesis for multi-speaker dialogue, delivering stable natural speech with context-aware prosody and reliable speaker transitions that support real-time and conversational applications. It features a specialized streaming speech tokenizer and a dual-transformer architecture that enables low latency and high-quality synthesis, making it suitable for interactive systems like...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • Next
MongoDB Logo MongoDB