• Custom VMs From 1 to 96 vCPUs With 99.95% Uptime Icon
    Custom VMs From 1 to 96 vCPUs With 99.95% Uptime

    General-purpose, compute-optimized, or GPU/TPU-accelerated. Built to your exact specs.

    Live migration and automatic failover keep workloads online through maintenance. One free e2-micro VM every month.
    Try Free
  • Build Agents and Models on One Platform Icon
    Build Agents and Models on One Platform

    Everything you need to build production-ready agents and models. Access 200+ Google and third-party AI models and tools.

    Gemini Enterprise Agent Platform is Google Cloud's comprehensive platform for developers to build, scale, govern, and optimize agents and models. Choose from Google's most advanced models and third-party models like Anthropic's Claude Model Family.
    Try It Free
  • 1
    clone-voice

    clone-voice

    A sound cloning tool with a web interface, using your voice

    Clone-voice is a local voice-cloning tool that lets you synthesize speech in any target voice or convert one recording into another voice using the same timbre. It is built around Coqui’s XTTS-v2 model, so it inherits multilingual support and modern neural TTS quality while wrapping it in a user-friendly desktop workflow. The app is designed to be very easy to use: you download a precompiled package, double-click app.exe, and it launches a browser-based web interface where you control cloning and synthesis. ...
    Downloads: 11 This Week
    Last Update:
    See Project
  • 2
    NeuTTS Air

    NeuTTS Air

    NeuTTS model built from small LLM backbones

    ...The project emphasizes instant voice cloning, real-time performance, and deployment on smaller devices such as phones, laptops, and Raspberry Pi-class hardware. Its LLM-based architecture is intended to bring more expressive and flexible speech generation to local applications. NeuTTS is especially useful for embedded voice agents, private assistants, toys, accessibility tools, and compliance-sensitive apps. Its main value is making modern voice AI more portable, private, and practical for local deployment.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    NeuTTS Nano

    NeuTTS Nano

    On-device TTS model by Neuphonic

    ...The project emphasizes instant voice cloning, real-time performance, and deployment on smaller devices such as phones, laptops, and Raspberry Pi-class hardware. Its LLM-based architecture is intended to bring more expressive and flexible speech generation to local applications. NeuTTS is especially useful for embedded voice agents, private assistants, toys, accessibility tools, and compliance-sensitive apps. Its main value is making modern voice AI more portable, private, and practical for local deployment.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    Pocket TTS

    Pocket TTS

    A TTS that fits in your CPU (and pocket)

    Pocket TTS is a lightweight text-to-speech project designed to run efficiently on CPUs, targeting developers who want local speech generation without depending on GPUs or hosted web APIs. It is built to feel practical in everyday applications, where installation and usage should be as simple as adding a dependency and calling a function. The project focuses on keeping the runtime footprint manageable while still producing natural-sounding speech, which makes it attractive for offline tools, prototypes, and privacy-sensitive workflows. ...
    Downloads: 11 This Week
    Last Update:
    See Project
  • Host LLMs in Production With On-Demand GPUs Icon
    Host LLMs in Production With On-Demand GPUs

    NVIDIA L4 GPUs. 5-second cold starts. Scale to zero when idle.

    Deploy your model, get an endpoint, pay only for compute time. No GPU provisioning or infrastructure management required.
    Try Free
  • 5
    MOSS-TTS-Nano

    MOSS-TTS-Nano

    MOSS-TTS-Nano is an open-source multilingual tiny speech generation

    ...It supports multilingual voice cloning and produces high-fidelity audio with low latency. The system uses an autoregressive audio tokenization pipeline to generate natural-sounding speech. It is suitable for local applications, web services, and embedded systems. Overall, it brings advanced speech synthesis capabilities to lightweight and accessible environments.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 6
    Miso TTS

    Miso TTS

    Miso TTS is an 8 billion, highly emotive text-to-speech model

    ...The model supports both standard speech synthesis and voice-conditioned generation using optional audio prompts for voice cloning. Miso TTS generates Mimi audio codes and can leverage conversation history to create more contextually aware and realistic dialogue. Designed for local deployment, it offers watermarking by default to help promote responsible use of generated audio. With its focus on emotive speech generation, Miso TTS delivers state-of-the-art performance for AI voice applications, virtual assistants, and conversational AI experiences.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 7
    WhisperLive

    WhisperLive

    A nearly-live implementation of OpenAI's Whisper

    WhisperLive is a “nearly live” implementation of OpenAI’s Whisper model focused on real-time transcription. It runs as a server–client system in which the server hosts a Whisper backend and clients stream audio to be transcribed with very low delay. The project supports multiple inference backends, including Faster-Whisper, NVIDIA TensorRT, and OpenVINO, allowing you to target GPUs and different CPU architectures efficiently. It can handle microphone input, pre-recorded audio files, and...
    Downloads: 27 This Week
    Last Update:
    See Project
  • 8
    OpenVoice

    OpenVoice

    Instant voice cloning by MIT and MyShell. Audio foundation model

    OpenVoice is a versatile instant voice cloning system that can replicate a speaker’s tone color from just a short audio clip and then generate speech in multiple languages. It is designed not only to match the timbre of the reference voice, but also to give granular control over style parameters such as emotion, accent, rhythm, pauses, and intonation. The model supports cross-lingual and even zero-shot cross-lingual voice cloning, so a speaker recorded in one language can be made to speak...
    Downloads: 26 This Week
    Last Update:
    See Project
  • 9
    Fish Speech

    Fish Speech

    SOTA Open Source TTS

    Fish Speech is a state-of-the-art open-source text-to-speech project that has evolved into the OpenAudio series of advanced TTS models. The repository hosts the code and tooling for training, fine-tuning, and serving high-quality TTS, while the current flagship models (OpenAudio-S1 and S1-mini) are distributed via Fish Audio’s playground and Hugging Face. The models are evaluated with Seed TTS metrics and achieve exceptionally low word and character error rates, indicating strong...
    Downloads: 30 This Week
    Last Update:
    See Project
  • 99.99% Uptime for MySQL and PostgreSQL Databases Icon
    99.99% Uptime for MySQL and PostgreSQL Databases

    Sub-second maintenance. 2x read/write performance. Built-in vector search for AI apps.

    Cloud SQL Enterprise Plus delivers near-zero downtime with 35 days of point-in-time recovery. Supports MySQL, PostgreSQL, and SQL Server.
    Try Free
  • 10
    RealtimeTTS

    RealtimeTTS

    Converts text to speech in realtime

    ...It is designed around a streaming model: you can feed it text incrementally (for example, as an LLM responds) and get audio output almost immediately, which keeps end-to-end latency very low. The library is engine-agnostic and plugs into a wide range of cloud and local TTS systems, including OpenAI, ElevenLabs, Azure, Coqui, Piper, StyleTTS2, Edge TTS, Google TTS, system TTS and others, so you can swap providers without rewriting your pipeline. It supports both internet-based engines and fully local engines, which lets you choose between privacy, cost, and quality trade-offs. RealtimeTTS also includes robustness features such as automatic fallbacks when a backend fails, so production systems can stay responsive even if one TTS provider is temporarily unavailable.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 11
    Audiblez

    Audiblez

    Generate audiobooks from e-books

    Audiblez is a tool for generating high-quality .m4b audiobooks directly from .epub e-books using the Kokoro-82M neural text-to-speech model. It focuses on making audiobook creation easy and fast: from a single command, the tool splits an e-book into chapters, synthesizes audio for each section, and then merges the results into a structured audiobook with chapter-based WAV files and a final .m4b container. The Kokoro-82M model it uses is compact (82M parameters) yet natural sounding, trained...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 12
    FastKoko

    FastKoko

    Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model

    FastKoko is a self-hosted text-to-speech server built around the Kokoro-82M model and exposed through a FastAPI backend. It is designed to be easy to deploy via Docker, with separate CPU and GPU images so that users can choose between pure CPU inference and NVIDIA GPU acceleration. The project exposes an OpenAI-compatible speech endpoint, which means existing code that talks to the OpenAI audio API can often be pointed at a Kokoro-FastAPI instance with minimal changes. It supports multiple...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 13
    ChatTTS_colab

    ChatTTS_colab

    One-click deployment (including offline integration package)

    ChatTTS_colab is a wrapper project around the ChatTTS model that focuses on “one-click” deployment, especially in Google Colab. It provides an integrated offline bundle and scripts for Windows and macOS so users can run ChatTTS locally without wrestling with complex environment setup. The repository includes Colab notebooks that launch a Gradio-based web UI and expose streaming TTS, making it possible to listen to generated audio as it is produced. A distinctive feature is the “voice gacha”...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 14
    MLX-Audio

    MLX-Audio

    A text-to-speech, speech-to-text and speech-to-speech library

    MLX-Audio is a speech library built on Apple’s MLX framework and optimized for Apple Silicon machines (M-series Macs). It focuses on text-to-speech and speech-to-speech workflows, with APIs and a command-line interface that make it easy to generate high-quality audio from text. Because it uses MLX and targets Apple Silicon, inference is fast and can take advantage of hardware acceleration and quantization for efficient on-device performance. The project provides a straightforward CLI...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    Speech-AI-Forge

    Speech-AI-Forge

    Speech-AI-Forge is a project developed around TTS generation model

    Speech-AI-Forge is a full-stack project built around modern text-to-speech generation models, providing both an API server and a Gradio-based web UI for interactive use. At its core, it acts as a hub that wires together multiple speech-related capabilities, including TTS, speech-to-text and LLM-based control flows, behind a consistent interface. The system is designed to be deployed in several ways: you can try it online via hosted demos, spin it up in a one-click Colab environment, run it...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
Auth0 Logo