Showing 16 open source projects for "unit-api"

View related business solutions
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • Build Agents and Models on One Platform Icon
    Build Agents and Models on One Platform

    Everything you need to build production-ready agents and models. Access 200+ Google and third-party AI models and tools.

    Gemini Enterprise Agent Platform is Google Cloud's comprehensive platform for developers to build, scale, govern, and optimize agents and models. Choose from Google's most advanced models and third-party models like Anthropic's Claude Model Family.
    Try It Free
  • 1
    kokoro-onnx

    kokoro-onnx

    TTS with kokoro and onnx runtime

    kokoro-onnx is a text-to-speech toolkit that wraps the Kokoro neural TTS model in an easy-to-use ONNX Runtime interface, so you can generate speech from Python with minimal setup. It focuses on running efficiently on commodity hardware, including macOS with Apple Silicon, while still delivering near real-time performance for many use cases. The project ships prebuilt model files and a simple example script, so you can go from installation to producing an audio.wav file in just a few steps....
    Downloads: 284 This Week
    Last Update:
    See Project
  • 2
    OmniVoice

    OmniVoice

    High-Quality Voice Cloning TTS for 600+ Languages

    ...The system also includes advanced features like non-verbal expression tags and pronunciation overrides, enabling expressive and precise output. With support for both API-based and command-line usage, it is designed for research, production, and experimentation alike.
    Downloads: 25 This Week
    Last Update:
    See Project
  • 3
    FastKoko

    FastKoko

    Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model

    ...It is designed to be easy to deploy via Docker, with separate CPU and GPU images so that users can choose between pure CPU inference and NVIDIA GPU acceleration. The project exposes an OpenAI-compatible speech endpoint, which means existing code that talks to the OpenAI audio API can often be pointed at a Kokoro-FastAPI instance with minimal changes. It supports multiple languages and voicepacks and allows phoneme based generation for more accurate pronunciation and prosody. The server also offers per-word timestamped captions, which makes it useful for creating subtitles or aligning audio with text. A built in web UI, API documentation, and debug endpoints for monitoring system status help users explore voices, test requests, and integrate the service into larger systems.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 4
    Pocket TTS

    Pocket TTS

    A TTS that fits in your CPU (and pocket)

    ...Because it is CPU-oriented, it fits well in server environments where GPU access is limited, in desktop apps, or in edge deployments where simplicity matters more than maximum throughput. It also emphasizes developer ergonomics, providing a straightforward API surface that can be integrated into pipelines, assistants, accessibility tools, or batch generation scripts.
    Downloads: 5 This Week
    Last Update:
    See Project
  • Stop vibe-debugging. Icon
    Stop vibe-debugging.

    Plug Claude into your app's actual errors.

    AppSignal's MCP server hands Claude, Cursor, or Zed your real errors, traces, and the deploy that shipped them. AI writes the fix; you review the diff.
    Free 30 days.
  • 5
    Speech-AI-Forge

    Speech-AI-Forge

    Speech-AI-Forge is a project developed around TTS generation model

    Speech-AI-Forge is a full-stack project built around modern text-to-speech generation models, providing both an API server and a Gradio-based web UI for interactive use. At its core, it acts as a hub that wires together multiple speech-related capabilities, including TTS, speech-to-text and LLM-based control flows, behind a consistent interface. The system is designed to be deployed in several ways: you can try it online via hosted demos, spin it up in a one-click Colab environment, run it in Docker containers, or set it up locally with its environment preparation scripts. ...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 6
    Fish Audio Python SDK

    Fish Audio Python SDK

    The official Python library for the Fish Audio API

    Fish Audio Python is the official Python SDK for working with the Fish Audio platform. It gives developers a programmatic way to access Fish Audio features such as text-to-speech generation, audio playback, saving output files, and API-based voice workflows. The package is designed for Python applications that need speech generation without manually handling raw HTTP requests. It supports synchronous usage for simple scripts and provides utilities that make generated audio easier to play or export. The SDK is useful for developers building voice products, prototypes, assistants, content tools, or automated audio pipelines. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Kitten TTS

    Kitten TTS

    State-of-the-art TTS model under 25MB

    KittenTTS is an open-source, ultra-lightweight, and high-quality text-to-speech model featuring just 15 million parameters and a binary size under 25 MB. It is designed for real-time CPU-based deployment across diverse platforms. Ultra-lightweight, model size less than 25MB. CPU-optimized, runs without GPU on any device. High-quality voices, several premium voice options available. Fast inference, optimized for real-time speech synthesis.
    Downloads: 12 This Week
    Last Update:
    See Project
  • 8
    OuteTTS

    OuteTTS

    Interface for OuteTTS models

    OuteTTS is an interface library for running OuteTTS text-to-speech models across a range of backends, making it easier to deploy the same model on different hardware and runtimes. It provides a high-level Interface API that wraps model configuration, speaker handling, and audio generation so you can focus on integrating speech into your application rather than wiring up low-level engines. The project supports multiple backends including llama.cpp (Python bindings and server), Hugging Face Transformers, ExLlamaV2, VLLM and a JavaScript interface via Transformers.js, allowing it to run on CPUs, NVIDIA CUDA GPUs, AMD ROCm, Vulkan-capable GPUs, and Apple Metal. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 9
    gTTS

    gTTS

    Python library and CLI tool to interface with Google Translate

    ...The library is designed to handle long texts, using a speech-specific sentence tokenizer that keeps intonation and punctuation natural while splitting requests into acceptable chunks. It supports customizable text pre-processors, which can correct pronunciations, tweak formatting, or handle domain-specific vocabulary before sending it to the API. gTTS is primarily aimed at developers who want a quick way to add cloud-backed speech to scripts, apps, or pipelines without managing any model weights locally. A small CLI utility, gtts-cli, makes it easy to test or batch-generate MP3 files right from the shell.
    Downloads: 1 This Week
    Last Update:
    See Project
  • Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure Icon
    Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure

    Native application identity and user-based security for your Azure cloud

    Gain integrated visibility across all traffic in a single pass. Deploy Palo Alto Networks VM-Series to determine application identity and content while automating security policy updates via rich APIs.
    Get a free trial
  • 10
    Sopro TTS

    Sopro TTS

    A lightweight text-to-speech model with zero-shot voice cloning

    ...The model is designed to work with a small set of dependencies and to be accessible for developers who want offline TTS with customizable voice style, including options for streaming or non-streaming generation modes. Users can install it with standard Python tools, run a demo server locally, and experiment with CLI or Python API usage for producing synthetic speech.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    NeuTTS Air

    NeuTTS Air

    NeuTTS model built from small LLM backbones

    NeuTTS Air is an open-source collection of on-device text-to-speech speech language models from Neuphonic. It is built for natural-sounding voice generation that can run locally instead of relying on a remote web API. The project emphasizes instant voice cloning, real-time performance, and deployment on smaller devices such as phones, laptops, and Raspberry Pi-class hardware. Its LLM-based architecture is intended to bring more expressive and flexible speech generation to local applications. NeuTTS is especially useful for embedded voice agents, private assistants, toys, accessibility tools, and compliance-sensitive apps. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 12
    MLX-Audio

    MLX-Audio

    A text-to-speech, speech-to-text and speech-to-speech library

    ...Because it uses MLX and targets Apple Silicon, inference is fast and can take advantage of hardware acceleration and quantization for efficient on-device performance. The project provides a straightforward CLI (mlx_audio.tts.generate) as well as a Python API for programmatic generation of audio, including parameters for voice choice, speed, language hints, output format, and sample rate. It includes examples such as audiobook generation to demonstrate long-form synthesis and joined audio segments. On top of that, MLX-Audio offers a modern web interface powered by FastAPI, with real-time waveform and 3D visualizations, file upload, and audio management.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 13
    NeuTTS Nano

    NeuTTS Nano

    On-device TTS model by Neuphonic

    NeuTTS Nano is an open-source collection of on-device text-to-speech speech language models from Neuphonic. It is built for natural-sounding voice generation that can run locally instead of relying on a remote web API. The project emphasizes instant voice cloning, real-time performance, and deployment on smaller devices such as phones, laptops, and Raspberry Pi-class hardware. Its LLM-based architecture is intended to bring more expressive and flexible speech generation to local applications. NeuTTS is especially useful for embedded voice agents, private assistants, toys, accessibility tools, and compliance-sensitive apps. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    RealtimeTTS

    RealtimeTTS

    Converts text to speech in realtime

    RealtimeTTS is a low-latency text-to-speech library built for real-time applications such as voice chat with LLMs, assistants, and interactive tools. It is designed around a streaming model: you can feed it text incrementally (for example, as an LLM responds) and get audio output almost immediately, which keeps end-to-end latency very low. The library is engine-agnostic and plugs into a wide range of cloud and local TTS systems, including OpenAI, ElevenLabs, Azure, Coqui, Piper, StyleTTS2,...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    VALL-E X

    VALL-E X

    Open source implementation of Microsoft's VALL-E X zero-shot TTS model

    VALL-E-X is an open-source implementation of Microsoft’s VALL-E X zero-shot text-to-speech model, focused on multilingual, cross-lingual voice cloning. It is capable of synthesizing speech in English, Chinese, and Japanese from text while mimicking the voice characteristics of a speaker given only a short 3–10 second prompt. The model attempts to match not just timbre, but also tone, pitch, emotion, and prosody of the reference audio, resulting in highly personalized output. VALL-E-X...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Dia-1.6B

    Dia-1.6B

    Dia-1.6B generates lifelike English dialogue and vocal expressions

    Dia-1.6B is a 1.6 billion parameter text-to-speech model by Nari Labs that generates high-fidelity dialogue directly from transcripts. Designed for realistic vocal performance, Dia supports expressive features like emotion, tone control, and non-verbal cues such as laughter, coughing, or sighs. The model accepts speaker conditioning through audio prompts, allowing limited voice cloning and speaker consistency across generations. It is optimized for English and built for real-time performance...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
Auth0 Logo