Showing 8 open source projects for "livekit-cli"

  • 1
    gTTS

    Python library and CLI tool to interface with Google Translate's text-to-speech API

    ...It supports customizable text pre-processors, which can correct pronunciations, tweak formatting, or handle domain-specific vocabulary before sending it to the API. gTTS is primarily aimed at developers who want a quick way to add cloud-backed speech to scripts, apps, or pipelines without managing any model weights locally. A small CLI utility, gtts-cli, makes it easy to test or batch-generate MP3 files right from the shell.
    Downloads: 5 This Week
  • 2
    edge-tts

    Use Microsoft Edge's online text-to-speech service from Python

    ...From the CLI you can adjust parameters such as speaking rate, volume, and pitch, giving you some control over prosody without diving into SSML. The library is asynchronous under the hood, which makes it efficient for batch jobs or web services that need to synthesize many utterances concurrently.
    Downloads: 36 This Week
  • 3
    TTS WebUI

    A single Gradio + React WebUI with extensions for ACE-Step

    ...It offers both a Gradio backend and an optional React frontend, which can be accessed on separate ports and even run inside Docker for more reproducible deployments. An extension system lets you enable extra models and tools, install community extensions from a catalog, and manage them via a dedicated GUI or CLI extension manager.
    Downloads: 12 This Week
  • 4
    ebook2audiobook

    Generate audiobooks from e-books, voice cloning & 1107+ languages

    ebook2audiobook is a tool to convert legally obtained eBooks (non-DRM) into fully narrated audiobooks, complete with chapters and metadata. It automates the pipeline: it reads the eBook file, splits it into appropriate segments (chapters, paragraphs), uses text-to-speech (TTS) models to synthesize audio, optionally applies voice cloning, and outputs a final audiobook — ideal for people who prefer listening over reading, or for accessibility purposes. The tool supports a wide array of...
    Downloads: 33 This Week
  • 5
    VoxCPM

    TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning

    VoxCPM is a tokenizer-free text-to-speech system that models speech in a continuous space, aiming for extremely realistic, context-aware synthesis and true-to-life zero-shot voice cloning. Instead of converting speech into discrete tokens, it uses an end-to-end diffusion-autoregressive architecture built on the MiniCPM-4 backbone, combining hierarchical language modeling, finite scalar quantization (FSQ), and local Diffusion Transformers. This design helps decouple semantic and acoustic...
    Downloads: 10 This Week
  • 6
    MLX-Audio

    A text-to-speech, speech-to-text and speech-to-speech library

    ...Because it uses MLX and targets Apple Silicon, inference is fast and can take advantage of hardware acceleration and quantization for efficient on-device performance. The project provides a straightforward CLI (mlx_audio.tts.generate) as well as a Python API for programmatic generation of audio, including parameters for voice choice, speed, language hints, output format, and sample rate. It includes examples such as audiobook generation to demonstrate long-form synthesis and joined audio segments. On top of that, MLX-Audio offers a modern web interface powered by FastAPI, with real-time waveform and 3D visualizations, file upload, and audio management.
    Downloads: 3 This Week
  • 7
    Matcha-TTS

    A fast TTS architecture with conditional flow matching

    Matcha-TTS is a non-autoregressive neural text-to-speech architecture that uses conditional flow matching to generate speech quickly while maintaining natural quality. It models speech as an ODE-based generative process, and conditional flow matching lets it reach high-quality audio in only a few synthesis steps, which greatly reduces latency compared to score-matching diffusion approaches. The model is fully probabilistic, so it can generate diverse realizations of the same text while still...
    Downloads: 0 This Week
  • 8
    Mocking Bird

    Clone a voice in 5 seconds to generate arbitrary speech in real-time

    MockingBird is an open-source voice cloning and real-time speech generation toolkit that lets you clone a speaker’s voice from a short audio sample (reportedly as little as 5 seconds) and then synthesize arbitrary speech in that voice. It builds on deep-learning-based TTS and voice-cloning technology (in the lineage of projects such as Real-Time-Voice-Cloning), but extends it with support for Mandarin Chinese and multiple Chinese speech datasets, broadening its applicability beyond English....
    Downloads: 2 This Week