35 projects for "speech engine" with 1 filter applied:

  • Stop Storing Third-Party Tokens in Your Database Icon
    Stop Storing Third-Party Tokens in Your Database

    Auth0 Token Vault handles secure token storage, exchange, and refresh for external providers so you don't have to build it yourself.

    Rolling your own OAuth token storage can be a security liability. Token Vault securely stores access and refresh tokens from federated providers and handles exchange and renewal automatically. Connected accounts, refresh exchange, and privileged worker flows included.
    Try Auth0 for Free
  • Go from Code to Production URL in Seconds Icon
    Go from Code to Production URL in Seconds

    Cloud Run deploys apps in any language instantly. Scales to zero. Pay only when code runs.

    Skip the Kubernetes configs. Cloud Run handles HTTPS, scaling, and infrastructure automatically. Two million requests free per month.
    Try it free
  • 1
    kokoro-onnx

    kokoro-onnx

    TTS with kokoro and onnx runtime

    kokoro-onnx is a text-to-speech toolkit that wraps the Kokoro neural TTS model in an easy-to-use ONNX Runtime interface, so you can generate speech from Python with minimal setup. It focuses on running efficiently on commodity hardware, including macOS with Apple Silicon, while still delivering near real-time performance for many use cases. The project ships prebuilt model files and a simple example script, so you can go from installation to producing an audio.wav file in just a few steps....
    Downloads: 327 This Week
    Last Update:
    See Project
  • 2
    RealtimeSTT

    RealtimeSTT

    A robust, efficient, low-latency speech-to-text library

    RealtimeSTT is a Python-based realtime speech-to-text engine emphasizing low latency, wake-word detection, voice activity detection, and automatic speech segmentation. It provides asynchronous callbacks, nanosecond-precision timestamps, and CLI tools, suitable for building voice assistants, meeting transcribers, or live caption systems.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 3
    AI Runner

    AI Runner

    Offline inference engine for art, real-time voice conversations

    AI Runner is an offline inference engine designed to run a collection of AI workloads on your own machine, including image generation for art, real-time voice conversations, LLM-powered chatbots and automated workflows. It is implemented as a desktop-oriented Python application and emphasizes privacy and self-hosting, allowing users to work with text-to-speech, speech-to-text, text-to-image and multimodal models without sending data to external services.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 4
    ESPnet

    ESPnet

    End-to-end speech processing toolkit

    ESPnet is a comprehensive end-to-end speech processing toolkit covering a wide spectrum of tasks, including automatic speech recognition (ASR), text-to-speech (TTS), speech translation (ST), speech enhancement, speaker diarization, and spoken language understanding. It uses PyTorch as its deep learning engine and adopts a Kaldi-style data processing pipeline for features, data formats, and experimental recipes.
    Downloads: 0 This Week
    Last Update:
    See Project
  • AI-powered service management for IT and enterprise teams Icon
    AI-powered service management for IT and enterprise teams

    Enterprise-grade ITSM, for every business

    Give your IT, operations, and business teams the ability to deliver exceptional services—without the complexity. Maximize operational efficiency with refreshingly simple, AI-powered Freshservice.
    Try it Free
  • 5
    EasyVoice

    EasyVoice

    Open source text-to-speech tool, supports extra-long text

    easyVoice is an open-source text-to-speech platform aimed at turning long-form text and novels into high-quality audio, with a strong focus on usability and scalability. It provides a web interface where users can paste or upload large texts and generate speech and subtitles in a single workflow, even for works exceeding 100,000 characters. The system supports multi-role voice acting, letting users assign different neural voices to different characters or narrative roles and configure...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 6
    Qwen3-ASR

    Qwen3-ASR

    Qwen3-ASR is an open-source series of ASR models

    Qwen3-ASR is an automatic speech recognition system in the QwenLM family, developed to convert spoken language into text with strong accuracy and real-time performance. As a specialized ASR variant of the broader Qwen language model ecosystem, it focuses on capturing reliable transcriptions from audio sources such as recordings, live streams, or conversational inputs while supporting low latency use cases. The architecture combines advanced neural acoustic modeling with context-aware...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Luna AI

    Luna AI

    Virtual AI anchor that combines state-of-the-art technology

    ...It is built around a core assistant persona called “Luna AI,” which can be driven by a wide range of large language models and platforms, including GPT-style APIs, Claude, LangChain-based backends, ChatGLM, Kimi, Ollama, and many others. The project supports multiple rendering backends for the avatar, such as Live2D, Unreal Engine (UE), and “xuniren,” and can output to streaming platforms like Bilibili, Douyin, Kuaishou, WeChat Channels, Pinduoduo, Douyu, YouTube, Twitch, and TikTok. For voice, it integrates with numerous TTS engines (Edge-TTS, VITS-Fast, ElevenLabs, VALL-E-X, OpenVoice, GPT-SoVITS, Azure TTS, fish-speech, ChatTTS, CosyVoice, F5-TTS, MultiTTS, MeloTTS, and others), and can optionally pass the output through voice conversion systems like so-vits-svc or DDSP-SVC to change timbre.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    EmotiVoice

    EmotiVoice

    Multi-Voice and Prompt-Controlled TTS Engine

    EmotiVoice is a multi-voice, prompt-controlled text-to-speech engine designed to generate highly expressive speech across thousands of voices. It supports both English and Chinese and ships with over 2,000 preset voices, making it suitable for everything from characters and virtual anchors to narration and dialogue. The core idea is prompt-based emotional and style control: you can ask the engine to speak “happy,” “sad,” “excited,” or with other high-level style prompts that shape prosody, pitch, speed, and energy. ...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 9
    ekho

    ekho

    Chinese text-to-speech engine

    ekho is a project with relatively sparse documentation, but from the repository it appears to be a small-scale tool for audio processing and playback, possibly with features for speech synthesis or manipulation. The repo includes scripts and configuration files suggesting interactions with media/audio handling libraries. Because of limited README detail, it seems targeted at users comfortable reading and modifying code, rather than end users expecting polished UIs. The code structure implies...
    Downloads: 1 This Week
    Last Update:
    See Project
  • Custom VMs From 1 to 96 vCPUs With 99.95% Uptime Icon
    Custom VMs From 1 to 96 vCPUs With 99.95% Uptime

    General-purpose, compute-optimized, or GPU/TPU-accelerated. Built to your exact specs.

    Live migration and automatic failover keep workloads online through maintenance. One free e2-micro VM every month.
    Try Free
  • 10

    AhoTTS - TTS for Basque and Spanish

    Text-to-Speech for Basque and Spanish

    Text-to-Speech conversor for Basque and Spanish. It includes linguistic processing and built voices for the languages aforementioned. Its acoustic engine is based on hts_engine and it uses a high quality vocoder called AhoCoder. Developed by Aholab Signal Processing Laboratory: https://aholab.ehu.es/aholab/ http://aholab.ehu.es/ahocoder/
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    TensorFlowTTS

    TensorFlowTTS

    Real-Time State-of-the-art Speech Synthesis for Tensorflow 2

    TensorFlowTTS is a state-of-the-art, open-source speech synthesis library built on TensorFlow 2. It offers a variety of architectures for text-to-speech, including classic and modern models such as Tacotron‑2, FastSpeech / FastSpeech2, and neural vocoders like MelGAN and Multiband‑MelGAN. Because it’s based on TensorFlow 2, it can leverage optimizations such as fake-quantization aware training and pruning — which allow models to run faster than real time and to be deployable on mobile or embedded platforms. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    Live Transcribe Speech Engine

    Live Transcribe Speech Engine

    Live Transcribe is an Android application

    Live Transcribe Speech Engine provides on-device speech recognition components that power real-time transcription for accessibility and everyday voice-first experiences. Its design prioritizes latency and robustness in noisy, far-field environments, enabling continuous transcription with low delay on mobile hardware. The engine manages audio front-end processing—such as noise suppression and voice activity detection—before feeding audio into compact, accurate acoustic and language models. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13

    SmartBody

    Character animation system for games and simulations.

    SmartBody is available for download for Windows, Linux and OSX users. SmartBody can also be used on Android and iOS platforms. SmartBody is a character animation platform that provides the following capabilities in real time: * Locomotion (walk, jog, run, turn, strafe, jump, etc.) * Steering - avoiding obstacles and moving objects * Object manipulation - reach, grasp, touch , pick up objects * Lip Syncing - characters can speak with simultaneous lip-sync using text-to-speech or...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 14

    AhoTTS Multilingual, a Multilingual TTS

    Text-to-Speech TTS for Basque, Spanish, Catalan, Galician and English

    Text-to-Speech conversor for Basque, Spanish, Catalan, Galician and English. It includes linguistic processing and built voices for all the languages aforementioned. Its acoustic engine is based on hts_engine and it uses a high quality vocoder called AhoCoder. Developed by Aholab Signal Processing Laboratory: https://aholab.ehu.es/aholab/ http://aholab.ehu.es/ahocoder/
    Downloads: 3 This Week
    Last Update:
    See Project
  • 15
    eSpeak: speech synthesis
    Text to Speech engine for English and many other languages. Compact size with clear but artificial pronunciation. Available as a command-line program with many options, a shared library for Linux, and a Windows SAPI5 version.
    Leader badge
    Downloads: 1,753 This Week
    Last Update:
    See Project
  • 16
    emofilt - emotional speech synthesis
    PROJECT DEVELOPMENT MOVED TO GITHUB! EmoFilt enables the free-for-non-commercial-use speech synthesis engine MBROLA to sound emotional by manipulating the phonetic description. It does so by modifying melody and rhythm of the speech, matching a target emotion. It is available for 34 languag
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17

    JAVIER

    JAvascript Voicexml InterpretER

    JAVIER is a JAvascript Voicexml InterpretER, designed (but not restricted) to run inside a web browser, its main engine has less than 1000 lines of code. It's maybe, the tiniest but (almost) FULL VoiceXML implementation.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    The project provides a ready-to-use interface for the julius CSR engine for a handicapped child which is not able to use the keyboard well. It integrates into X11 and Windows. Find out how you can help: http://simon-listens.org/index.php?support
    Downloads: 6 This Week
    Last Update:
    See Project
  • 19

    Mixed Excitation hts_engine API

    Adds mixed excitation to the hts_engine API

    Adds mixed excitation to the hts_engine API
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    SourceSquare

    SourceSquare

    The free open source scanning engine

    SourceSquare is a free open source scanning solution, based on Antepedia, the world's largest open source knwoledge base, including more than 1,000,000 open source projects from the main forges. SourceSquare is a free (as in free beer) and free (as in free speech, licensed under GNU Affero Public Licence V3) product, which scans a folder on your hard drive and displays a treemap that shows which part of your source code is open source.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    FreeTTS is a speech synthesis engine written entirely in the Java(tm) programming language. FreeTTS was written by the Sun Microsystems Laboratories Speech Team and is based on CMU's Flite engine. FreeTTS also includes a partial JSAPI 1.0
    Downloads: 40 This Week
    Last Update:
    See Project
  • 22
    Open Path is an open source, browser-based, online RPG. It aims to be the best game engine available, boasting many inbuilt features that most frameworks don't have. It is fast, powerful, robust, and free (speech and beer).
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    CJ7 is an open-source speech recognition engine.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    DawNLITE is a Natural-Language-based Image Transmoding Engine. The software transforms an image to a video as recorded by a virtual camera panning and zooming over the image, following a natural language text description of the image.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    The jTTS project provides a refactored implementation of the FreeTTS software speech synthesizer. Rather than building upon the flite implementation of a TTS engine, it implements engine capabilities from Festival. It also supports British English.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • Next