Showing 39 open source projects for "s-tools"

  • 1
    Readest

    Readest is a modern, feature-rich ebook reader

    ...Because of that, it's oriented towards learners, researchers, or people dealing with multilingual documents — especially when they need to rapidly digest or reference large amounts of text. The design seems to prioritize flexible input formats, possibly OCR or uploaded documents, and interactive tools to navigate or annotate them.
    Downloads: 30 This Week
  • 2
    TTS WebUI

    A single Gradio + React WebUI with extensions for ACE-Step

    ...It offers both a Gradio backend and an optional React frontend, which can be accessed on separate ports and even run inside Docker for more reproducible deployments. An extension system lets you enable extra models and tools, install community extensions from a catalog, and manage them via a dedicated GUI or CLI extension manager.
    Downloads: 3 This Week
  • 3
    Pocket TTS

    A TTS that fits in your CPU (and pocket)

    ...It is built to feel practical in everyday applications, where installation and usage should be as simple as adding a dependency and calling a function. The project focuses on keeping the runtime footprint manageable while still producing natural-sounding speech, which makes it attractive for offline tools, prototypes, and privacy-sensitive workflows. Because it is CPU-oriented, it fits well in server environments where GPU access is limited, in desktop apps, or in edge deployments where simplicity matters more than maximum throughput. It also emphasizes developer ergonomics, providing a straightforward API surface that can be integrated into pipelines, assistants, accessibility tools, or batch generation scripts.
    Downloads: 17 This Week
  • 4
    Voicebox

    The open-source voice synthesis studio powered by Qwen3-TTS

    ...The tool supports downloading voice models, cloning voices from short audio samples, and generating speech locally, then organizing the results using studio-oriented editing concepts. A standout capability is its multi-track timeline editor and supporting audio tools (like trimming and conversation mixing), which let creators compose multi-voice scenes instead of generating single clips in isolation. It is API-first, meaning you can use it as an app for production work or integrate its speech generation into your own software via an API layer.
    Downloads: 85 This Week
  • 5
    Qwen3-TTS

    Qwen3-TTS is an open-source series of TTS models

    Qwen3-TTS is an open-source text-to-speech (TTS) project built around the Qwen3 large language model family, focused on generating high-quality, natural-sounding speech from plain text input. It provides researchers and developers with tools to transform text into expressive, intelligible audio, supporting multiple languages and voice characteristics tuned for clarity and fluidity. The project includes pre-trained models and inference scripts that let users synthesize speech locally or integrate TTS into larger pipelines such as voice assistants, accessibility tools, or multimedia generation workflows. ...
    Downloads: 14 This Week
  • 6
    OmniVoice

    High-Quality Voice Cloning TTS for 600+ Languages

    The OmniVoice project is a cutting-edge multilingual text-to-speech system designed to generate high-quality speech across more than 600 languages. Built on a diffusion language model-style architecture, it combines scalability with strong performance, enabling both natural-sounding voice synthesis and efficient inference speeds. One of its most notable capabilities is zero-shot voice cloning, allowing users to replicate a speaker’s voice using only a short reference audio clip. In addition,...
    Downloads: 38 This Week
  • 7
    KrillinAI

    Video translation and dubbing tool powered by LLMs

    KrillinAI is an end-to-end content localization, translation, and dubbing tool aimed at helping creators transform videos into multiple languages with minimal manual effort. It integrates several stages of the pipeline: video acquisition (either from local files or remote via download tools), speech recognition (ASR), subtitle segmentation and alignment, machine translation (with context-aware translation to preserve semantics), and voice cloning + text-to-speech (TTS) to produce dubbed audio tracks. KrillinAI supports both landscape and portrait videos, which makes it suitable for a wide range of platforms — from YouTube to TikTok or other vertical-video sites — and ensures correct formatting and layout for the final video. ...
    Downloads: 35 This Week
  • 8
    pyttsx3

    Offline text-to-speech synthesis for Python

    pyttsx3 is an offline text-to-speech library for Python that wraps native speech engines instead of calling cloud APIs. It is designed to work entirely without an internet connection, making it suitable for local automation, kiosks, accessibility tools, and embedded applications. On Windows it uses SAPI5, on Linux it typically uses eSpeak or eSpeak-NG, and on macOS it can use NSSpeechSynthesizer or AVSpeechSynthesizer, giving it broad cross-platform compatibility. The library exposes a simple but flexible API for controlling voice selection, speaking rate, volume, and other synthesis parameters from Python code. ...
    Downloads: 21 This Week
  • 9
    abogen

    Generate audiobooks from EPUBs, PDFs and text with captions

    abogen is a tool designed to generate audiobooks (or speech narrations) from textual sources such as EPUBs, PDFs, or plain text, with synchronized captions. In other words, it automates the pipeline of reading a digital book (or document), converting its text into speech via a TTS engine, and packaging the result into an audiobook format — likely along with timestamped captions or subtitles that align with the spoken audio. This can be very useful for accessibility, content consumption on...
    Downloads: 13 This Week
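To illustrate what captions synchronized with generated audio might look like, here is a minimal sketch that turns a list of spoken segments into SubRip (SRT) text. It assumes each segment's duration is already known from the TTS step; the function names `srt_timestamp` and `build_srt` are hypothetical, and nothing here is taken from abogen's own code or output format.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(seconds * 1000)
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def build_srt(segments):
    """Build SRT text from (text, duration_seconds) pairs given in spoken order.

    Assumption: segments play back to back with no gaps, so each caption
    starts where the previous one ended.
    """
    blocks = []
    start = 0.0
    for index, (text, duration) in enumerate(segments, start=1):
        end = start + duration
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}"
        )
        start = end
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    print(build_srt([("Chapter one.", 1.5), ("It was a quiet morning.", 2.0)]))
```

Accumulating the start time from the segment durations keeps captions aligned with the audio as long as the segments are concatenated without padding; a real pipeline would need to account for any silence it inserts between them.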
  • 10
    OpenAI.fm

    Code for openai.fm, a demo for the OpenAI Speech API

    ...Users can experiment with different input text and voice options directly in their browser, gaining a sense of how high-fidelity AI audio can be integrated into applications ranging from podcasts and narration to accessibility tools and interactive agents. Although the web demo is free to explore, production use of the underlying API requires an OpenAI API key and may incur costs based on usage.
    Downloads: 12 This Week
  • 11
    VoxCPM

    TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning

    VoxCPM is a tokenizer-free text-to-speech system that models speech in a continuous space, aiming for extremely realistic, context-aware synthesis and true-to-life zero-shot voice cloning. Instead of converting speech into discrete tokens, it uses an end-to-end diffusion-autoregressive architecture built on the MiniCPM-4 backbone, combining hierarchical language modeling, finite scalar quantization (FSQ), and local Diffusion Transformers. This design helps decouple semantic and acoustic...
    Downloads: 22 This Week
  • 12
    sag

    Like the macOS say command, but with a modern voice

    sag is a command-line text-to-speech utility inspired by the macOS say command but powered by modern ElevenLabs voice synthesis technology. The project allows users to stream synthesized speech directly to speakers, save audio files, or list and manage available voices through a lightweight terminal interface. Designed for speed and convenience, sag supports voice selection, playback rate adjustments, output format inference, and configurable API endpoints for flexible deployment. It...
    Downloads: 11 This Week
  • 13
    Chatterbox TTS Server

    Self-host the powerful Chatterbox TTS model

    ...It is designed for users who want local or private speech generation without depending entirely on a hosted voice platform. The project supports predefined voices, voice cloning, and longer text workflows, making it useful for audiobooks, narration, content tools, and assistant-style applications. It also includes OpenAI-compatible API behavior, which helps developers connect it to existing tools that already expect that style of endpoint. The server can run on NVIDIA CUDA, AMD ROCm, or CPU, giving it flexibility across different hardware setups. Its main value is packaging a powerful TTS workflow into a practical service that can be accessed through a browser or integrated into other software.
    Downloads: 1 This Week
  • 14
    IndexTTS2

    Industrial-level controllable zero-shot text-to-speech system

    ...The system supports zero-shot voice cloning — meaning it can mimic a target speaker’s voice from a short reference sample — making it versatile for multi-voice uses. Compared to many open-source TTS tools, IndexTTS emphasizes efficiency and controllability: it offers faster inference, simpler training pipelines, and controllable speech parameters (like duration, pitch, and prosody), which is critical for production use.
    Downloads: 14 This Week
  • 15
    AI Runner

    Offline inference engine for art, real-time voice conversations

    ...The project has a strong focus on developer ergonomics, with thorough development guidelines, environment configuration using .env variables, and a clear structure for tests, tools and agents.
    Downloads: 13 This Week
  • 16
    Style-Bert-VITS2

    Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles

    Style-Bert-VITS2 is a text-to-speech system based on Bert-VITS2 that focuses on highly controllable voice styles and emotional expression. It takes the original Bert-VITS2 v2.1 and its Japanese-Extra variant and extends them so you can control emotion and speaking style with fine-grained intensity, not just choose a generic tone. The project targets both power users and beginners: Windows users without Git or Python can install and run it using bundled .bat scripts, while advanced users can...
    Downloads: 8 This Week
  • 17
    FastRTC

    The Python library for real-time communication

    ...This makes it particularly well suited for building real-time voice (or video) interfaces for applications such as AI assistants, live chat, or collaborative audio/video tools. FastRTC also integrates nicely with UI frameworks (e.g. via a web demo using Gradio), so developers can rapidly prototype and deploy real-time streaming applications without deep knowledge of low-level WebRTC internals. Because voice-enabled AI agents often involve many moving parts (speech-to-text, text processing, text-to-speech, streaming, session/chat management), FastRTC helps by handling the streaming aspect, leaving the rest to be plugged in modularly.
    Downloads: 4 This Week
  • 18
    RealtimeTTS

    Converts text to speech in realtime

    RealtimeTTS is a low-latency text-to-speech library built for real-time applications such as voice chat with LLMs, assistants, and interactive tools. It is designed around a streaming model: you can feed it text incrementally (for example, as an LLM responds) and get audio output almost immediately, which keeps end-to-end latency very low. The library is engine-agnostic and plugs into a wide range of cloud and local TTS systems, including OpenAI, ElevenLabs, Azure, Coqui, Piper, StyleTTS2, Edge TTS, Google TTS, system TTS and others, so you can swap providers without rewriting your pipeline. ...
    Downloads: 5 This Week
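The streaming idea described for RealtimeTTS (feed text incrementally, start speaking before the full response exists) can be sketched in plain Python. The block below is not RealtimeTTS code and uses none of its API; `sentences_from_stream` is a hypothetical helper that buffers incoming text chunks and releases each sentence as soon as it is complete, which is the point at which a TTS engine could begin synthesis. The boundary rule (`.`, `!`, or `?` followed by whitespace) is a simplifying assumption.

```python
import re
from typing import Iterable, Iterator

# Assumed sentence boundary: ., !, or ? followed by whitespace.
_BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def sentences_from_stream(chunks: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences as soon as they appear in a stream of text chunks."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        parts = _BOUNDARY.split(buffer)
        # Every part except the last ended at a boundary, so it is complete.
        for sentence in parts[:-1]:
            if sentence:
                yield sentence
        # The last part may still be growing; keep it buffered.
        buffer = parts[-1]
    # Flush any trailing text once the stream ends.
    if buffer.strip():
        yield buffer.strip()


if __name__ == "__main__":
    llm_chunks = ["Hello the", "re. How are", " you? Fi", "ne"]
    for sentence in sentences_from_stream(llm_chunks):
        print(sentence)  # each sentence could be handed to a TTS engine here
```

Emitting at sentence boundaries is one common trade-off: smaller units lower latency further but give the synthesizer less context for prosody.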
  • 19
    Matcha-TTS

    A fast TTS architecture with conditional flow matching

    ...The repository provides an end-to-end TTS pipeline: a PyTorch/Lightning training stack, configuration files, pre-trained checkpoints, a command-line interface, and a Gradio app for interactive testing. Users can train on standard datasets like LJSpeech or plug in their own corpora, with helper tools for computing dataset statistics, extracting phoneme durations, and running multi-GPU training.
    Downloads: 4 This Week
  • 20
    MiniMax-MCP

    Official MiniMax Model Context Protocol (MCP) server

    MiniMax-MCP is the official Model Context Protocol (MCP) server for accessing MiniMax’s multimodal generative APIs from MCP-compatible clients. It acts as a bridge between tools like Claude Desktop, Cursor, Windsurf, OpenAI Agents, and the MiniMax platform, exposing capabilities such as text-to-speech, voice cloning, image generation, text-to-image, video generation, image-to-video, text-to-video, and music generation. The server is written in Python and distributed under the MIT license, with a pyproject.toml and uv-based workflow that makes installation and execution reproducible. ...
    Downloads: 2 This Week
  • 21
    OpenAI-Compatible Edge-TTS API

    Free, high-quality text-to-speech API endpoint to replace OpenAI

    ...Because it relies on Edge’s TTS, the audio generation itself is free, and the project essentially acts as a smart proxy that handles formatting and streaming. The server supports Server-Sent Events (SSE) for streaming audio, enabling low-latency playback in chat UIs and other interactive tools. A Docker image is provided for one-command deployment, and environment variables can be used to configure default voice, language, response format, authentication, and logging options.
    Downloads: 2 This Week
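As a rough illustration of what "OpenAI-compatible" means for a client, the sketch below builds (but does not send) a request in the shape of OpenAI's speech endpoint, `POST /v1/audio/speech`, with `model`, `input`, `voice`, and `response_format` fields. The base URL, port, API key, and the `tts-1` and `alloy` values are assumptions for illustration; which models and voices a given self-hosted server accepts depends on that project's own documentation.

```python
import json
import urllib.request


def build_speech_request(base_url, text, voice="alloy",
                         response_format="mp3", api_key=None):
    """Build a POST request shaped like OpenAI's /v1/audio/speech call.

    base_url, voice, and api_key are placeholders; a compatible server
    may expect different voice names or ignore the model field.
    """
    payload = {
        "model": "tts-1",  # assumed model name
        "input": text,
        "voice": voice,
        "response_format": response_format,
    }
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical local deployment; adjust host and port to your setup.
    request = build_speech_request("http://localhost:5050", "Hello, world.")
    print(request.get_method(), request.full_url)
    # To actually call the server:
    # with urllib.request.urlopen(request) as response:
    #     audio_bytes = response.read()
```

Because the request shape matches the hosted API, existing client code can usually be pointed at such a server by changing only the base URL.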
  • 22
    Sopro TTS

    A lightweight text-to-speech model with zero-shot voice cloning

    ...The model is designed to work with a small set of dependencies and to be accessible for developers who want offline TTS with customizable voice style, including options for streaming or non-streaming generation modes. Users can install it with standard Python tools, run a demo server locally, and experiment with CLI or Python API usage for producing synthetic speech.
    Downloads: 0 This Week
  • 23
    Lingvo

    Framework for building neural networks

    Lingvo is a TensorFlow-based framework focused on building and training sequence models, especially for language and speech tasks. It was originally developed for internal research and later open-sourced to support reproducible experiments and shared model implementations. The framework provides a structured way to define models, input pipelines, and training configurations using a common interface for layers, which encourages reuse across different tasks. It has been used to implement state...
    Downloads: 0 This Week
  • 24
    openctp

    Provides CTP stock options and Zhongtai Securities XTP

    ...The project offers a comprehensive simulation environment similar to SimNow that supports futures, options, A-share stocks, funds, bonds, and stock options, and even extends to Hong Kong and US markets. In addition to the core library, openctp supplies Python bindings for CTPAPI and stock options APIs, making it easier to build strategies, tools, and analytics in Python. It also develops full-featured and lightweight trading clients such as TickTrader, TickTraderMini, and ViTrader, which support multiple desks and markets.
    Downloads: 0 This Week
  • 25
    IMS Toucan

    Controllable and fast Text-to-Speech for over 7000 languages

    ...It includes complete pipelines for preprocessing datasets, training models, and running inference, plus a storage configuration system to manage where models and caches are stored. IMS-Toucan ships with several ready-to-run scripts, including GUIs for interactive demos, prosody override tools, zero-shot language embedding injection, and text-to-audio file generation. Pretrained models are automatically downloaded when needed, and there is an online demo instance hosted on GPU that anyone can try.
    Downloads: 0 This Week