Search Results for "yaffs2-source" - Page 2

Showing 104 open source projects for "yaffs2-source"

1

ChatTTS

A generative speech model for daily dialogue

ChatTTS is an open-source conversational text-to-speech model optimized for dialogue, developed by 2Noise. Trained on 100,000+ hours of English and Chinese conversation data, it excels at generating expressive prosody—pauses, interjections, laughter—for more natural-sounding speech synthesis in assistant and chatbot applications.

Downloads: 3 This Week

Last Update: 2026-04-10
2

clone-voice

A sound cloning tool with a web interface, using your voice

Clone-voice is a local voice-cloning tool that lets you synthesize speech in any target voice or convert one recording into another voice using the same timbre. It is built around Coqui’s XTTS-v2 model, so it inherits multilingual support and modern neural TTS quality while wrapping it in a user-friendly desktop workflow. The app is designed to be very easy to use: you download a precompiled package, double-click app.exe, and it launches a browser-based web interface where you control...

Downloads: 6 This Week

Last Update: 2025-11-28
3

Style-Bert-VITS2

Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles

Style-Bert-VITS2 is a text-to-speech system based on Bert-VITS2 that focuses on highly controllable voice styles and emotional expression. It takes the original Bert-VITS2 v2.1 and its Japanese-Extra variant and extends them so you can control emotion and speaking style with fine-grained intensity, not just choose a generic tone. The project targets both power users and beginners: Windows users without Git or Python can install and run it using bundled .bat scripts, while advanced users can...

Downloads: 6 This Week

Last Update: 2025-11-28
4

RealtimeTTS

Converts text to speech in realtime

RealtimeTTS is a low-latency text-to-speech library built for real-time applications such as voice chat with LLMs, assistants, and interactive tools. It is designed around a streaming model: you can feed it text incrementally (for example, as an LLM responds) and get audio output almost immediately, which keeps end-to-end latency very low. The library is engine-agnostic and plugs into a wide range of cloud and local TTS systems, including OpenAI, ElevenLabs, Azure, Coqui, Piper, StyleTTS2,...

Downloads: 4 This Week

Last Update: 2026-05-25
5

OpenAI-Compatible Edge-TTS API

Free, high-quality text-to-speech API endpoint to replace OpenAI

OpenAI-Compatible Edge-TTS API is a local, OpenAI-compatible text-to-speech API that uses edge-tts—Microsoft Edge’s online TTS service—as the backend. The project emulates the /v1/audio/speech endpoint used by OpenAI, so any client that can talk to the OpenAI TTS API can be redirected to this service with minimal changes. It exposes parameters for input text, voice selection, audio format, and playback speed, mirroring the OpenAI interface while mapping popular OpenAI voice names to...

Downloads: 4 This Week

Last Update: 2025-11-28
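As a rough illustration of the OpenAI-style `/v1/audio/speech` call described for this project, the sketch below builds (but does not send) a request using only the Python standard library. The host and port, the `model` value, the default voice name, and the placeholder API key are assumptions for illustration; the listing does not state them, so check the project's own documentation before relying on any of them.

```python
import json
import urllib.request

# Hypothetical local address: the listing does not give a host or port.
BASE_URL = "http://localhost:5050"


def build_speech_request(text, voice="alloy", response_format="mp3",
                         speed=1.0, api_key="placeholder-key"):
    """Build a POST request shaped like an OpenAI text-to-speech call.

    The field names follow the OpenAI audio API; whether this server
    needs every field (for example "model") is an assumption.
    """
    payload = {
        "model": "tts-1",          # OpenAI model name, assumed to be accepted
        "input": text,             # text to synthesize
        "voice": voice,            # OpenAI-style voice name
        "response_format": response_format,
        "speed": speed,            # playback speed multiplier
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_speech_request("Hello from a local TTS server.")
print(request.get_method(), request.full_url)
```

With a server actually running at that address, passing `request` to `urllib.request.urlopen` and writing the response bytes to a file would be one way to save the audio.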
6

LuxTTS

A high-quality rapid TTS voice cloning model

LuxTTS is an open-source text-to-speech (TTS) system focused on delivering high-quality, rapid voice synthesis and voice cloning that runs extremely fast and efficiently on consumer hardware. It implements a lightweight architecture based on ZipVoice and optimized sampling techniques so that it can generate speech at speeds up to roughly 150 times real-time on a single GPU and faster than real-time on CPU, all while producing audio at high fidelity with 48 kHz quality.

Downloads: 2 This Week

Last Update: 2026-03-12
7

gTTS

Python library and CLI tool to interface with Google Translate's text-to-speech API

gTTS (Google Text-to-Speech) is a Python library and command-line tool that wraps the speech functionality of Google Translate. It lets you send text to the Google Translate TTS endpoint and receive spoken audio back as MP3 data, either written to a file, a file-like object, or standard output. The library is designed to handle long texts, using a speech-specific sentence tokenizer that keeps intonation and punctuation natural while splitting requests into acceptable chunks. It supports...

Downloads: 3 This Week

Last Update: 2025-11-28
8

MLX-Audio

A text-to-speech, speech-to-text and speech-to-speech library

MLX-Audio is a speech library built on Apple’s MLX framework and optimized for Apple Silicon machines (M-series Macs). It focuses on text-to-speech and speech-to-speech workflows, with APIs and a command-line interface that make it easy to generate high-quality audio from text. Because it uses MLX and targets Apple Silicon, inference is fast and can take advantage of hardware acceleration and quantization for efficient on-device performance. The project provides a straightforward CLI...

Downloads: 3 This Week

Last Update: 2026-04-28
9

Speech-AI-Forge

Speech-AI-Forge is a project developed around a TTS generation model

Speech-AI-Forge is a full-stack project built around modern text-to-speech generation models, providing both an API server and a Gradio-based web UI for interactive use. At its core, it acts as a hub that wires together multiple speech-related capabilities, including TTS, speech-to-text and LLM-based control flows, behind a consistent interface. The system is designed to be deployed in several ways: you can try it online via hosted demos, spin it up in a one-click Colab environment, run it...

Downloads: 3 This Week

Last Update: 2026-02-02
10

Auto Synced & Translated Dubs

Automatically translates the text of a video based on a subtitle file

Auto-Synced-Translated-Dubs is a toolchain that automatically translates and re-dubs videos using AI voices while keeping the new speech aligned to the original timing via subtitle files. It assumes you have a human-made SRT (or similar) subtitle file; the script then uses translation services such as Google Cloud or DeepL to generate translated subtitle tracks in one or more target languages. Using the timestamps of each subtitle line, it computes the required duration of each spoken...

Downloads: 3 This Week

Last Update: 2025-11-28
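The timing step mentioned in this description (using each subtitle line's timestamps to work out how long the spoken clip may last) can be sketched as follows. This is a minimal illustration and not the project's own code; the function name `cue_durations` and the sample subtitle text are hypothetical.

```python
import re
from datetime import timedelta

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:03,500".
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def cue_durations(srt_text):
    """Return a (start_seconds, duration_seconds) pair for each SRT cue."""
    cues = []
    for match in TIMING.finditer(srt_text):
        h1, m1, s1, ms1, h2, m2, s2, ms2 = (int(g) for g in match.groups())
        start = timedelta(hours=h1, minutes=m1, seconds=s1, milliseconds=ms1)
        end = timedelta(hours=h2, minutes=m2, seconds=s2, milliseconds=ms2)
        cues.append((start.total_seconds(), (end - start).total_seconds()))
    return cues


# Hypothetical two-cue subtitle file used only for demonstration.
SAMPLE = """1
00:00:01,000 --> 00:00:03,500
Hello there.

2
00:00:04,000 --> 00:00:06,250
Welcome back.
"""

print(cue_durations(SAMPLE))  # [(1.0, 2.5), (4.0, 2.25)]
```

Each duration is the time budget that a synthesized, translated line would need to fit in order to stay aligned with the original video.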
11

Open Vision Agents by Stream

Build Vision Agents quickly with any model or video provider

Open Vision Agents by Stream is an open source framework from Stream for building real time, multimodal AI agents that watch, listen, and respond to live video streams. It focuses on combining video understanding models, such as YOLO and Roboflow based detectors, with real time large language models like OpenAI Realtime and Gemini Live to create interactive experiences. The framework uses Stream’s ultra low latency edge network so agents can join sessions quickly and maintain very low audio and video latency while processing frames and generating responses. ...

Downloads: 2 This Week

Last Update: 17 hours ago
12

YandexStation

Management of Yandex Station and other smart home devices

YandexStation is a Home Assistant custom component that integrates Yandex-branded smart speakers and other devices with Alice into a unified smart home automation environment. It supports both local and cloud control, depending on the device type, with Yandex speakers often supporting both modes and third-party speakers typically limited to cloud control. The integration exposes playback and volume controls, as well as text-to-speech capabilities that send spoken messages in Alice’s voice...

Downloads: 2 This Week

Last Update: 2026-05-21
13

FastKoko

Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model

FastKoko is a self-hosted text-to-speech server built around the Kokoro-82M model and exposed through a FastAPI backend. It is designed to be easy to deploy via Docker, with separate CPU and GPU images so that users can choose between pure CPU inference and NVIDIA GPU acceleration. The project exposes an OpenAI-compatible speech endpoint, which means existing code that talks to the OpenAI audio API can often be pointed at a Kokoro-FastAPI instance with minimal changes. It supports multiple...

Downloads: 2 This Week

Last Update: 2026-05-25
14

Matcha-TTS

A fast TTS architecture with conditional flow matching

Matcha-TTS is a non-autoregressive neural text-to-speech architecture that uses conditional flow matching to generate speech quickly while maintaining natural quality. It models speech as an ODE-based generative process, and conditional flow matching lets it reach high-quality audio in only a few synthesis steps, which greatly reduces latency compared to score-matching diffusion approaches. The model is fully probabilistic, so it can generate diverse realizations of the same text while still...

Downloads: 2 This Week

Last Update: 2025-11-28
15

NVIDIA NeMo Framework

Scalable generative AI framework built for researchers and developers

NVIDIA NeMo is a scalable, cloud-native generative AI framework aimed at researchers and PyTorch developers working on large language models, multimodal models, and speech AI (ASR and TTS), with growing support for computer vision. It provides collections of domain-specific modules and reference implementations that make it easier to pre-train, fine-tune, and deploy very large models on multi-GPU and multi-node infrastructure. NeMo 2.0 introduces a Python-based configuration system,...

Downloads: 2 This Week

Last Update: 2026-04-22
16

MetaVoice-1B

Foundational model for human-like, expressive TTS

...With that scale and dataset volume, MetaVoice aims to push the boundary of what open-source TTS models can achieve: high fidelity, natural prosody, and robustness even for edge cases. As a foundational model, it can serve as the backbone for downstream tasks — such as voice generation, voice cloning, speech generation for virtual agents, or even audio production pipelines.

Downloads: 1 This Week

Last Update: 2025-11-28
17

NVIDIA NeMo

Toolkit for conversational AI

NVIDIA NeMo, part of the NVIDIA AI platform, is a toolkit for building new state-of-the-art conversational AI models. NeMo has separate collections for Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Text-to-Speech (TTS) models. Each collection consists of prebuilt modules that include everything needed to train on your data. Every module can easily be customized, extended, and composed to create new conversational AI model architectures. Conversational AI...

Downloads: 1 This Week

Last Update: 2026-04-22
18

WhisperSpeech

An Open Source text-to-speech system built by inverting Whisper

WhisperSpeech is an open-source text-to-speech system created by “inverting” OpenAI’s Whisper, reusing its strengths as a semantic audio model to generate speech instead of only transcribing it. The project aims to be for speech what Stable Diffusion is for images: powerful, hackable, and safe for commercial use, with code under Apache-2.0/MIT and models trained only on properly licensed data.

Downloads: 0 This Week

Last Update: 2025-11-28
19

FireRedTTS-2

Long-form streaming TTS system for multi-speaker dialogue generation

FireRedTTS2 is a next-generation open-source text-to-speech (TTS) system focused on long-form, streaming speech synthesis for multi-speaker dialogue, delivering stable natural speech with context-aware prosody and reliable speaker transitions that support real-time and conversational applications. It features a specialized streaming speech tokenizer and a dual-transformer architecture that enables low latency and high-quality synthesis, making it suitable for interactive systems like chatbots, podcasts, and applications where dynamic turn-taking between speakers is essential. ...

Downloads: 0 This Week

Last Update: 2026-02-16
20

Spark TTS

Spark-TTS Inference Code

Spark TTS is an open-source, PyTorch-based text-to-speech inference system that leverages large language models to produce highly natural, intelligible speech from text input. It uses an efficient single-stream architecture where speech tokens are directly reconstructed from the predictions of an LLM, removing the need for external acoustic models or complex vocoders and making the generation pipeline cleaner and faster.

Downloads: 0 This Week

Last Update: 2026-02-04
21

Orpheus TTS

Towards Human-Sounding Speech

Orpheus TTS is a state-of-the-art open-source text-to-speech system built on a Llama-3B backbone, treating speech synthesis as a large language model problem instead of a traditional TTS pipeline. It is designed to produce human-like speech with natural intonation, emotion, and rhythm, targeting quality comparable to or better than many closed-source systems. The project ships both pretrained and finetuned English models, as well as a family of multilingual models released as a research preview, and includes data-processing scripts so users can train or finetune their own variants. ...

Downloads: 0 This Week

Last Update: 2025-12-05
22

Sopro TTS

A lightweight text-to-speech model with zero-shot voice cloning

Sopro TTS is an open-source text-to-speech (TTS) project that implements a lightweight model capable of producing speech from text with zero-shot voice cloning, meaning it can mimic a speaker’s voice from only a few seconds of reference audio. Built with a 169 million-parameter architecture that uses dilated convolutions and cross-attention layers instead of large Transformer stacks, it achieves relatively fast real-time performance even on CPUs (about a 0.25 real-time factor measured on an M3 base). ...

Downloads: 0 This Week

Last Update: 2026-02-06
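For readers unfamiliar with the "real-time factor" quoted above, the short sketch below shows the usual definition: synthesis time divided by the duration of the audio produced, so values below 1 mean faster than real time. The function names are hypothetical, and the 10-second clip is an illustrative input rather than a measurement from the project.

```python
def real_time_factor(synthesis_seconds, audio_seconds):
    """Ratio of compute time to audio duration; below 1.0 is faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def synthesis_time(audio_seconds, rtf):
    """Estimated compute time needed to produce audio_seconds of speech."""
    return audio_seconds * rtf


# At the quoted factor of about 0.25, a 10-second clip would take
# roughly 2.5 seconds to synthesize, a speed-up of about 4x.
print(synthesis_time(10.0, 0.25))    # 2.5
print(real_time_factor(2.5, 10.0))   # 0.25
print(1 / 0.25)                      # 4.0
```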
23

ESPnet

End-to-end speech processing toolkit

ESPnet is a comprehensive end-to-end speech processing toolkit covering a wide spectrum of tasks, including automatic speech recognition (ASR), text-to-speech (TTS), speech translation (ST), speech enhancement, speaker diarization, and spoken language understanding. It uses PyTorch as its deep learning engine and adopts a Kaldi-style data processing pipeline for features, data formats, and experimental recipes. This combination allows researchers to leverage modern neural architectures while...

Downloads: 0 This Week

Last Update: 2026-04-22
24

Bailing

Bailing is a voice dialogue robot similar to GPT-4o

Bailing is an open-source voice-dialogue assistant designed to deliver natural voice-based conversations by combining automatic speech recognition (ASR), voice activity detection (VAD), a large language model (LLM), and text-to-speech (TTS) in a single pipeline. Its goal is to offer a “voice-first” chat experience similar to what one might expect from a system like GPT-4o, but fully open and deployable by users.

Downloads: 0 This Week

Last Update: 2025-11-28
25

shuyuan

Reading book source

shuyuan is a project oriented around reading and knowledge consumption, especially targeting large-scale text content such as books, articles, or educational material. The name suggests “academy” or “study hall,” and the tool aims to help users ingest, organize, and manage reading content — possibly offering features like text parsing, annotation, metadata generation, translation, or storage for later reference. The repository is set up to support document ingestion, indexing, and maybe some...

Downloads: 0 This Week

Last Update: 2025-11-28