Search Results for "linux ai"

Showing 27 open source projects for "linux ai"

1

AI Runner

Offline inference engine for art, real-time voice conversations

AI Runner is an offline inference engine designed to run a collection of AI workloads on your own machine, including image generation for art, real-time voice conversations, LLM-powered chatbots and automated workflows. It is implemented as a desktop-oriented Python application and emphasizes privacy and self-hosting, allowing users to work with text-to-speech, speech-to-text, text-to-image and multimodal models without sending data to external services. At the core of its LLM stack is a...

Downloads: 12 This Week

Last Update: 2025-12-11
See Project
2

Luna AI

Virtual AI anchor that combines state-of-the-art technology

Luna AI is a virtual AI streamer framework designed to power an interactive VTuber that can go live on major platforms and chat with viewers in real time. It is built around a core assistant persona called “Luna AI,” which can be driven by a wide range of large language models and platforms, including GPT-style APIs, Claude, LangChain-based backends, ChatGLM, Kimi, Ollama, and many others. The project supports multiple rendering backends for the avatar, such as Live2D, Unreal Engine (UE),...

Downloads: 5 This Week

Last Update: 2025-11-28
See Project
3

Speech-AI-Forge

Speech-AI-Forge is a project developed around TTS generation models

Speech-AI-Forge is a full-stack project built around modern text-to-speech generation models, providing both an API server and a Gradio-based web UI for interactive use. At its core, it acts as a hub that wires together multiple speech-related capabilities, including TTS, speech-to-text and LLM-based control flows, behind a consistent interface. The system is designed to be deployed in several ways: you can try it online via hosted demos, spin it up in a one-click Colab environment, run it...

Downloads: 2 This Week

Last Update: 2026-02-02
See Project
4

NVIDIA NeMo

Toolkit for conversational AI

NVIDIA NeMo, part of the NVIDIA AI platform, is a toolkit for building new state-of-the-art conversational AI models. NeMo has separate collections for Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Text-to-Speech (TTS) models. Each collection consists of prebuilt modules that include everything needed to train on your data. Every module can easily be customized, extended, and composed to create new conversational AI model architectures. Conversational AI...

Downloads: 2 This Week

Last Update: 2026-03-23
See Project
5

Polyglot

Cross-platform AI language practice app

Polyglot is a cross-platform AI language practice application that runs as a desktop app and also offers a web version. It is built around conversational large language models and Azure-based text-to-speech services, turning them into an interactive environment for speaking practice in multiple languages. Users can define custom AI personas, choose languages, and configure their own OpenAI and Azure keys so they retain control over which backends they use. The app supports speech recognition...

Downloads: 5 This Week

Last Update: 2025-11-28
See Project
6

Qwen3-TTS

Qwen3-TTS is an open-source series of TTS models

Qwen3-TTS is an open-source text-to-speech (TTS) project built around the Qwen3 large language model family, focused on generating high-quality, natural-sounding speech from plain text input. It provides researchers and developers with tools to transform text into expressive, intelligible audio, supporting multiple languages and voice characteristics tuned for clarity and fluidity. The project includes pre-trained models and inference scripts that let users synthesize speech locally or...

Downloads: 15 This Week

Last Update: 2026-03-17
See Project
7

NVIDIA NeMo Framework

Scalable generative AI framework built for researchers and developers

NVIDIA NeMo is a scalable, cloud-native generative AI framework aimed at researchers and PyTorch developers working on large language models, multimodal models, and speech AI (ASR and TTS), with growing support for computer vision. It provides collections of domain-specific modules and reference implementations that make it easier to pre-train, fine-tune, and deploy very large models on multi-GPU and multi-node infrastructure. NeMo 2.0 introduces a Python-based configuration system,...

Downloads: 2 This Week

Last Update: 2026-03-23
See Project
8

Kitten TTS

State-of-the-art TTS model under 25MB

KittenTTS is an open-source, ultra-lightweight, high-quality text-to-speech model with just 15 million parameters and a binary size under 25 MB. It is designed for real-time CPU-based deployment across diverse platforms. Ultra-lightweight: model size under 25 MB. CPU-optimized: runs without a GPU on any device. High-quality voices: several premium voice options available. Fast inference: optimized for real-time speech synthesis.

Downloads: 14 This Week

Last Update: 2026-02-24
See Project
9

FireRedTTS-2

Long-form streaming TTS system for multi-speaker dialogue generation

FireRedTTS2 is a next-generation open-source text-to-speech (TTS) system focused on long-form, streaming speech synthesis for multi-speaker dialogue, delivering stable natural speech with context-aware prosody and reliable speaker transitions that support real-time and conversational applications. It features a specialized streaming speech tokenizer and a dual-transformer architecture that enables low latency and high-quality synthesis, making it suitable for interactive systems like...

Downloads: 3 This Week

Last Update: 2026-02-16
See Project
10

IndexTTS2

Industrial-level controllable zero-shot text-to-speech system

IndexTTS is a modern, zero-shot text-to-speech (TTS) system engineered to deliver high-quality, natural-sounding speech synthesis with few requirements and strong voice-cloning capabilities. It builds on state-of-the-art models such as XTTS and other modern neural TTS backbones, improving them with a conformer-based speech conditional encoder and upgrading the decoder to a high-quality vocoder (BigVGAN2), leading to clearer and more natural audio output. The system supports zero-shot voice...

Downloads: 5 This Week

Last Update: 2025-11-27
See Project
11

FastRTC

The Python library for real-time communication

FastRTC is a Python library designed to simplify real-time communication (RTC), especially for audio and video streaming applications. It abstracts away much of the complexity that typically comes with implementing WebRTC by providing a simple interface — e.g. a Stream class — that can be mounted within a web backend (for example a FastAPI application). This makes it particularly well suited for building real-time voice (or video) interfaces for applications such as AI assistants, live chat,...

Downloads: 0 This Week

Last Update: 2025-11-28
See Project
12

Readest

Readest is a modern, feature-rich ebook reader

Readest is a project meant to facilitate reading, studying, or consuming content by integrating reading tools with AI-powered assistance. Although the repository is not as widely documented or popular as some, the idea is that Readest supports features to help with reading comprehension — likely combining OCR / text retrieval, translation, note-taking, or summarization for reading materials (eBooks, articles, PDFs). The goal appears to be to let users feed in arbitrary reading material and...

Downloads: 40 This Week

Last Update: 2026-04-13
See Project
13

GLM-TTS

Controllable & emotion-expressive zero-shot TTS

GLM-TTS is an advanced text-to-speech synthesis system built on large language model technologies that focuses on producing high-quality, expressive, and controllable spoken output, including features like emotion modulation and zero-shot voice cloning. It uses a two-stage architecture where a generative LLM first converts text into intermediate speech token sequences and then a Flow-based neural model converts those tokens into natural audio waveforms, enabling rich prosody and voice...

Downloads: 2 This Week

Last Update: 2026-04-10
See Project
14

OpenAI.fm

Code for openai.fm, a demo for the OpenAI Speech API

OpenAI.fm is an official interactive demo application built to showcase the OpenAI Speech API and its advanced text-to-speech capabilities, providing developers and creators with a hands-on web interface to convert text into high-quality, customizable audio using state-of-the-art TTS models. Developed using Next.js and the OpenAI Speech API, this demo illustrates how the latest neural voice models can produce natural, expressive speech with adjustable styles and voices, highlighting features...

Downloads: 11 This Week

Last Update: 2026-01-28
See Project
15

Auto Synced & Translated Dubs

Automatically translates the text of a video based on a subtitle file

Auto-Synced-Translated-Dubs is a toolchain that automatically translates and re-dubs videos using AI voices while keeping the new speech aligned to the original timing via subtitle files. It assumes you have a human-made SRT (or similar) subtitle file; the script then uses translation services such as Google Cloud or DeepL to generate translated subtitle tracks in one or more target languages. Using the timestamps of each subtitle line, it computes the required duration of each spoken...

Downloads: 2 This Week

Last Update: 2025-11-28
See Project
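
The timing step in the Auto Synced & Translated Dubs description (using each subtitle line's timestamps to work out how long the spoken audio should be) can be illustrated with a minimal sketch. This is not the project's code: the function names `cue_durations` and `speed_factor` are hypothetical, and the sketch assumes standard SRT timing lines of the form `HH:MM:SS,mmm --> HH:MM:SS,mmm`.

```python
import re
from datetime import timedelta

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:03,500".
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def cue_durations(srt_text):
    """Return (start_seconds, duration_seconds) for each cue in an SRT string."""
    cues = []
    for match in TIMING.finditer(srt_text):
        h1, m1, s1, ms1, h2, m2, s2, ms2 = (int(g) for g in match.groups())
        start = timedelta(hours=h1, minutes=m1, seconds=s1, milliseconds=ms1)
        end = timedelta(hours=h2, minutes=m2, seconds=s2, milliseconds=ms2)
        cues.append((start.total_seconds(), (end - start).total_seconds()))
    return cues


def speed_factor(clip_seconds, target_seconds):
    """Hypothetical helper: playback-rate multiplier that fits a clip into its cue slot."""
    return clip_seconds / target_seconds


sample = (
    "1\n00:00:01,000 --> 00:00:03,500\nHello\n\n"
    "2\n00:00:04,000 --> 00:00:06,000\nWorld\n"
)
for start, duration in cue_durations(sample):
    print(f"cue at {start:.3f}s lasts {duration:.3f}s")
```

A synthesized clip longer than its slot yields a factor above 1 (speed up), and a shorter clip yields a factor below 1 (slow down). How the real tool applies such a correction is not stated in the listing.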
16

Open Vision Agents by Stream

Build Vision Agents quickly with any model or video provider

Open Vision Agents by Stream is an open source framework from Stream for building real-time, multimodal AI agents that watch, listen, and respond to live video streams. It focuses on combining video understanding models, such as YOLO and Roboflow-based detectors, with real-time large language models like OpenAI Realtime and Gemini Live to create interactive experiences. The framework uses Stream’s ultra-low-latency edge network so agents can join sessions quickly and maintain very low audio...

Downloads: 4 This Week

Last Update: 6 days ago
See Project
17

EasyVoice

Open source text-to-speech tool, supports extra-long text

easyVoice is an open-source text-to-speech platform aimed at turning long-form text and novels into high-quality audio, with a strong focus on usability and scalability. It provides a web interface where users can paste or upload large texts and generate speech and subtitles in a single workflow, even for works exceeding 100,000 characters. The system supports multi-role voice acting, letting users assign different neural voices to different characters or narrative roles and configure...

Downloads: 1 This Week

Last Update: 2026-01-26
See Project
18

comfyui-mixlab-nodes

Workflow and speech recognition app

comfyui-mixlab-nodes is a large collection of custom nodes for ComfyUI that turns workflows into interactive apps and adds real-time multimedia, LLM, and TTS capabilities. It introduces a “Workflow-to-APP” concept, where a ComfyUI graph can be transformed into a Web App through an AppInfo node, complete with categories, batch prompts, and editable configurations. The project also brings Real-time Design features like screen capture and floating video nodes, enabling creative pipelines that...

Downloads: 0 This Week

Last Update: 2025-11-28
See Project
19

Conversations

Java app for chatting with a generative AI (using TTS and STT)

Java application for chatting with the Llama3 generative AI.
* The user can speak into the microphone (speech-to-text), edit the recognized text, and send it to the AI.
* The AI responds, and the server returns the response in real time with each sentence converted to audio (text-to-speech), which the application plays through the speaker.
The application is designed so that only one user occupies the server's resources at a time, so if the server is busy, in theory it will not let you...

Downloads: 0 This Week

Last Update: 2026-03-05
See Project
20

CSM (Conversational Speech Model)

A Conversational Speech Generation Model

The CSM (Conversational Speech Model) is a speech generation model developed by Sesame AI that creates RVQ audio codes from text and audio inputs. It uses a Llama backbone and a smaller audio decoder to produce audio codes for realistic speech synthesis. The model has been fine-tuned for interactive voice demos and is hosted on platforms like Hugging Face for testing. CSM offers a flexible setup and is compatible with CUDA-enabled GPUs for efficient execution.

Downloads: 5 This Week

Last Update: 2025-03-19
See Project
21

AudioBC

Offline desktop app to convert EPUB to MP3 using Kokoro-82M neural TTS

AudioBC is a powerful desktop application designed to turn your digital library into a personal audiobook collection. Unlike most Text-to-Speech (TTS) tools that require expensive cloud API subscriptions or an active internet connection, AudioBC runs entirely on your local machine. Powered by the state-of-the-art Kokoro-82M neural engine, AudioBC produces natural, human-like speech that rivals premium cloud services. It is built with a focus on privacy and simplicity, offering a...

Downloads: 4 This Week

Last Update: 2026-03-22
See Project
22

Voice Accounting For Blind & Mute People

Free & Easy AI Voice Accounting Software For Blind & Speechless People

Download the zip file, extract it, and then open index.html in a web browser such as Firefox (preferred) or Google Chrome. The author's full collection of software for people with disabilities, which also includes the Voice Accounting Software, is available at https://sourceforge.net/projects/softwares-for-disabled-people/

Downloads: 0 This Week

Last Update: 2024-04-30
See Project
23

Softwares For Blind, Deaf, Handicap

Easy AI Softwares for Blind, Deaf, Handicapped, Disabled People

Download the zip file, extract it, and then open index.html in a web browser such as Firefox (preferred) or Google Chrome. Keep NumLock on while using the numeric keypad. An external USB keyboard with a separate numeric keypad can be attached if required. General guidelines for students using this software are on the project's Wiki page; refer to them for further instructions.

Downloads: 0 This Week

Last Update: 2026-01-18
See Project
24

Amica

Amica is an open source interface for interactive communication

Amica is an open source interface for interacting with fully animated 3D characters that combine voice chat, vision, and an emotion engine into a single experience. It lets you hold natural conversations with AI characters that can see, listen, and speak, while expressing emotional states through facial expressions and body language. Users can import VRM character models, adjust their appearance, tune the voice to match the character, and define behavior using different large language models...

Downloads: 12 This Week

Last Update: 2025-11-30
See Project
25

Piper TTS

A fast, local neural text to speech system

Piper is a fast, local neural text-to-speech (TTS) system developed by the Rhasspy team. Optimized for devices like the Raspberry Pi 4, Piper enables high-quality speech synthesis without relying on cloud services, making it ideal for privacy-conscious applications. It utilizes ONNX models trained with VITS to deliver natural-sounding voices across various languages and accents. Piper is particularly suited for offline voice assistants and embedded systems.

Downloads: 476 This Week

Last Update: 2025-06-03
See Project


© 2026 Slashdot Media. All Rights Reserved.
