Showing 19 open source projects for "text voice"

View related business solutions
  • Enterprise-grade ITSM, for every business Icon
    Enterprise-grade ITSM, for every business

    Give your IT, operations, and business teams the ability to deliver exceptional services—without the complexity.

    Freshservice is an intuitive, AI-powered platform that helps IT, operations, and business teams deliver exceptional service without the usual complexity. Automate repetitive tasks, resolve issues faster, and provide seamless support across the organization. From managing incidents and assets to driving smarter decisions, Freshservice makes it easy to stay efficient and scale with confidence.
    Try it Free
  • Build Securely on AWS with Proven Frameworks Icon
    Build Securely on AWS with Proven Frameworks

    Lay a foundation for success with Tested Reference Architectures developed by Fortinet’s experts. Learn more in this white paper.

    Moving to the cloud brings new challenges. How can you manage a larger attack surface while ensuring great network performance? Turn to Fortinet’s Tested Reference Architectures, blueprints for designing and securing cloud environments built by cybersecurity experts. Learn more and explore use cases in this white paper.
    Download Now
  • 1
    Voicebox

    Voicebox

    The open-source voice synthesis studio powered by Qwen3-TTS

    Voicebox is a local-first voice synthesis studio that aims to bring professional, DAW-like voice generation workflows to a desktop app while keeping models and voice data entirely on your machine. It positions itself as an open-source alternative to cloud voice platforms by emphasizing privacy, offline use, and freedom from subscriptions or usage caps. The tool supports downloading voice models, cloning voices from short audio samples, and generating speech locally, then organizing the...
    Downloads: 99 This Week
    Last Update:
    See Project
  • 2
    OpenAI.fm

    OpenAI.fm

    Code for openai.fm, a demo for the OpenAI Speech API

    OpenAI.fm is an official interactive demo application built to showcase the OpenAI Speech API and its advanced text-to-speech capabilities, providing developers and creators with a hands-on web interface to convert text into high-quality, customizable audio using state-of-the-art TTS models. Developed using Next.js and the OpenAI Speech API, this demo illustrates how the latest neural voice models can produce natural, expressive speech with adjustable styles and voices, highlighting features like emotional range, tone, and real-time playback. ...
    Downloads: 11 This Week
    Last Update:
    See Project
  • 3
    EasyVoice

    EasyVoice

    Open source text-to-speech tool, supports extra-long text

    easyVoice is an open-source text-to-speech platform aimed at turning long-form text and novels into high-quality audio, with a strong focus on usability and scalability. It provides a web interface where users can paste or upload large texts and generate speech and subtitles in a single workflow, even for works exceeding 100,000 characters. The system supports multi-role voice acting, letting users assign different neural voices to different characters or narrative roles and configure parameters such as rate, pitch, and volume per role. ...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 4
    Handy STT

    Handy STT

    A free, open source, and extensible speech-to-text application

    Handy is a free, open-source, offline speech-to-text application built for privacy, accessibility, and extensibility. Developed using Tauri (Rust + React/TypeScript), it runs natively across Windows, macOS, and Linux while performing local speech recognition without sending any audio to cloud servers. Handy allows users to start transcription instantly using a configurable keyboard shortcut—press to record, release to transcribe—and automatically pastes the resulting text into any active text field. ...
    Downloads: 19 This Week
    Last Update:
    See Project
  • Compliant and Reliable File Transfers Backed by Top Security Certifications Icon
    Compliant and Reliable File Transfers Backed by Top Security Certifications

    Cerberus FTP Server delivers SOC 2 Type II certified security and FIPS 140-2 validated encryption.

    Stop relying on non-certified, legacy file transfer tools that creak under the weight of modern security demands. Get full audit trails, advanced access controls and more supported by an award-winning team of experts. Start your free 25-day trial today.
    Start Free Trial
  • 5
    II ElevenLabs UI

    II ElevenLabs UI

    Component library and custom registry built on top of shadcn/ui

    ElevenLabs UI is an open-source component library designed to accelerate the development of multimodal AI applications, particularly those involving voice agents and audio-based interactions. Built on top of modern frontend tooling such as React, Tailwind CSS, and shadcn/ui, it provides a collection of pre-built, customizable components that developers can easily integrate into their applications. The library includes specialized UI elements such as audio players, waveform visualizers,...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 6
    Gemini Next Chat

    Gemini Next Chat

    Deploy your private Gemini application for free with one click

    Gemini Next Chat is an open-source web application that allows you to deploy your own private chat interface powered by Google’s Gemini models (e.g., Gemini 1.5, Gemini 2.0, etc.). It is built with Next.js/TypeScript and targets developers and hobbyists who want a self-hosted solution for interacting with advanced multimodal models (text, image, voice). It supports features like image recognition, voice-based conversation, plugins (web search, ArXiv search, weather, etc.), and client apps (tray app) for greater convenience. The project emphasizes “one-click” deployment, aiming to make it easy to spin up a custom chat front end without deep infra-setup. It’s licensed under MIT and has an active community of contributors; documentation and release notes note support for newer features like mixed image+text generation. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 7
    Big-AGI

    Big-AGI

    AI suite powered by state-of-the-art models and providing advanced AI

    ...It unifies access to multiple large language models (LLMs) and AI services through a modern web UI that emphasizes effi­cient interaction, flexibility, and extensibility, enabling users to conduct multi-model chats, execute code, generate images, and perform voice or text-based tasks all in one place. The workspace includes advanced features like Beam, which enables multi-model consensus and comparative responses to improve reliability and reduce hallucination, and robust persona management to tailor responses to specific roles or workflows. Big-AGI can be self-hosted or deployed in cloud environments, giving users full control over data and model access limits and avoiding vendor lock-in.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 8
    Short Video Factory

    Short Video Factory

    AI tool for automatic batch short video creation and editing

    Short Video Factory is an open source desktop application designed to simplify the creation of short-form videos using AI-driven automation. It enables users to generate product marketing clips and general content videos by combining simple prompt-based input with pre-prepared media assets. Short Video Factory integrates multiple stages of video production, including script generation, voice synthesis, video editing, and subtitle effects, into a single streamlined workflow. By leveraging AI...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 9
    AI App Lab

    AI App Lab

    Implementing large models into scenario-based applications

    ...The project focuses on helping developers bridge the gap between AI models and practical business use cases by offering a structured environment for creating production-ready AI systems. It includes a high-level SDK called Arkitect, which provides workflows and tools for integrating models, plugins, and multimodal capabilities such as text, image, and voice processing. The repository also contains a large collection of prototype applications that demonstrate how AI can be applied to scenarios such as customer service, education, content generation, and mobile automation. These examples allow developers to quickly replicate and customize solutions for their own business needs.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 10
    Amical

    Amical

    Open Source AI Dictation App

    ...It leverages both local and cloud-based AI models, letting users seamlessly switch between providers for the ideal balance of speed, precision, and control, and understands the context of each app in use to automatically format text in a tone and style appropriate to the platform. Users can enhance transcription accuracy with custom vocabulary tailored to industry jargon, proper nouns, and personal terms, and set up personalized voice shortcuts to trigger workflows or dictate across applications. Amical supports multilingual dictation with over 50 languages at native-level accuracy. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    TTS WebUI

    TTS WebUI

    A single Gradio + React WebUI with extensions for ACE-Step

    TTS-WebUI is a unified Gradio + React web interface that brings together a large ecosystem of text-to-speech, voice conversion, and audio generation models under a single UI. It supports a wide range of models such as Bark, MusicGen, Tortoise, RVC, StyleTTS2, ParlerTTS, CosyVoice, XTTSv2, Stable Audio, SeamlessM4T, and many others, exposing them as interchangeable backends for speech and music synthesis. The project provides an installer that sets up Conda, Python environments, and all necessary dependencies, so users can focus on experimenting with voices instead of managing tooling. ...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 12
    Eliza

    Eliza

    Autonomous agents for everyone

    Build and deploy autonomous AI agents with consistent personalities across Discord, Twitter, and Telegram. Full support for voice, text, and media interactions. Built-in RAG memory system, document processing, media analysis, and autonomous trading capabilities. Supports multiple AI models including Llama, GPT-4, and Claude. Create custom actions, add new platform integrations, and extend functionality through a modular plugin system. Full TypeScript support.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 13
    Operit AI

    Operit AI

    Powerful Android AI agent with tools, automation, and Linux shell

    Operit is a full-featured AI assistant and agent platform designed specifically for Android devices, aiming to go far beyond traditional chat-based interfaces. It integrates deep system-level capabilities with a wide range of tools, allowing the AI to perform real tasks such as file management, automation, and system control directly on the device. A standout aspect of the project is its built-in Ubuntu 24 environment, which enables users to run Linux commands, scripts, and development tools...
    Downloads: 14 This Week
    Last Update:
    See Project
  • 14
    Gradient Bang

    Gradient Bang

    Gradient Bang is an online multiplayer universe

    ...The project serves both as a prototype and a conceptual playground for testing how conversational AI systems behave when embedded into dynamic, game-like environments rather than static chat interfaces. It leverages the broader Pipecat architecture for multimodal and conversational AI orchestration, meaning that interactions can potentially extend beyond text into voice, events, and real-time systems.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    Ainee

    Ainee

    Ainee - AI Notetaking and Learning Companion

    Ainee is your ultimate AI-powered notetaking and learning companion. Capture lecture notes in real-time and effortlessly transform audio, text, files, and YouTube videos into formatted notes, mindmaps, quizzes, flashcards, podcasts, and more. Explore our AI meeting note taker, AI notes, video transcript generator, PDF to AI converter, and AI flashcard maker. Enhance your learning with our AI voice recorder, article summarizer AI, and AI quiz generator. Additionally, share your knowledge base with others to foster the flow of information and help new users benefit from collective insights. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Amica

    Amica

    Amica is an open source interface for interactive communication

    ...Under the hood, Amica leverages modern web and desktop technologies: three.js and three-vrm for 3D rendering, Transformers.js for running models in the browser, Whisper and Silero VAD for speech recognition and voice-activity detection, and a variety of LLM backends such as llama.cpp servers, ChatGPT-compatible APIs, Ollama, KoboldCpp, and others. It also integrates multiple text-to-speech providers, including ElevenLabs, OpenAI, Coqui, RVC, and AllTalkTTS.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 17
    TTS-Vue

    TTS-Vue

    Microsoft speech synthesis tool, built with Electron

    TTS-Vue is a desktop text-to-speech application built with Electron, Vue, ElementPlus, and Vite, focused on using Microsoft’s official Speech API for high-quality neural synthesis. It wraps the Microsoft TTS WebSocket interface in a clean UI so users can paste or load text, choose voices, tweak parameters, and export audio without touching raw API calls. The app supports SSML (Speech Synthesis Markup Language), letting power users specify fine-grained control over pronunciation, pauses,...
    Downloads: 85 This Week
    Last Update:
    See Project
  • 18
    Chat with GPT

    Chat with GPT

    An open-source ChatGPT app with a voice

    ...Users can review past chat sessions, modify system prompts, and adjust model parameters such as temperature to control response creativity. The platform also integrates speech capabilities by connecting to text-to-speech systems and speech recognition engines, enabling voice-based conversations with the AI assistant. Additional features include message editing, response regeneration, and the ability to share conversations through public links.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    onepoint

    onepoint

    Assistant tool that integrates coding, writing, and reading functions

    Onepoint is an open-source AI assistant based on Electron, designed to create the ultimate desktop productivity tool. Its initial goal was to develop a smart floating window similar to Apple's intelligent assistant that does not take up desktop space or system performance and can be quickly accessed through global hotkeys for user convenience. With ChatGPT technology, users can continuously train onepoint to generate and reconstruct content with greater accuracy (onpoint), thereby improving...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
Auth0 Logo