mega voice command free download

Showing 39 open source projects for "mega voice command"

Filters: Artificial Intelligence, Linux

1

sag

Like the macOS say command, but with a modern voice

sag is a command-line text-to-speech utility inspired by the macOS say command but powered by modern ElevenLabs voice synthesis technology. The project allows users to stream synthesized speech directly to speakers, save audio files, or list and manage available voices through a lightweight terminal interface. Designed for speed and convenience, sag supports voice selection, playback rate adjustments, output format inference, and configurable API endpoints for flexible deployment. ...

Downloads: 2 This Week

Last Update: 2026-06-11
See Project
2

Real-Time Voice Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time

...In the first stage, short audio clips are converted into a fixed-dimensional speaker embedding that captures voice characteristics; this embedding is then used by a Tacotron-style synthesizer to generate spectrograms from text, which a WaveRNN-based vocoder finally turns into audio. The repo includes both a command-line demo and a graphical “toolbox” application where you can load reference voices, type text, and hear the synthesized results interactively.

Downloads: 2 This Week

Last Update: 2026-03-09
See Project
3

OmniVoice

High-Quality Voice Cloning TTS for 600+ Languages

The OmniVoice project is a cutting-edge multilingual text-to-speech system designed to generate high-quality speech across more than 600 languages. Built on a diffusion language model-style architecture, it combines scalability with strong performance, enabling both natural-sounding voice synthesis and efficient inference speeds. One of its most notable capabilities is zero-shot voice cloning, allowing users to replicate a speaker’s voice using only a short reference audio clip. In addition, it supports voice design through configurable attributes such as gender, accent, pitch, and speaking style, giving users fine-grained control over generated speech. ...

Downloads: 16 This Week

Last Update: 2026-04-28
See Project
4

OpenAI-Compatible Edge-TTS API

Free, high-quality text-to-speech API endpoint to replace OpenAI

...A Docker image is provided for one-command deployment, and environment variables can be used to configure default voice, language, response format, authentication, and logging options.

Downloads: 1 This Week

Last Update: 2025-11-28
See Project
5

Audiblez

Generate audiobooks from e-books

Audiblez is a tool for generating high-quality .m4b audiobooks directly from .epub e-books using the Kokoro-82M neural text-to-speech model. It focuses on making audiobook creation easy and fast: from a single command, the tool splits an e-book into chapters, synthesizes audio for each section, and then merges the results into a structured audiobook with chapter-based WAV files and a final .m4b container. The Kokoro-82M model it uses is compact (82M parameters) yet natural sounding, trained...

Downloads: 12 This Week

Last Update: 2025-11-30
See Project
6

edge-tts

Use Microsoft Edge's online text-to-speech service from Python

edge-tts is a Python module and command-line tool that gives you direct access to Microsoft Edge’s online text-to-speech service without needing the Edge browser, Windows, or any API key. It wraps the same cloud voices used by Edge, exposing them through a simple CLI (edge-tts, edge-playback) and a Python API, so you can script high-quality speech generation in your own applications.

Downloads: 15 This Week

Last Update: 2026-03-22
See Project
7

SafeClaw

Chat with it via text and voice

SafeClaw is an open-source, entirely local alternative to cloud-based AI assistants like OpenClaw, enabling users to build a personal assistant that runs on their own machine without incurring API usage charges or exposing data to third-party services. It emphasizes privacy and predictability by using traditional programming, rule-based intent parsing, and established machine learning tools rather than large language models, meaning there are no per-token API costs and deterministic...

Downloads: 0 This Week

Last Update: 2026-05-09
See Project
8

Harbor LLM

Run a full local LLM stack with one command using Docker

Harbor is an open source, containerized toolkit designed to simplify running local large language model (LLM) environments. It combines a CLI and companion app to launch backends, frontends, and supporting services with minimal setup. With a single command, users can start preconfigured tools like Ollama and Open WebUI, enabling chat, workflows, and integrations immediately. Harbor supports multiple inference engines, including llama.cpp and vLLM, and connects them seamlessly to user interfaces. It also includes tools for web retrieval, image generation, voice interaction, and workflow automation. ...

Downloads: 0 This Week

Last Update: 4 days ago
See Project
9

Flowly AI

Flowly is 100x faster than OpenClaw

...Flowly also includes voice capabilities, enabling real-time phone interactions using speech-to-text and text-to-speech systems. Overall, it provides a powerful, extensible, and privacy-focused alternative to cloud-based AI assistants.

Downloads: 11 This Week

Last Update: 2026-03-29
See Project
10

MLX-Audio

A text-to-speech, speech-to-text and speech-to-speech library

MLX-Audio is a speech library built on Apple’s MLX framework and optimized for Apple Silicon machines (M-series Macs). It focuses on text-to-speech and speech-to-speech workflows, with APIs and a command-line interface that make it easy to generate high-quality audio from text. Because it uses MLX and targets Apple Silicon, inference is fast and can take advantage of hardware acceleration and quantization for efficient on-device performance. The project provides a straightforward CLI (mlx_audio.tts.generate) as well as a Python API for programmatic generation of audio, including parameters for voice choice, speed, language hints, output format, and sample rate. ...

Downloads: 1 This Week

Last Update: 2026-06-06
See Project
11

MiniMax-MCP

Official MiniMax Model Context Protocol (MCP) server

MiniMax-MCP is the official Model Context Protocol (MCP) server for accessing MiniMax’s multimodal generative APIs from MCP-compatible clients. It acts as a bridge between tools like Claude Desktop, Cursor, Windsurf, OpenAI Agents, and the MiniMax platform, exposing capabilities such as text-to-speech, voice cloning, image generation, text-to-image, video generation, image-to-video, text-to-video, and music generation. The server is written in Python and distributed under the MIT license,...

Downloads: 0 This Week

Last Update: 2026-05-21
See Project
12

Violin

Open-source Video Translation Skill

...It can be used from the command line, through a FastAPI web app, or as a Claude Code skill. Violin supports multilingual workflows and is useful for creators, educators, localization teams, and developers building automated video translation pipelines. It is especially practical for turning lectures, tutorials, interviews, demos, and social videos into accessible content for wider audiences.

Downloads: 0 This Week

Last Update: 2026-05-19
See Project
13

annyang!

Speech recognition for your site

annyang is a tiny JavaScript library that lets your visitors control your site with voice commands. annyang supports multiple languages, has no dependencies, weighs just 2 KB, and is free to use. annyang understands commands with named variables, splats, and optional words. Use named variables for one-word arguments in your command. Use splats to capture multi-word text at the end of your command (greedy). Use optional words or phrases to mark part of the command as optional. annyang works with all browsers, progressively enhancing those that support SpeechRecognition while leaving users with older browsers unaffected. ...
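The three pattern forms described above can be illustrated with a small, self-contained Python sketch. This is not annyang's code or API (annyang itself is a JavaScript library): `compile_command` and `match_command` are hypothetical helper names, and the sketch assumes the markers `:name` for a named variable, `*name` for a splat, and `(word)` for an optional word or phrase.

```python
import re

# A token is either a parenthesised optional phrase or a run of non-space characters.
_TOKEN = re.compile(r"\([^)]*\)|\S+")
_NAME = r"[A-Za-z_]\w*"


def compile_command(pattern):
    """Hypothetical helper: turn a command pattern into a compiled regex.

    Every fragment starts with a single space, so the utterance must be
    normalised the same way (see match_command).
    """
    fragments = []
    for token in _TOKEN.findall(pattern):
        if token.startswith("("):
            # Optional word or phrase: the whole group may be absent.
            words = token[1:-1].split()
            inner = "".join(" " + re.escape(w) for w in words)
            fragments.append("(?:" + inner + ")?")
        elif re.fullmatch(":" + _NAME, token):
            # Named variable: captures exactly one word.
            fragments.append(r" (?P<%s>\S+)" % token[1:])
        elif re.fullmatch(r"\*" + _NAME, token):
            # Splat: greedily captures the remaining words.
            fragments.append(r" (?P<%s>.+)" % token[1:])
        else:
            # Plain word: must appear literally (case-insensitive).
            fragments.append(" " + re.escape(token))
    return re.compile("".join(fragments), re.IGNORECASE)


def match_command(pattern, utterance):
    """Return the captured variables as a dict, or None if there is no match."""
    normalised = "".join(" " + word for word in utterance.split())
    match = compile_command(pattern).fullmatch(normalised)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(match_command("show me :item", "show me cats"))
    print(match_command("search for *query", "search for open source voice tools"))
    print(match_command("say hello (to) :name", "say hello to Bob"))
```

In this sketch a named variable refuses multi-word input (`"show me two cats"` does not match `"show me :item"`), while a splat absorbs everything that follows it, which is why it belongs at the end of a command.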

Downloads: 1 This Week

Last Update: 2026-03-11
See Project
14

Whisper-WebUI

A web UI for easy subtitle generation using Whisper models

Whisper WebUI is an open-source browser-based interface that simplifies the use of Whisper speech recognition models by providing an intuitive graphical environment for transcription, translation, and subtitle generation. Built with Gradio, it allows users to upload audio or video files, process them locally, and generate accurate text outputs without relying on command-line tools. The platform integrates optimized implementations such as faster-whisper, significantly improving transcription speed and reducing memory usage compared to standard models. It supports multiple input sources including local files, YouTube content, and microphone input, making it versatile for different workflows. Whisper WebUI also includes advanced preprocessing and postprocessing features such as voice activity detection, background music separation, and speaker diarization, enabling more accurate and structured outputs.

Downloads: 8 This Week

Last Update: 2026-03-18
See Project
15

Open Interpreter

A natural language interface for computers

Open Interpreter is an open-source tool that provides a natural-language interface for interacting with your computer. It lets large language models (LLMs) run code locally (Python, JavaScript, shell, etc.), so you can ask your computer in plain language to do tasks such as data analysis, file manipulation, and browsing (“chat with your computer”), with safeguards. It runs locally or via configured remote LLM servers/inference backends, giving flexibility to use models you trust or have...

Downloads: 23 This Week

Last Update: 2025-09-12
See Project
16

Eris

A Node.js Discord library

A Node.js wrapper for interfacing with Discord. You will need Node.js 10.4+. If you need voice support, you will also need Python 2.7 and a C++ compiler. Create a directory for your bot, and change to that directory in your command line. If you want more recent changes (at the expense of stability), you can install the beta builds instead. Eris supports a few optional libraries that could potentially improve bot performance but may require additional dependencies.

Downloads: 0 This Week

Last Update: 2024-09-22
See Project
17

Moltis

A Rust-native claw you can trust

Moltis is an open-source personal AI assistant platform written in Rust that is designed to run as a fully self-hosted, local-first agent environment. It compiles the entire assistant stack, including the web interface, model routing, memory, and tools, into a single self-contained binary with no external runtime dependencies. The system supports multiple large language model providers alongside local models, enabling users to maintain privacy while still accessing cloud capabilities when...

Downloads: 0 This Week

Last Update: 2026-06-04
See Project
18

VoiceClip

VoiceClip is a user-assistance application

VoiceClip is a user-assistance application designed to integrate smoothly into your working environment, providing fast and efficient access to a range of functions through voice and text commands. Presented as a toolbar that always stays visible in the foreground, VoiceClip aims to simplify common tasks, improve productivity, and make it easier to interact with your operating system and with advanced artificial intelligence technologies.

1 Review

Downloads: 1 This Week

Last Update: 2025-04-30
See Project
19

KoboldCpp

Run GGUF models easily with a UI or API. One File. Zero Install.

KoboldCpp is easy-to-use AI text-generation software for GGML and GGUF models, inspired by the original KoboldAI. It's a single self-contained distributable that builds on llama.cpp and adds many powerful features.

Downloads: 325 This Week

Last Update: 9 hours ago
See Project
20

eGuideDog free software for the blind

eGuideDog project develops free software for the blind. Currently, we focus on WebSpeech, Ekho TTS and WebAnywhere.

16 Reviews

Downloads: 161 This Week

Last Update: 14 hours ago
See Project
21

Maia

MAIA (MyApp Intelligence Artificial) is designed to provide a foundation for building your own voice-controlled assistant with Python. It uses various libraries and modules for speech recognition, text-to-speech synthesis, and custom functionality.

Downloads: 0 This Week

Last Update: 2024-04-21
See Project
22

Audio Webui

A web UI for various audio-related neural networks

...For more advanced users, it exposes a rich set of command-line flags to control behavior such as skipping installation, disabling venv, changing model cache directories, sharing Gradio links, setting passwords, and specifying themes or ports.

Downloads: 0 This Week

Last Update: 2025-11-28
See Project
23

MyWingman

Personal AI Assistant for Windows and Linux

...Powered by the Facebook BlenderBot-1B-Distill model, Wingman excels in open-domain conversations, providing engaging and human-like interactions. 🔊 Play your favorite songs on YouTube or any online platform with just a voice command. 🌐 Open websites instantly, letting you access information and resources in a snap. 🔍 Perform quick Google searches and get relevant results without lifting a finger. 📖 Access the vast knowledge of Wikipedia with ease, as Wingman fetches you insightful information. 📸 Capture screenshots effortlessly, allowing you to save and share important moments...

Downloads: 1 This Week

Last Update: 2023-07-12
See Project
24

min(DALL·E)

min(DALL·E) is a fast, minimal port of DALL·E Mini to PyTorch

This is a fast, minimal port of Boris Dayma's DALL·E Mini (with mega weights). It has been stripped down for inference and converted to PyTorch. The only third-party dependencies are numpy, requests, pillow and torch. The required models will be downloaded to models_root if they are not already there. Set the dtype to torch.float16 to save GPU memory. If you have an Ampere architecture GPU, you can use torch.bfloat16. Set the device to either "cuda" or "cpu". Once everything has finished...

Downloads: 0 This Week

Last Update: 2022-08-04
See Project
25

VoiceOver

VoiceOver is a web application that allows you to transcribe audio

VoiceOver is a web application that lets you transcribe English audio and listen to the result in another voice. Choose a source: an audio file (.wav), in English only. Several algorithms then handle the transcription. Finally, listen to the generated transcription in a male or a female voice; the choice is yours.

1 Review

Downloads: 0 This Week

Last Update: 2023-03-24
See Project