Showing 6596 open source projects for "audio linux"

  • 1
    edge-tts

    Use Microsoft Edge's online text-to-speech service from Python

    edge-tts is a Python module and command-line tool that gives you direct access to Microsoft Edge’s online text-to-speech service without needing the Edge browser, Windows, or any API key. It wraps the same cloud voices used by Edge, exposing them through a simple CLI (edge-tts, edge-playback) and a Python API, so you can script high-quality speech generation in your own applications. The tool lets you list available voices, specify locale and voice name, and generate audio files in common...
    Downloads: 35 This Week
  • 2
    NanoBoyAdvance

    A cycle-accurate Nintendo Game Boy Advance emulator

    NanoBoyAdvance is a cycle-accurate Game Boy Advance emulator that prioritizes precision and correctness in replicating original hardware behavior. It is designed to emulate the GBA at a very low level, including CPU timing, DMA operations, graphics processing, and memory behavior, ensuring that even edge cases and obscure hardware quirks are faithfully reproduced. The emulator achieves extremely high compatibility, passing multiple hardware test suites and accurately running games that rely...
    Downloads: 15 This Week
  • 3
    BlogWizard

    Generate blog articles from video or audio

    BlogWizard is a demo/utility project built on top of Groq’s LLM infrastructure that converts video or audio content into well-structured blog posts, enabling creators to repurpose multimedia content into text — useful for SEO, accessibility, or reaching audiences that prefer reading. The tool uses transcription (e.g. via Whisper) to extract text from audio/video, then runs an LLM-based generation pipeline to transform that content into coherent, readable blog-format posts — with sections,...
    Downloads: 0 This Week
  • 4
    ChatTTS_colab

    One-click deployment (including offline integration package)

    ChatTTS_colab is a wrapper project around the ChatTTS model that focuses on “one-click” deployment, especially in Google Colab. It provides an integrated offline bundle and scripts for Windows and macOS so users can run ChatTTS locally without wrestling with complex environment setup. The repository includes Colab notebooks that launch a Gradio-based web UI and expose streaming TTS, making it possible to listen to generated audio as it is produced. A distinctive feature is the “voice gacha”...
    Downloads: 1 This Week
  • 5
    Label Studio

    Label Studio is a multi-type data labeling and annotation tool

    The most flexible data annotation tool. Quickly installable. Build custom UIs or use pre-built labeling templates. Detect objects in images, with bounding boxes, polygons, circles, and keypoints supported. Partition an image into multiple segments. Use ML models to pre-label and optimize the process. Label Studio is an open-source data labeling tool. It lets you label data types like audio, text, images, videos, and time series with a simple and straightforward UI and export to various model formats. It can...
    Downloads: 26 This Week
  • 6
    Amazon Chime SDK React Components

    Chime React Component Library with integrations with the Amazon SDK

    The Amazon Chime SDK makes it easy to add collaborative audio calling, video calling, and screen share features to web applications by using the same infrastructure services that power millions of Amazon Chime online meetings. The Amazon Chime SDK React Component Library supplies client-side state management and reusable UI components for common web interfaces used in audio and video conferencing applications, including: video tile grids, microphone activity indicators, and call controls....
    Downloads: 1 This Week
  • 7
    stt

    Voice Recognition to Text Tool

    stt is a standalone speech recognition tool that locally converts spoken content in audio or video files into textual formats without requiring internet access, giving users control over their data and reducing reliance on external APIs. It leverages open-source speech models such as Faster-Whisper to recognize and transcribe human speech into plain text, structured JSON objects, or subtitle files with time codes, making it suitable for both personal and professional transcription tasks. The...
    Downloads: 7 This Week
  • 8
    OmniTools

    Self-hosted collection of powerful web-based tools for everyday tasks

    OmniTools is a self-hosted web application that bundles a large collection of everyday utilities into a single clean interface you can run on your own infrastructure. It’s designed to replace the random assortment of “free online tools” people use for quick tasks, while avoiding ads, tracking, and the need to upload sensitive files to unknown servers. A key design choice is that file processing happens entirely on the client side, meaning your data stays in your browser instead of being sent...
    Downloads: 14 This Week
  • 9
    OpenAI Go

    The official Go library for the OpenAI API

    OpenAI Go is the official Go client library for accessing the OpenAI API. It enables developers to integrate OpenAI’s models and features into Go applications with a clean and idiomatic interface. The library provides support for a wide range of API endpoints including chat completions, assistants, embeddings, image generation, audio processing, and batch jobs. It includes built-in tools for handling authentication, managing API requests, and parsing structured responses. The repository also...
    Downloads: 7 This Week
  • 10
    WPPConnect

    WPPConnect is an open source project

    WPPConnect is an open-source project developed by the JavaScript community that exports functions from WhatsApp Web to Node.js. These functions can support the creation of any kind of interaction, such as customer service, media sending, phrase-based artificial intelligence recognition, and many other things; use your imagination. We are the best WhatsApp automation solution you have been looking for. We are a team that started an open-source project that performs automation on...
    Downloads: 7 This Week
  • 11
    GLM-TTS

    Controllable & emotion-expressive zero-shot TTS

    GLM-TTS is an advanced text-to-speech synthesis system built on large language model technologies that focuses on producing high-quality, expressive, and controllable spoken output, including features like emotion modulation and zero-shot voice cloning. It uses a two-stage architecture where a generative LLM first converts text into intermediate speech token sequences and then a Flow-based neural model converts those tokens into natural audio waveforms, enabling rich prosody and voice...
    Downloads: 4 This Week
  • 12
    HunyuanCustom

    Multimodal-Driven Architecture for Customized Video Generation

    HunyuanCustom is a multimodal video customization framework by Tencent Hunyuan, aimed at generating customized videos featuring particular subjects (people, characters) under flexible conditions, while maintaining subject/identity consistency. It supports conditioning via image, audio, video, and text, and can perform subject replacement in videos, generate avatars speaking given audio, or combine multiple subject images. The architecture builds on HunyuanVideo, with added modules for...
    Downloads: 0 This Week
  • 13
    SameBoy

    Game Boy and Game Boy Color emulator written in C

    SameBoy is a user-friendly, powerful, and open-source Game Boy, Game Boy Color, and Super Game Boy emulator for macOS, Windows, and Unix-like platforms. SameBoy is extremely accurate and includes a wide range of both powerful debugging features and user-facing features, making it ideal for both casual players and developers. Of course, SameBoy also has every feature one would expect from an emulator – from save states to scaling filters. Supports Game Boy (DMG), Game Boy Pocket and Light (MGB),...
    Downloads: 17 This Week
  • 14
    Unsloth Studio

    Unified web UI for training and running open models locally

    Unsloth Studio is a web-based interface for running and training AI models locally with a unified and user-friendly experience. It allows users to work with a wide range of models for text, audio, vision, embeddings, and more without relying heavily on cloud infrastructure. Built on top of the Unsloth framework, it focuses on high-performance training with reduced VRAM usage and faster speeds compared to traditional methods. The platform supports fine-tuning, pretraining, and reinforcement...
    Downloads: 16 This Week
  • 15
    MARS5

    MARS5 speech model (TTS) from CAMB.AI

    MARS5-TTS is CAMB.AI’s open-source English speech model designed for high-quality text-to-speech and voice emulation. It uses a two-stage architecture that combines an autoregressive (AR) model with a non-autoregressive (NAR) model, giving it both expressiveness and speed. The model is built to handle prosodically challenging content such as sports commentary, anime dialogue, and other high-energy or highly varied speech patterns with realistic rhythm and intonation. To control speaker...
    Downloads: 0 This Week
  • 16
    Matcha-TTS

    A fast TTS architecture with conditional flow matching

    Matcha-TTS is a non-autoregressive neural text-to-speech architecture that uses conditional flow matching to generate speech quickly while maintaining natural quality. It models speech as an ODE-based generative process, and conditional flow matching lets it reach high-quality audio in only a few synthesis steps, which greatly reduces latency compared to score-matching diffusion approaches. The model is fully probabilistic, so it can generate diverse realizations of the same text while still...
    Downloads: 15 This Week
  • 17
    Rackula

    Drag and drop rack visualizer

    Rackula is a browser-based rack layout designer aimed at homelabbers, audio/video technicians, and equipment organizers who want a visual way to plan and document physical device racks. It runs entirely client-side with no backend server required, making it lightweight, fast, and easy to self-host or run locally without external dependencies. Users can drag and drop devices into customizable rack spaces, annotate equipment, set unit sizes, and manage complex layouts as their setup evolves....
    Downloads: 101 This Week
  • 18
    SimpleX

    The first messaging platform operating without user identifiers

    Other apps have user IDs: Signal, Matrix, Session, Briar, Jami, Cwtch, etc. SimpleX does not, not even random numbers. This radically improves your privacy. The video shows how you connect to your friend via their one-time QR code, in person or via a video link. You can also connect by sharing an invitation link. Temporary anonymous pairwise identifiers: SimpleX uses temporary anonymous pairwise addresses and credentials for each user contact or group member. It allows delivering messages...
    Downloads: 129 This Week
  • 19
    Simple DirectMedia Layer

    Simple Directmedia Layer

    Simple DirectMedia Layer is a cross-platform development library designed to provide low-level access to audio, keyboard, mouse, joystick, and graphics hardware via OpenGL and Direct3D. It is used by video playback software, emulators, and popular games including Valve's award-winning catalog and many Humble Bundle games. SDL officially supports Windows, macOS, Linux, iOS, and Android. Support for other platforms may be found in the source code.
    Downloads: 34 This Week
  • 20
    Metadata Extractor

    Extracts Exif, IPTC, XMP, ICC and other metadata from image and video

    metadata-extractor is a Java library for reading metadata from media files. The library understands several formats of metadata, many of which may be present in a single image.
    Downloads: 17 This Week
  • 21
    Hypackel Engine

    JavaScript-based game engine designed to empower developers

    Hypackel Engine is a JavaScript-based 2D game engine designed to provide beginner-friendly tools for creating simple games such as platformers, RPGs, and top-down shooters. It focuses on accessibility by offering a lightweight and easy-to-integrate script that developers can import directly into web-based projects. The engine includes built-in systems for handling physics, collisions, rendering, and animation, allowing developers to focus more on gameplay logic rather than low-level...
    Downloads: 7 This Week
  • 22
    Open Notebook

    An Open Source implementation of Notebook LM with more flexibility

    Open Notebook is an open-source, privacy-focused alternative to Google’s Notebook LM that gives users full control over their research and AI workflows. Designed to be self-hosted, it ensures complete data sovereignty by keeping your content local or within your own infrastructure. The platform supports 16+ AI providers—including OpenAI, Anthropic, Ollama, Google, and LM Studio—allowing flexible model choice and cost optimization. Open Notebook enables users to organize and analyze...
    Downloads: 23 This Week
  • 23
    Web Scrobbler

    Scrobble music all around the web!

    Web Scrobbler helps music listeners to scrobble their online playback history. Web Scrobbler is a browser extension created for people who listen to music online through their browser, and would like to keep an updated playback history using scrobbling services, such as Last.fm, Libre.fm and ListenBrainz. Download and install the extension for your browser. You can use the download buttons above. Open the extension options, and expand the "Accounts" section, then sign in to a scrobbling...
    Downloads: 7 This Week
  • 24
    Competent Audio

    Machine graph audio engine for computer games

    Competent Audio (CA) is an audio engine suitable for video games. It is written in C, but is designed for interoperability with other languages. Windows and Linux binaries for x86 and amd64 are available. CA uses a machine graph model with support for arbitrary numbers of machines, limited only by the available system resources:
    - Samplers play back audio clips
    Downloads: 0 This Week
  • 25
    Catbird Linux

    Linux for content creation, web scraping, coding, and data analysis.

    ...Using Catbird Linux, it is possible to accomplish in-depth stock market analysis, track weather trends, follow social media sentiment, or do other tasks in data science. The system is programmer-friendly, ready for creating and running the tools you use to measure and understand your world. In addition to search and GPT tools, you have what you need to take notes, write reports or presentations, and record and edit audio or video.
    Downloads: 16 This Week