Showing 6656 open source projects for "audio linux"

  • 1
    Sonic Pi

    Sonic Pi is your free code-based music creation and performance tool

    Sonic Pi is a new kind of musical instrument. Instead of strumming strings or whacking things with sticks - you write code, live. Sonic Pi is a complete open source programming environment originally designed to explore and teach programming concepts within schools through the process of creating new sounds. In addition to being an engaging education resource it has evolved into an extremely powerful and performance-ready live coding instrument suitable for professional artists and DJs....
    Downloads: 17 This Week
  • 2
    Basic Pitch

    A lightweight audio-to-MIDI converter with pitch bend detection

    Basic Pitch is a Python library for Automatic Music Transcription (AMT), using a lightweight neural network developed by Spotify's Audio Intelligence Lab. It's small, easy to use, pip install-able and npm install-able via its sibling repo. Basic Pitch may be simple, but it is far from "basic"! basic-pitch is efficient and easy to use, and its multi-pitch support, its ability to generalize across instruments, and its note accuracy compete with much larger and more resource-hungry AMT systems....
    Downloads: 41 This Week
  • 3
    sherpa-onnx

    Speech-to-text, text-to-speech, and speaker recognition

    Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter.
    Downloads: 417 This Week
  • 4
    gm

    R Package for Music Score and Audio Generation

    Create music easily, and show musical scores and audio files in R Markdown documents, R Jupyter Notebooks and RStudio.
    Downloads: 0 This Week
  • 5
    Feishin

    A modern self-hosted music player

    Feishin is an open-source music player for self-hosted libraries. It is a desktop and web client that connects to music servers such as Navidrome and Jellyfin, letting users browse, search, and play their own collections.
    Downloads: 36 This Week
  • 6
    You-Get

    Dumb downloader that scrapes the web

    You-Get is a small command-line utility for downloading media (video, audio and images) from the Web when there are no other means to do so. It can download video and audio files from such popular web sites as YouTube, Twitter, Niconico, Vimeo, Flickr, Instagram and a whole lot more. You-Get is a great option for when you want to enjoy your favorite videos, audio or images from the internet without having to open any web browsers or get interrupted by ads. It’s also a good choice for...
    Downloads: 1 This Week
  • 7
    SenseVoice

    Multilingual speech recognition and audio understanding model

    SenseVoice is a speech foundation model designed to perform multiple voice understanding tasks from audio input. It provides capabilities such as automatic speech recognition, spoken language identification, speech emotion recognition, and audio event detection within a single system. SenseVoice is trained on more than 400,000 hours of speech data and supports over 50 languages for multilingual recognition tasks. It is built to achieve high transcription accuracy while maintaining efficient...
    Downloads: 8 This Week
  • 8
    Voicebox

    The open-source voice synthesis studio powered by Qwen3-TTS

    Voicebox is a local-first voice synthesis studio that aims to bring professional, DAW-like voice generation workflows to a desktop app while keeping models and voice data entirely on your machine. It positions itself as an open-source alternative to cloud voice platforms by emphasizing privacy, offline use, and freedom from subscriptions or usage caps. The tool supports downloading voice models, cloning voices from short audio samples, and generating speech locally, then organizing the...
    Downloads: 44 This Week
  • 9
    Lyrion Music Server

    Server for Squeezebox and compatible players

    ...Administration happens through a friendly web interface, with options for library rescans, playlist management, and performance tuning on small devices like the Raspberry Pi. Written largely in Perl and designed to be cross-platform, it runs happily on Linux, macOS, Windows, and many NAS appliances, making it a dependable hub for whole-home
    Downloads: 20 This Week
  • 10
    YTM Desktop - YouTube Music Desktop App

    A Desktop App for YouTube Music

    Free cross-platform desktop player for YouTube Music. Play, Pause, Stop, Previous, Next. Show or hide the window by double-pressing the global play/pause media button. Show a notification on track change. Media controls are embedded into the taskbar (on Windows). Background music playing. Minimize to the taskbar. See the lyrics of your favorite music. Settings to your choice. One click and done. You will be surprised at how easy it is. Always updated with the latest version. Control your music...
    Downloads: 87 This Week
  • 11
    LX Music Mobile

    A music app built with React Native

    This is a mobile music application developed with React Native, created by the author “lyswhut”. It targets Android devices (Android 5 and above) and is designed for listening to and managing music, using Redux for state management and integrating custom music sources. The project supports “data-sync” so that users can deploy a server and keep playlists or libraries in sync across devices. The README clearly states that the UI and default behaviour aren’t especially geared toward new users —...
    Downloads: 20 This Week
  • 12
    VibeVoice

    Open-source multi-speaker long-form text-to-speech model

    VibeVoice-1.5B is Microsoft’s frontier open-source text-to-speech (TTS) model designed for generating expressive, long-form, multi-speaker conversational audio such as podcasts. Unlike traditional TTS systems, it excels in scalability, speaker consistency, and natural turn-taking for up to 90 minutes of continuous speech with as many as four distinct speakers. A key innovation is its use of continuous acoustic and semantic speech tokenizers operating at an ultra-low frame rate of 7.5 Hz,...
    Downloads: 19 This Week
  • 13
    Spotifyd

    A Spotify daemon

    An open source Spotify client running as a UNIX daemon. Spotifyd streams music just like the official client, but is more lightweight and supports more platforms. Spotifyd also supports the Spotify Connect protocol, which makes it show up as a device that can be controlled by the official clients.
    Downloads: 7 This Week
  • 14
    AudioLM - Pytorch

    Implementation of AudioLM audio generation model in Pytorch

    Implementation of AudioLM, a Language Modeling Approach to Audio Generation out of Google Research, in Pytorch. It also extends the work for conditioning with classifier-free guidance with T5. This allows one to do text-to-audio or TTS, not offered in the paper. Yes, this means VALL-E can be trained from this repository. It is essentially the same. This repository now also contains an MIT-licensed version of SoundStream. It is also compatible with EnCodec, however, be aware that it...
    Downloads: 7 This Week
  • 15
    Swing Music

    Swing Music is a beautiful, self-hosted music player

    Swing Music is a beautiful, self-hosted music player and streaming server that lets you bring your personal audio library online with a modern browser-based interface, giving you a private alternative to mainstream streaming services. Designed to be both elegant and powerful, the project scans your local music files (like MP3s or FLACs), organizes metadata, and streams them on-demand to any device with a browser or its Android client. It includes features like folder browsing, playlist...
    Downloads: 10 This Week
  • 16
    LiveKit

    End-to-end stack for WebRTC. SFU media server and SDKs

    LiveKit is an open-source project that provides a scalable, multi-user conferencing system based on WebRTC, designed to offer real-time video, audio, and data capabilities for developers.
    Downloads: 22 This Week
  • 17
    cmus - C* Music Player

    Small, fast & powerful console music player for Unix-like systems

    cmus, also known as the C* Music Player, is a small yet fast and powerful console music player for Unix-like operating systems. It comes with a number of great features, such as gapless playback, ReplayGain support, MP3 and Ogg streaming, an easy-to-use directory browser, powerful playlist filters / live filtering and more. cmus also supports several input and output plugins. Input plugins include: Ogg Vorbis, MP3, FLAC, Opus, Musepack, WavPack, WAV, AAC, MP4, audio CD and more. Output plugins...
    Downloads: 6 This Week
  • 18
    AudioNotes

    Extract audio and video content and organize it into a Markdown note

    AudioNotes is an application (or proof-of-concept) that likely combines audio recording or playback with note-taking or annotation functionality — enabling users to record voice or audio and attach textual or timestamped notes, making it ideal for lectures, interviews, meetings, or personal memos. Such a tool offers a more expressive and flexible way to capture and revisit information: instead of just typed notes or raw audio, users get both audio context and structured notes. As an...
    Downloads: 0 This Week
  • 19
    Moshi

    A speech-text foundation model for real time dialogue

    Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec. Mimi processes 24 kHz audio, down to a 12.5 Hz representation with a bandwidth of 1.1 kbps, in a fully streaming manner (latency of 80ms, the frame size), yet performs better than existing, non-streaming, codecs like SpeechTokenizer (50 Hz, 4kbps), or SemantiCodec (50 Hz, 1.3kbps). Moshi models two streams of audio: one corresponds to Moshi, and...
    Downloads: 1 This Week
  • 20
    Handy STT

    A free, open source, and extensible speech-to-text application

    Handy is a free, open-source, offline speech-to-text application built for privacy, accessibility, and extensibility. Developed using Tauri (Rust + React/TypeScript), it runs natively across Windows, macOS, and Linux while performing local speech recognition without sending any audio to cloud servers. Handy allows users to start transcription instantly using a configurable keyboard shortcut—press to record, release to transcribe—and automatically pastes the resulting text into any active text field. Its backend leverages OpenAI’s Whisper models for GPU-accelerated speech recognition and Parakeet V3 for efficient CPU-only transcription with automatic language detection. ...
    Downloads: 91 This Week
  • 21
    od4knb Linux

    Linux® ISOs, Apps, Scripts and Howtos by Odie Pastunik

    The od4knb antiXradio distribution is based on antiX Linux® and runs very smoothly on an old i386 (32-bit) computer. It contains a lot of Dutch radio stations. You'll also find a lot of Howtos for novice Linux users on this site! [!] To continue, click on the Files tab, and scroll down....
    Downloads: 14 This Week
  • 22
    SuperCollider

    Audio server, programming language, and IDE for sound synthesis

    SuperCollider is a platform for audio synthesis and algorithmic composition, used by musicians, artists, and researchers working with sound. It is free and open source software available for Windows, macOS, and Linux. scsynth, a real-time audio server, forms the core of the platform. It features 400+ unit generators (“UGens”) for analysis, synthesis, and processing.
    Downloads: 3 This Week
  • 23
    Transformers

    State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX

    Hugging Face Transformers provides APIs and tools to easily download and train state-of-the-art pre-trained models. Using pre-trained models can reduce your compute costs, carbon footprint, and save you the time and resources required to train a model from scratch. These models support common tasks in different modalities. Text, for tasks like text classification, information extraction, question answering, summarization, translation, text generation, in over 100 languages. Images, for tasks...
    Downloads: 15 This Week
  • 24
    PersonaPlex

    PersonaPlex code

    PersonaPlex is an open-source real-time conversational speech AI model that goes beyond traditional text chat by providing full-duplex speech-to-speech interaction, meaning it can listen and talk at the same time instead of waiting for you to finish speaking before responding. This architectural approach eliminates awkward pauses and makes conversations feel much more human-like, with natural behaviors such as overlapping speech, interruptions, and fluent turn-taking, traits that traditional...
    Downloads: 8 This Week
  • 25
    ebook2audiobook

    Generate audiobooks from e-books, voice cloning & 1107+ languages

    ebook2audiobook is a tool to convert legally obtained eBooks (non-DRM) into fully narrated audiobooks, complete with chapters and metadata. It automates the pipeline: it reads the eBook file, splits it into appropriate segments (chapters, paragraphs), uses text-to-speech (TTS) models to synthesize audio, optionally applies voice cloning, and outputs a final audiobook — ideal for people who prefer listening over reading, or for accessibility purposes. The tool supports a wide array of...
    Downloads: 30 This Week