Search Results for "audio linux" - Page 11


Showing 850 open source projects for "audio linux"


Filter: Python

1

cdcover

cdcover creates inlay sheets for jewel CD cases. It is written in Python and uses Python-TK to provide an easy-to-use GUI. cdcover can access a CDDB server to get title and track info for audio CDs.

Downloads: 5 This Week

Last Update: 2022-10-05
See Project
2

WaveRNN

WaveRNN Vocoder + TTS

WaveRNN is a PyTorch implementation of DeepMind’s WaveRNN vocoder, bundled with a Tacotron-style TTS front end to form a complete text-to-speech stack. As a vocoder, WaveRNN models raw audio with a compact recurrent neural network that can generate high-quality waveforms more efficiently than many traditional autoregressive models. The repository includes scripts and code for preprocessing datasets such as LJSpeech, training Tacotron to produce mel spectrograms, training WaveRNN on those...

Downloads: 0 This Week

Last Update: 2025-11-28
See Project
3

nlpaug

Data augmentation for NLP

This Python library helps you augment NLP data for your machine learning projects. Visit this introduction to understand Data Augmentation in NLP. An Augmenter is the basic element of augmentation, while a Flow is a pipeline that orchestrates multiple augmenters together.

Downloads: 0 This Week

Last Update: 2024-08-03
See Project
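As a rough illustration of the Augmenter and Flow idea described for nlpaug above, here is a minimal sketch. The `Flow` class, the `swap_synonyms` function, and the tiny synonym table are hypothetical stand-ins written for this example; they are not nlpaug's actual classes or API.

```python
from typing import Callable, Iterable, List

# Hypothetical type: an augmenter is any function that maps text to text.
Augmenter = Callable[[str], str]


def swap_synonyms(text: str) -> str:
    """Replace words using a tiny, made-up synonym table."""
    table = {"quick": "fast", "happy": "glad"}
    return " ".join(table.get(word, word) for word in text.split())


class Flow:
    """Hypothetical pipeline that applies several augmenters in order."""

    def __init__(self, augmenters: Iterable[Augmenter]) -> None:
        self.augmenters: List[Augmenter] = list(augmenters)

    def augment(self, text: str) -> str:
        # Each augmenter receives the output of the previous one.
        for augmenter in self.augmenters:
            text = augmenter(text)
        return text


flow = Flow([str.lower, swap_synonyms])
print(flow.augment("The Quick fox"))
```

Treating each augmenter as a plain text-to-text function keeps the pipeline trivial to compose; a real library would likely add randomness and configuration on top.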
4

AI Atelier

A version of the AI art creation software based on Disco Diffusion

Based on Disco Diffusion, we have developed a Chinese & English version of the AI art creation software "AI Atelier". We offer both Text-To-Image models (Disco Diffusion and VQGAN+CLIP) and Text-To-Text models (GPT-J-6B and GPT-NEOX-20B) as options. Making available complete source code of licensed works and modifications, which include larger works using a licensed work, under the same license. Copyright and license notices must be preserved. When a modified version is used to provide a...

Downloads: 0 This Week

Last Update: 2023-03-23
See Project
5

castero

TUI podcast client for the terminal

castero is a TUI podcast client for the terminal.

Downloads: 6 This Week

Last Update: 2024-09-18
See Project
6

Tensorflow Transformers

State of the art faster Transformer with Tensorflow 2.0

Imagine auto-regressive generation to be 90x faster. tf-transformers (Tensorflow Transformers) is designed to harness the full power of Tensorflow 2 and is built specifically for Transformer-based architectures. These models can be applied to text, for tasks like text classification, information extraction, question answering, summarization, translation, and text generation, in over 100 languages; to images, for tasks like image classification, object detection, and segmentation; and to audio, for tasks like...

Downloads: 0 This Week

Last Update: 2023-03-23
See Project
7

NWT - Pytorch (wip)

Implementation of NWT, audio-to-video generation, in Pytorch

Implementation of NWT, audio-to-video generation, in Pytorch. The paper proposes a new discrete latent representation named Memcodes, which can be succinctly described as a type of multi-head hard-attention to learned memory (codebook) key/values. The authors claim that it needs fewer codes and smaller codebook dimensions to achieve better reconstructions.

Downloads: 0 This Week

Last Update: 2023-03-22
See Project
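To make the phrase "multi-head hard-attention to learned memory (codebook) key/values" in the NWT entry more concrete, here is a minimal pure-Python sketch. It assumes dot-product scoring and an input vector split evenly across heads; both are assumptions made for illustration, and all function names are hypothetical. It is not the paper's or the repository's implementation.

```python
from typing import List, Sequence

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def hard_attention_head(
    query: Sequence[float],
    keys: Sequence[Sequence[float]],
    values: Sequence[Sequence[float]],
) -> Vector:
    """Pick the single value whose key scores highest (hard, not soft, attention)."""
    scores = [dot(query, key) for key in keys]
    best = max(range(len(scores)), key=scores.__getitem__)
    return list(values[best])


def multi_head_hard_attention(
    x: Sequence[float],
    keys_per_head: Sequence[Sequence[Sequence[float]]],
    values_per_head: Sequence[Sequence[Sequence[float]]],
) -> Vector:
    """Split x into one query per head and concatenate each head's selected value."""
    num_heads = len(keys_per_head)
    head_dim = len(x) // num_heads  # assumes len(x) divides evenly by num_heads
    out: Vector = []
    for h in range(num_heads):
        query = x[h * head_dim:(h + 1) * head_dim]
        out.extend(hard_attention_head(query, keys_per_head[h], values_per_head[h]))
    return out


# Toy codebook: two heads, two entries per head (values chosen arbitrarily).
keys = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]]
values = [[[10.0, 10.0], [20.0, 20.0]], [[30.0, 30.0], [40.0, 40.0]]]
print(multi_head_hard_attention([1.0, 0.0, 0.0, 1.0], keys, values))
```

Because every head selects exactly one codebook entry, the output is fully described by one index per head, which is what makes the representation discrete.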
8

AugLy

A data augmentations library for audio, image, text, and video

AugLy is a data augmentations library that currently supports four modalities (audio, image, text & video) and over 100 augmentations. Each modality’s augmentations are contained within its own sub-library. These sub-libraries include both function-based and class-based transforms, composition operators, and have the option to provide metadata about the transform applied, including its intensity. AugLy is a great library to utilize for augmenting your data in model training, or to evaluate...

Downloads: 0 This Week

Last Update: 2022-03-29
See Project
9

Piano transcription

Task of transcribing piano recordings into MIDI files

Piano transcription is an open-source high-resolution piano transcription system by ByteDance that converts raw audio recordings of piano performance into symbolic MIDI files — detecting note onsets, offsets, pitch, velocity, and even pedal usage. The system is implemented in Python (PyTorch) and is capable of accurate transcription of polyphonic piano recordings, even with complex passages and pedal techniques, making it suitable for classical piano music. By using this transcription tool,...

Downloads: 7 This Week

Last Update: 2025-12-02
See Project
10

Music Source Separation

Separate audio recordings into individual sources

Music Source Separation is a PyTorch-based open-source implementation for the task of separating a music (or audio) recording into its constituent sources — for example isolating vocals, instruments, bass, accompaniment, or background from a mixed track. It aims to give users the ability to take any existing song and decompose it into separate stems (vocals, accompaniment, etc.), or to train custom separation models on their own datasets (e.g. for speech enhancement, instrument isolation, or...

Downloads: 2 This Week

Last Update: 2025-12-02
See Project
11

youtube-dl

Download videos from YouTube (and more sites)

youtube-dl is a command-line program to download videos from YouTube and a few more sites. It requires the Python interpreter (2.6, 2.7, or 3.2+), and it is not platform specific. We also provide a Windows executable that includes Python. youtube-dl should work on your Unix box, on Windows, or on Mac OS X. It is released to the public domain, which means you can modify it, redistribute it or use it however you like. youtube-dl is a powerful, open-source command-line program designed to...

1 Review

Downloads: 86 This Week

Last Update: 2024-12-30
See Project
12

StreamTuner2 ♪♬#

Internet radio directory browser

Streamtuner2 is an internet radio station and video browser. It simply lists stations in categories from different directories and launches your preferred media apps for playback. It is now built in Python, but retains UI similarity with the original StreamTuner 0.99.

6 Reviews

Downloads: 56 This Week

Last Update: 2022-02-22
See Project
13

SVoice (Speech Voice Separation)

We provide a PyTorch implementation of the paper Voice Separation

SVoice is a PyTorch-based implementation of Facebook Research’s study on speaker voice separation as described in the paper “Voice Separation with an Unknown Number of Multiple Speakers.” This project presents a deep learning framework capable of separating mixed audio sequences where several people speak simultaneously, without prior knowledge of how many speakers are present. The model employs gated neural networks with recurrent processing blocks that disentangle voices over multiple...

Downloads: 1 This Week

Last Update: 2 days ago
See Project
14

VoiceFixer

General Speech Restoration

VoiceFixer is a machine-learning framework for “speech restoration”: given a degraded or distorted audio recording — with noise, clipping, low sampling rate, reverberation, or other artifacts — it attempts to recover high-fidelity, clean speech. The architecture works in two stages: first an analysis stage that tries to extract “clean” intermediate features from the noisy audio (e.g. removing noise, denoising, dereverberation, upsampling), and then a neural vocoder-based synthesis stage that...

Downloads: 11 This Week

Last Update: 2025-11-28
See Project
15

Mocking Bird

Clone a voice in 5 seconds to generate arbitrary speech in real-time

MockingBird is an open-source voice cloning and real-time speech generation toolkit that lets you clone a speaker’s voice from a short audio sample (reportedly as little as 5 seconds) and then synthesize arbitrary speech in that voice. It builds on deep-learning based TTS / voice-cloning technology (in the lineage of projects such as Real-Time-Voice-Cloning), but extends it with support for Mandarin Chinese and multiple Chinese speech datasets — broadening its applicability beyond English....

1 Review

Downloads: 1 This Week

Last Update: 2023-03-23
See Project
16

pytube

A lightweight, dependency-free Python library

Pytube is a lightweight, dependency-free Python library that enables downloading YouTube videos and audio streams with minimal setup. It supports video resolution selection, progressive or adaptive streams, and caption downloads. Pytube is ideal for automation scripts, archiving tools, and media applications that need to interface with YouTube content programmatically.

Downloads: 2 This Week

Last Update: 2025-07-01
See Project
17

pydatascope

Software oscilloscope using Python and tkinter

Software oscilloscope using Python and tkinter. Supports multiple sources: socket, file, audio, USB. Displays data by samples, time or frequency. Scales the input automatically or manually.

1 Review

Downloads: 4 This Week

Last Update: 2021-09-25
See Project
18

VidCutter

A modern yet simple multi-platform video cutter and joiner

A modern, simple to use, constantly evolving and hella fast MEDIA CUTTER + JOINER w/ frame-accurate SmartCut technology, chapter support, media stream selection for audio + subtitle channels and blackdetect video filter support to automatically detect scene changes or skip commercials in digital TV recordings. Chapter support allows scene chapter names to be included in final media metadata. NOTE: results will only work in media players that support chapters. Flatpak release includes the...

Downloads: 15 This Week

Last Update: 2024-06-29
See Project
19

DeepSpeech

Open source embedded speech-to-text engine

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers. It uses a model trained by machine learning techniques based on Baidu's Deep Speech research paper, and uses Google's TensorFlow to make the implementation easier. A pre-trained English model is available for use and can be downloaded following the...

Downloads: 11 This Week

Last Update: 2021-04-08
See Project
20

Denoiser

Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020)

Denoiser is a real-time speech enhancement model operating directly on raw waveforms, designed to clean noisy audio while running efficiently on CPU. It uses a causal encoder-decoder architecture with skip connections, optimized with losses defined both in the time domain and frequency domain to better suppress noise while preserving speech. Unlike models that operate on spectrograms alone, this design enables lower latency and coherent waveform output. The implementation includes data...

Downloads: 2 This Week

Last Update: 2025-10-07
See Project
21

Pydub

Manipulate audio with a simple and easy high-level interface

Manipulate audio with a simple and easy high-level interface. You can pass an optional bitrate argument to export using any syntax ffmpeg supports. Any further arguments supported by ffmpeg can be passed as a list in a 'parameters' argument, with switch first, argument second. Note that no validation takes place on these parameters, and you may be limited by what your particular build of ffmpeg/avlib supports. You can open and save WAV files with pure Python. For opening and saving non-wav...

Downloads: 1 This Week

Last Update: 2021-10-08
See Project
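The Pydub entry above describes a bitrate option and a 'parameters' list passed through to ffmpeg "with switch first, argument second". The sketch below shows one way such options could be turned into an ffmpeg command line. The helper `build_export_args` is hypothetical and written only for this example; it is not Pydub's code, and it only builds the argument list without running ffmpeg.

```python
from typing import List, Optional, Sequence


def build_export_args(
    src: str,
    dst: str,
    fmt: str,
    bitrate: Optional[str] = None,
    parameters: Optional[Sequence[str]] = None,
) -> List[str]:
    """Assemble an ffmpeg-style argument list (hypothetical helper)."""
    # -y overwrites the output, -i names the input, -f sets the output format.
    args = ["ffmpeg", "-y", "-i", src, "-f", fmt]
    if bitrate is not None:
        # Passed through unchanged, so any syntax ffmpeg accepts works, e.g. "192k".
        args += ["-b:a", bitrate]
    if parameters is not None:
        # Extra switches are appended without validation: switch first, argument second.
        args += list(parameters)
    args.append(dst)
    return args


cmd = build_export_args(
    "in.wav", "out.mp3", "mp3", bitrate="192k", parameters=["-ac", "1"]
)
print(" ".join(cmd))
```

Here `-ac 1` asks ffmpeg for a single audio channel. Since nothing checks the extra switches, a typo in the list would only surface when ffmpeg itself rejects the command, which matches the "no validation" caveat in the description.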
22

NetEase-MusicBox

NetEase cloud music command line version

The high-quality command-line version of NetEase Cloud Music is simple, elegant, and smooth, and is written in Python. 320kbps high-quality music. Song, artist, and album search. 22 NetEase song charts. NetEase new album recommendations. NetEase featured playlists. NetEase anchor radio. Private playlists and daily recommendations. DJing and a local collection you can add to at any time. Playback progress and play mode display. Now-playing and desktop lyrics display. Song comment display. One-click to enter the...

Downloads: 0 This Week

Last Update: 2021-06-25
See Project
23

HiFi-GAN

Generative Adversarial Networks for Efficient and High Fidelity Speech

HiFi-GAN is a GAN-based neural vocoder designed to generate high-fidelity speech waveforms from mel spectrograms with exceptional efficiency. It introduces a generator architecture tailored to model the periodic structure of speech and a set of discriminators that focus on different scales and periods of the waveform to better capture naturalness. The model targets a sweet spot between sample quality and generation speed, outperforming many previous GAN vocoders while being far faster than...

Downloads: 1 This Week

Last Update: 2025-11-28
See Project
24

OpenDAFF

Directional Audio File Format

OpenDAFF is a free, open-source software package for directional audio data - like the directivity of microphones, speakers, as well as head-related transfer functions (HRTFs)

Downloads: 1 This Week

Last Update: 2021-01-08
See Project
25

Youtube Video Downloader

Youtube Video Downloader is an open source GUI tool

Youtube Video Downloader is an open source GUI tool for downloading YouTube videos. It is developed with Python, Qt, and the Pytube library, and it is a multi-threaded application. It downloads videos in the best available quality, including 720p, 480p, and 360p.

2 Reviews

Downloads: 6 This Week

Last Update: 2021-01-06
See Project