Search Results for "audio processing" - Page 3

Showing 374 open source projects for "audio processing"

Filtered by OS: Linux

1

Oboe

Oboe is a C++ library that makes it easy to build high-performance

Oboe is a C++ library for building high-performance audio apps on Android, providing a unified, low-latency API over AAudio and OpenSL ES. It abstracts device and API-version differences so developers can focus on audio processing instead of platform quirks. The library emphasizes minimal latency and glitch-free playback/recording via tuned buffer strategies and callback-driven I/O. It supports features like floating-point audio, channel configuration, sample-rate negotiation, and stream sharing to match device capabilities. ...

Downloads: 0 This Week

Last Update: 2025-10-09
See Project
2

FFmpeg Batch AV Converter

FFmpeg Batch AV Converter is a graphical front-end for FFmpeg designed to simplify advanced multimedia processing through an intuitive interface while preserving full access to FFmpeg’s capabilities. It allows users to perform complex encoding, conversion, and editing operations using drag-and-drop workflows instead of command-line input. The application supports both single and batch processing, enabling users to handle large volumes of media files efficiently. It includes tools for...

Downloads: 2 This Week

Last Update: 2026-04-24
See Project
3

Competent Audio

Machine graph audio engine for computer games

...
- Mixers combine audio signals and optionally perform signal processing.
- Sinks send audio signals to an output device.

Stereo and mono sound output is supported via a slightly customized version of libsoundio 2.0. Audio clips can have arbitrary channel counts, and can be queued for streaming or dynamic music. Competent Audio (CA) contains a very simple embedded VM for running custom signal processors, allowing you to add custom DSP code (currently assembly language only) without compiling native code. ...

Downloads: 0 This Week

Last Update: 2026-01-10
See Project
4

Verticals v3

Automated YouTube Shorts pipeline

...The pipeline emphasizes automation, allowing users to produce short-form content at scale with minimal manual intervention. It integrates FFmpeg and other media processing tools to handle video transformations, resizing, and encoding. The system also supports adding overlays, captions, and audio enhancements to improve engagement. Designed for creators and developers, it enables repeatable workflows for generating social media content efficiently. Its modular structure allows customization of each stage in the pipeline, making it adaptable to different content strategies.

Downloads: 1 This Week

Last Update: 5 days ago
See Project
5

Handy STT

A free, open source, and extensible speech-to-text application

Handy is a free, open-source, offline speech-to-text application built for privacy, accessibility, and extensibility. Developed using Tauri (Rust + React/TypeScript), it runs natively across Windows, macOS, and Linux while performing local speech recognition without sending any audio to cloud servers. Handy allows users to start transcription instantly using a configurable keyboard shortcut—press to record, release to transcribe—and automatically pastes the resulting text into any active...

Downloads: 46 This Week

Last Update: 2026-04-27
See Project
6

VERT.sh

The next-generation file converter

VERT is a modern, privacy-focused file conversion platform that leverages WebAssembly to perform conversions entirely on the user’s device rather than relying on cloud-based processing. Built with Svelte and TypeScript, it provides a clean and responsive interface for converting a wide variety of file types, including images, audio, video, and documents. One of its defining characteristics is its local-first approach, which eliminates the need to upload files to external servers, thereby improving both privacy and performance. ...

Downloads: 3 This Week

Last Update: 2026-04-08
See Project
7

ScreenPipe

AI app store powered by 24/7 desktop history. Open source.

Screenpipe is an AI app store powered by continuous desktop history recording. It operates entirely locally, offering developers a platform to build, distribute, and monetize AI applications that leverage comprehensive contextual data from users' desktop activities.

Downloads: 33 This Week

Last Update: 13 hours ago
See Project
8

LiveAvatar

Streaming Real-time Audio-Driven Avatar Generation

LiveAvatar is an open-source research and implementation project that provides a unified framework for real-time, streaming, interactive avatar video generation driven by audio and other control signals. It implements techniques from state-of-the-art diffusion-based avatar modeling to support infinite-length continuous video generation with low latency, enabling interactive AI avatars that maintain continuity and realism over extended sessions. The project co-designs algorithms and system optimizations, such as block-wise autoregressive processing and fast sampling strategies, to deliver real-time frame rates (e.g., ~45 FPS on appropriate GPU clusters) while handling non-stop generation without quality degradation. ...

Downloads: 0 This Week

Last Update: 17 hours ago
See Project
9

miniaudio

Audio playback and capture library written in C,

miniaudio is written in C with no dependencies except the standard library and should compile cleanly on all major compilers without the need to install any additional development packages. All major desktop and mobile platforms are supported. miniaudio gives you complete flexibility. With the low-level API, just initialize a connection to the device and send or receive raw audio data. The modular design of miniaudio allows you to use the low-level API without compromising your ability to...

Downloads: 3 This Week

Last Update: 2026-03-03
See Project
10

DALI

A GPU-accelerated library containing highly optimized building blocks

The NVIDIA Data Loading Library (DALI) is a library for data loading and pre-processing to accelerate deep learning applications. It provides a collection of highly optimized building blocks for loading and processing image, video and audio data. It can be used as a portable drop-in replacement for built-in data loaders and data iterators in popular deep learning frameworks. Deep learning applications require complex, multi-stage data processing pipelines that include loading, decoding, cropping, resizing, and many other augmentations. ...

Downloads: 1 This Week

Last Update: 2026-04-16
See Project
11

LTX-Video

Official repository for LTX-Video

LTX-Video is a DiT-based (diffusion transformer) video generation model from Lightricks. It generates video from text prompts and image inputs.

Downloads: 2 This Week

Last Update: 2026-01-11
See Project
12

PyAV

Pythonic bindings for FFmpeg's libraries

...While powerful, it requires a solid understanding of FFmpeg concepts, as it prioritizes flexibility and control over abstraction. Overall, PyAV is a robust tool for developers building advanced video and audio processing systems in Python.

Downloads: 0 This Week

Last Update: 2026-04-24
See Project
13

KrillinAI

Video translation and dubbing tool powered by LLMs

...It integrates several stages of the pipeline: video acquisition (either from local files or remote via download tools), speech recognition (ASR), subtitle segmentation and alignment, machine translation (with context-aware translation to preserve semantics), and voice cloning + text-to-speech (TTS) to produce dubbed audio tracks. KrillinAI supports both landscape and portrait videos, which makes it suitable for a wide range of platforms — from YouTube to TikTok or other vertical-video sites — and ensures correct formatting and layout for the final video. The tool offers “one-click” workflows and desktop versions, lowering the barrier for users who may not be familiar with video editing or audio processing pipelines.

Downloads: 10 This Week

Last Update: 2025-11-28
See Project
14

Scriberr

Self-hosted AI audio transcription

Scriberr is a self-hosted AI-powered transcription platform designed to convert audio and video into highly accurate text while prioritizing privacy and local processing. Unlike cloud-based transcription services, Scriberr runs entirely on the user’s machine, ensuring that sensitive recordings are never sent to third-party servers and remain fully under user control. It leverages modern speech recognition models such as Whisper and other advanced architectures to deliver precise transcripts with word-level timing and speaker identification. ...

Downloads: 3 This Week

Last Update: 2026-03-19
See Project
15

VideoCaptioner

AI-powered tool for generating, optimizing, and translating subtitles

VideoCaptioner is an open source AI-powered subtitle processing tool designed to simplify the workflow of creating subtitles for videos. It integrates speech recognition, language processing, and translation technologies to automatically generate and refine subtitles from video or audio sources. VideoCaptioner uses speech-to-text engines such as Whisper variants to transcribe spoken content and convert it into subtitle text with accurate timestamps.

Downloads: 20 This Week

Last Update: 2 days ago
See Project
16

JamTools

JamTools is a cross-platform gadget set software

JamTools is a multifunctional desktop utility suite designed to provide a collection of tools for productivity, media processing, and system enhancements within a single application. It integrates various features such as file management, multimedia handling, and system utilities into a unified interface. The project emphasizes ease of use while offering advanced functionality for handling common tasks efficiently. It includes support for media-related operations, often leveraging FFmpeg for processing video and audio content. ...

Downloads: 0 This Week

Last Update: 2026-04-28
See Project
17

Ultravox

Fast multimodal LLM for real-time voice interaction and AI apps

Ultravox is an open source multimodal large language model designed specifically for real-time voice-based interactions. It is built to process both text and spoken audio directly, eliminating the need for a separate speech recognition stage and enabling more seamless conversational experiences. Ultravox works by combining text prompts with encoded audio inputs, allowing it to understand spoken language alongside written instructions in a unified pipeline. Internally, it leverages pretrained...

Downloads: 0 This Week

Last Update: 2026-03-18
See Project
18

Speech Note

Speech Note Linux app. Note taking, reading and translating

...It combines speech-to-text, text-to-speech, and machine translation in a single interface, allowing users to dictate notes, listen back to them, and translate them without ever sending data to the cloud. All processing is done locally, which means audio, text, and translations never leave the device, emphasizing strong privacy guarantees. The application supports multiple STT engines such as Coqui STT (DeepSpeech fork), Vosk, whisper.cpp, Faster Whisper, and april-asr, giving users flexibility in accuracy, speed, and hardware requirements. For text-to-speech, it can plug into a wide range of engines including espeak-ng, MBROLA, Piper, RHVoice, Coqui TTS, Mimic 3, WhisperSpeech, Kokoro, Parler-TTS, F5-TTS, and even classic S.A.M., making it highly customizable in terms of voices and languages.

Downloads: 23 This Week

Last Update: 2026-04-15
See Project
19

ffmpeg-commander

A web-based GUI for quickly generating common FFmpeg command-line

...The interface is inspired by tools like HandBrake, aiming to lower the barrier to entry for FFmpeg usage. Overall, it acts as a bridge between ease of use and powerful multimedia processing capabilities.

Downloads: 5 This Week

Last Update: 2026-04-28
See Project
20

FFmate

FFmate is a modern and powerful automation layer

FFmate is a graphical utility designed to simplify the use of FFmpeg by providing an intuitive interface for building and executing multimedia processing commands. It allows users to perform tasks such as transcoding, trimming, and format conversion without needing to memorize command-line syntax. The tool dynamically generates FFmpeg commands based on user input, making complex workflows more accessible. It supports a wide range of audio and video formats, enabling flexible media processing. FFmate is designed for both beginners and advanced users, offering a balance between simplicity and customization. ...

Downloads: 0 This Week

Last Update: 2026-05-02
See Project
21

SALMONN family

A suite of advanced multi-modal LLMs

SALMONN is a family of advanced multi-modal large language models (LLMs) developed by ByteDance — designed to handle and integrate multiple data modalities (e.g. text, audio, video) rather than just plain text. The repository bundles different branches targeting specialized tasks (e.g. video-SALMONN, speech-quality assessment, general multimodal tasks), suggesting that the project is modular and extensible across domains. SALMONN aims to push the frontier of multi-modal AI by allowing models...

Downloads: 0 This Week

Last Update: 2026-05-14
See Project
22

MoviePy

Video editing with Python

MoviePy is a Python module for video editing, which can be used for basic operations (like cuts, concatenations, title insertions), video compositing (a.k.a. non-linear editing), video processing, or to create advanced effects. It can read and write the most common video formats, including GIF. MoviePy is open source software originally written by Zulko and released under the MIT licence. It works on Windows, Mac, and Linux, with Python 2 or Python 3. The code is hosted on GitHub, where...

Downloads: 20 This Week

Last Update: 2025-05-21
See Project
23

Datasets

Hub of ready-to-use datasets for ML models

Datasets is a library for easily accessing and sharing datasets, and evaluation metrics for Natural Language Processing (NLP), computer vision, and audio tasks. Load a dataset in a single line of code, and use our powerful data processing methods to quickly get your dataset ready for training in a deep learning model. Backed by the Apache Arrow format, process large datasets with zero-copy reads without any memory constraints for optimal speed and efficiency. ...

Downloads: 0 This Week

Last Update: 2026-04-27
See Project
24

Music-bot

A complete code to download for a cool Discord music bot

Music-bot is a Discord bot designed to stream and manage music playback within voice channels, providing users with an interactive audio experience. It supports playing music from various online sources, including streaming platforms and direct URLs. The bot includes queue management features that allow users to add, remove, and reorder tracks during playback. It integrates audio processing tools to ensure smooth streaming and consistent playback quality. Music-bot also supports commands for controlling playback, such as pause, resume, skip, and volume adjustment. ...

Downloads: 1 This Week

Last Update: 2026-04-27
See Project
25

FastRTC

The Python library for real-time communication

FastRTC is a Python library designed to simplify real-time communication (RTC), especially for audio and video streaming applications. It abstracts away much of the complexity that typically comes with implementing WebRTC by providing a simple interface — e.g. a Stream class — that can be mounted within a web backend (for example a FastAPI application). This makes it particularly well suited for building real-time voice (or video) interfaces for applications such as AI assistants, live chat, or collaborative audio/video tools. ...

Downloads: 1 This Week

Last Update: 2025-11-28
See Project
