Search Results for "audio processing" - Page 2

200 projects for "audio processing" with 1 filter applied:

Filter: BSD (OS)

1

Scriberr

Self-hosted AI audio transcription

Scriberr is a self-hosted AI-powered transcription platform designed to convert audio and video into highly accurate text while prioritizing privacy and local processing. Unlike cloud-based transcription services, Scriberr runs entirely on the user’s machine, ensuring that sensitive recordings are never sent to third-party servers and remain fully under user control. It leverages modern speech recognition models such as Whisper and other advanced architectures to deliver precise transcripts with word-level timing and speaker identification. ...

Downloads: 4 This Week

Last Update: 2026-03-19
See Project
2

VERT.sh

The next-generation file converter

VERT is a modern, privacy-focused file conversion platform that leverages WebAssembly to perform conversions entirely on the user’s device rather than relying on cloud-based processing. Built with Svelte and TypeScript, it provides a clean and responsive interface for converting a wide variety of file types, including images, audio, video, and documents. One of its defining characteristics is its local-first approach, which eliminates the need to upload files to external servers, thereby improving both privacy and performance. ...

Downloads: 2 This Week

Last Update: 2026-04-08
See Project
3

PyAV

Pythonic bindings for FFmpeg's libraries

...While powerful, it requires a solid understanding of FFmpeg concepts, as it prioritizes flexibility and control over abstraction. Overall, PyAV is a robust tool for developers building advanced video and audio processing systems in Python.

Downloads: 0 This Week

Last Update: 2026-04-24
See Project
4

ffmpeg-commander

A web-based GUI for quickly generating common FFmpeg command-line

...The interface is inspired by tools like HandBrake, aiming to lower the barrier to entry for FFmpeg usage. Overall, it acts as a bridge between ease of use and powerful multimedia processing capabilities.

Downloads: 6 This Week

Last Update: 2026-04-28
See Project
5

Ultravox

Fast multimodal LLM for real-time voice interaction and AI apps

Ultravox is an open source multimodal large language model designed specifically for real-time voice-based interactions. It is built to process both text and spoken audio directly, eliminating the need for a separate speech recognition stage and enabling more seamless conversational experiences. Ultravox works by combining text prompts with encoded audio inputs, allowing it to understand spoken language alongside written instructions in a unified pipeline. Internally, it leverages pretrained...

Downloads: 0 This Week

Last Update: 2026-03-18
See Project
6

VideoCaptioner

AI-powered tool for generating, optimizing, and translating subtitles

VideoCaptioner is an open source AI-powered subtitle processing tool designed to simplify the workflow of creating subtitles for videos. It integrates speech recognition, language processing, and translation technologies to automatically generate and refine subtitles from video or audio sources. VideoCaptioner uses speech-to-text engines such as Whisper variants to transcribe spoken content and convert it into subtitle text with accurate timestamps.

Downloads: 15 This Week

Last Update: 13 hours ago
See Project
7

FFmate

FFmate is a modern and powerful automation layer

FFmate is a graphical utility designed to simplify the use of FFmpeg by providing an intuitive interface for building and executing multimedia processing commands. It allows users to perform tasks such as transcoding, trimming, and format conversion without needing to memorize command-line syntax. The tool dynamically generates FFmpeg commands based on user input, making complex workflows more accessible. It supports a wide range of audio and video formats, enabling flexible media processing. FFmate is designed for both beginners and advanced users, offering a balance between simplicity and customization. ...

Downloads: 0 This Week

Last Update: 2026-05-02
See Project
8

SALMONN family

A suite of advanced multi-modal LLMs

SALMONN is a family of advanced multi-modal large language models (LLMs) developed by ByteDance and designed to handle and integrate multiple data modalities (e.g. text, audio, video) rather than just plain text. The repository bundles different branches targeting specialized tasks (e.g. video-SALMONN, speech-quality assessment, general multimodal tasks), suggesting that the project is modular and extensible across domains. SALMONN aims to push the frontier of multi-modal AI by allowing models...

Downloads: 0 This Week

Last Update: 2026-05-14
See Project
9

FastRTC

The Python library for real-time communication

FastRTC is a Python library designed to simplify real-time communication (RTC), especially for audio and video streaming applications. It abstracts away much of the complexity that typically comes with implementing WebRTC by providing a simple interface, such as a Stream class, that can be mounted within a web backend (for example a FastAPI application). This makes it particularly well suited for building real-time voice (or video) interfaces for applications such as AI assistants, live chat, or collaborative audio/video tools. ...

Downloads: 1 This Week

Last Update: 2025-11-28
See Project
10

Orpheus TTS

Towards Human-Sounding Speech

...It is designed to produce human-like speech with natural intonation, emotion, and rhythm, targeting quality comparable to or better than many closed-source systems. The project ships both pretrained and finetuned English models, as well as a family of multilingual models released as a research preview, and includes data-processing scripts so users can train or finetune their own variants. Inference is provided through a Python package that uses vLLM under the hood for high-throughput, low-latency generation, including streaming examples that show how to generate audio chunks in real time. The maintainers provide Colab notebooks, a standardized prompting format, and one-click deployment via Baseten for production-grade, FP8/FP16 optimized inference with ~200 ms streaming latency.

Downloads: 6 This Week

Last Update: 2025-12-05
See Project
11

edge-tts

Use Microsoft Edge's online text-to-speech service from Python

edge-tts is a Python module and command-line tool that gives you direct access to Microsoft Edge’s online text-to-speech service without needing the Edge browser, Windows, or any API key. It wraps the same cloud voices used by Edge, exposing them through a simple CLI (edge-tts, edge-playback) and a Python API, so you can script high-quality speech generation in your own applications. The tool lets you list available voices, specify locale and voice name, and generate audio files in common...

Downloads: 35 This Week

Last Update: 2026-03-22
See Project
12

FFmpegCommand

Command library suitable for Android. It implements audio and video

FFmpegCommand is a graphical utility designed to simplify the generation and execution of FFmpeg commands for multimedia processing tasks. It provides an interface where users can configure parameters such as codecs, bitrates, and formats without manually writing command-line instructions. The tool dynamically builds FFmpeg commands based on user selections, making complex workflows more accessible. It supports common operations such as transcoding, trimming, and format conversion. ...

Downloads: 1 This Week

Last Update: 2026-05-03
See Project
13

Agili Hacker Podcast

AI tool that turns Hacker News posts into daily podcast updates

Hacker Podcast is an AI-powered project that turns top Hacker News stories into a Chinese podcast. It automatically fetches trending posts each day, processes the content with AI, and generates concise summaries before converting them into audio. This creates a hands-free way to stay updated on tech, startups, and developer discussions without reading long threads. Hacker Podcast combines content aggregation, natural language processing, and text-to-speech to deliver clear and digestible updates. Users can listen through web interfaces or podcast platforms, while also accessing written summaries for deeper reading. ...

Downloads: 2 This Week

Last Update: 2 days ago
See Project
14

AnalysisAVP

Encode decode, rgb yuv h264 aac flv mp4 rtmp

AnalysisAVP is a comprehensive educational repository focused on audio and video technology concepts, providing structured knowledge across multimedia systems and processing pipelines. It covers foundational topics such as encoding, decoding, color formats like RGB and YUV, and widely used codecs including H.264 and AAC. The project also explores media container formats like MP4 and FLV, along with streaming protocols such as RTMP and WebRTC, offering a broad understanding of media transmission. ...

Downloads: 0 This Week

Last Update: 2026-04-27
See Project
15

Music-bot

A complete code to download for a cool Discord music bot

Music-bot is a Discord bot designed to stream and manage music playback within voice channels, providing users with an interactive audio experience. It supports playing music from various online sources, including streaming platforms and direct URLs. The bot includes queue management features that allow users to add, remove, and reorder tracks during playback. It integrates audio processing tools to ensure smooth streaming and consistent playback quality. Music-bot also supports commands for controlling playback, such as pause, resume, skip, and volume adjustment. ...

Downloads: 0 This Week

Last Update: 2026-04-27
See Project
16

Markdownify MCP Server

Convert files and web content into clean, usable Markdown easily

...It also allows retrieval of existing Markdown files, making it useful for documentation, research, and AI-assisted workflows. By standardizing content into Markdown, it helps unify inputs across different sources for better processing and integration with AI tools and developer environments.

Downloads: 0 This Week

Last Update: 2026-05-02
See Project
17

TADA

Open Source Speech Language Model

TADA is an open-source speech-language modeling framework designed to unify spoken audio and text representations within a single generative architecture. The system focuses on aligning speech and text streams using a dual-alignment mechanism that synchronizes the acoustic signal with its textual representation. By modeling both modalities together, the framework allows developers to build systems capable of generating, understanding, and transforming speech and language simultaneously. This...

Downloads: 0 This Week

Last Update: 2026-03-24
See Project
18

NanoBoyAdvance

A cycle-accurate Nintendo Game Boy Advance emulator

NanoBoyAdvance is a cycle-accurate Game Boy Advance emulator that prioritizes precision and correctness in replicating original hardware behavior. It is designed to emulate the GBA at a very low level, including CPU timing, DMA operations, graphics processing, and memory behavior, ensuring that even edge cases and obscure hardware quirks are faithfully reproduced. The emulator achieves extremely high compatibility, passing multiple hardware test suites and accurately running games that rely on precise timing conditions. In addition to accuracy, it introduces enhancements such as a high-quality audio mixer that improves sound output without altering internal emulation behavior. ...

Downloads: 5 This Week

Last Update: 2026-05-10
See Project
19

Live API Web Console

A React-based starter app for using the Live API over WebSockets

...It ships with demo branches that show grounded search, function calling, and visualization—one example has the model calling a function that renders Vega/Altair graphs directly in the UI. Under the hood there’s an event-emitting WebSocket client, an audio in/out processing layer, and a minimal scaffolded view so you can focus on your app logic rather than wiring.

Downloads: 0 This Week

Last Update: 2025-10-14
See Project
20

Pipecat

Framework for building real-time voice and multimodal AI agents

Pipecat is an open source Python framework designed for building real-time voice and multimodal conversational AI agents. It provides developers with tools to orchestrate complex pipelines that combine speech recognition, language models, audio processing, and speech synthesis into a cohesive conversational system. Pipecat focuses on low-latency interactions so voice conversations with AI feel natural and responsive during live use. Pipecat allows applications to integrate multiple AI services and transports, enabling flexible deployment across different environments and communication channels. ...

Downloads: 0 This Week

Last Update: 2026-05-16
See Project
21

AutoSubs

Instantly generate AI-powered subtitles on your device

...Users can customize subtitle styling, adjust timing, and export results in multiple formats, making it suitable for content creators, filmmakers, and editors. AutoSubs is designed with performance in mind, offering efficient processing through a Rust-based backend and supporting multiple operating systems including Windows, macOS, and Linux.

Downloads: 12 This Week

Last Update: 2026-04-30
See Project
22

MediaPipe Solutions

Cross-platform, customizable ML solutions

MediaPipe is an open-source framework developed by Google for building cross-platform machine learning pipelines that process audio, video, and other streaming data in real time. The system provides developers with tools and reusable components that allow them to combine multiple machine learning models with preprocessing and postprocessing logic into efficient perception pipelines. These pipelines can run on a wide variety of platforms including mobile devices, desktop systems, web...

Downloads: 1 This Week

Last Update: 2026-04-23
See Project
23

ffmpeg_develop_doc

2023, the latest audio and video learning materials, projects

ffmpeg_develop_doc is a curated repository that aggregates a comprehensive collection of learning resources related to FFmpeg and multimedia development. It includes command references, technical articles, academic papers, tutorials, and example projects covering audio and video processing concepts. The repository is structured as a knowledge base, offering materials on encoding, decoding, streaming protocols, and real-time media systems. It also contains interview preparation resources and practical case studies, making it useful for both learning and professional development. In addition to documentation, it links to open-source projects and implementation examples that demonstrate real-world usage of FFmpeg. ...

Downloads: 0 This Week

Last Update: 2026-04-24
See Project
24

Spring AI Alibaba Examples

Spring AI Alibaba examples for building and testing AI apps

...It is designed to help developers understand core concepts, explore practical implementations, and follow best practices when building AI-powered systems using the Spring ecosystem. Each module focuses on a specific use case such as chat, image processing, audio handling, graph workflows, and retrieval-augmented generation. The examples highlight how to integrate AI models, manage prompts, handle memory, and build multi-model or multi-agent workflows. Developers can explore individual project folders for detailed instructions and implementation guidance. Spring AI Alibaba Examples also supports experimentation through playground modules and encourages contributions to expand real-world AI use cases and improve development practices.

1 Review

Downloads: 2 This Week

Last Update: 2 days ago
See Project
25

clip-js

Online video editor built with Next.js, Remotion, and FFmpeg

clip-js is a browser-based video editor built with modern web technologies such as Next.js and Remotion, designed to provide real-time editing and rendering directly in the browser. It enables users to create and edit video compositions using a timeline interface, combining video, audio, images, and text layers into a single project. The system uses a WebAssembly port of FFmpeg to perform high-quality rendering, allowing export of videos without relying on server-side processing. It includes interactive controls for trimming, splitting, and arranging media elements with precise timing. The editor supports dynamic adjustments such as opacity, positioning, and layering to fine-tune compositions. ...

Downloads: 0 This Week

Last Update: 2026-04-27
See Project

© 2026 Slashdot Media. All Rights Reserved.
