2041 projects for "audio for mac" with 1 filter applied:

  • Outgrown Windows Task Scheduler? Icon
    Outgrown Windows Task Scheduler?

    Free diagnostic identifies where your workflow is breaking down—with instant analysis of your scheduling environment.

    Windows Task Scheduler wasn't built for complex, cross-platform automation. Get a free diagnostic that shows exactly where things are failing and provides remediation recommendations. Interactive HTML report delivered in minutes.
    Download Free Tool
  • AI-generated apps that pass security review Icon
    AI-generated apps that pass security review

    Stop waiting on engineering. Build production-ready internal tools with AI—on your company data, in your cloud.

    Retool lets you generate dashboards, admin panels, and workflows directly on your data. Type something like “Build me a revenue dashboard on my Stripe data” and get a working app with security, permissions, and compliance built in from day one. Whether on our cloud or self-hosted, create the internal software your team needs without compromising enterprise standards or control.
    Try Retool free
  • 1
    Kimi-Audio

    Kimi-Audio

    Audio foundation model excelling in audio understanding

    Kimi-Audio is an ambitious open-source audio foundation model designed to unify a wide array of audio processing tasks — from speech recognition and audio understanding to generative conversation and sound event classification — within a single cohesive architecture. Instead of fragmenting work across specialized models, Kimi-Audio handles automatic speech recognition (ASR), audio question answering, automatic audio captioning, speech emotion recognition, and audio-to-text chat in one...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 2
    HeartMuLa

    HeartMuLa

    A Family of Open Sourced Music Foundation Models

    HeartMuLa is the open-source library and reference implementation for the HeartMuLa family of music foundation models, designed to support both music generation and music-related understanding tasks in a cohesive stack. At the center is HeartMuLa, a music language model that generates music conditioned on inputs like lyrics and tags, with multilingual support that broadens the range of lyric-driven use cases. The project also includes HeartCodec, a music codec optimized for high...
    Downloads: 34 This Week
    Last Update:
    See Project
  • 3
    Shairport Sync

    Shairport Sync

    AirPlay audio player

    Shairport Sync adds multi-room capability with audio synchronization. Shairport Sync is an AirPlay 1 audio player. Switch to the development branch for a version with limited AirPlay 2 functionality. Shairport Sync plays audio streamed from iTunes, iOS, Apple TV and macOS devices and AirPlay sources such as Quicktime Player and OwnTone, among others. Audio played by a Shairport Sync-powered device stays synchronized with the source and hence with similar devices playing the same source. In...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 4
    NovaSR

    NovaSR

    A lightning fast audio upsampler

    NovaSR is an extremely lightweight and high-performance audio upsampling model that transforms low-quality 16 kHz audio into clearer, high-fidelity 48 kHz audio with remarkable speed and efficiency. At only about 50 KB in size, the model is orders of magnitude smaller than typical audio super-resolution networks, yet it achieves high quality and realtime performance thanks to its compact architecture and efficient convolutional design. NovaSR is especially valuable for post-processing tasks...
    Downloads: 3 This Week
    Last Update:
    See Project
  • Atera all-in-one platform IT management software with AI agents Icon
    Atera all-in-one platform IT management software with AI agents

    Ideal for internal IT departments or managed service providers (MSPs)

    Atera’s AI agents don’t just assist, they act. From detection to resolution, they handle incidents and requests instantly, taking your IT management from automated to autonomous.
    Learn More
  • 5
    HLS.js

    HLS.js

    HLS.js is a JavaScript library that plays HLS in browsers

    HLS.js is a JavaScript library that implements an HTTP Live Streaming client. It relies on HTML5 video and MediaSource Extensions for playback. It works by transmuxing MPEG-2 Transport Stream and AAC/MP3 streams into ISO BMFF (MP4) fragments. Transmuxing is performed asynchronously using a Web Worker when available in the browser. HLS.js also supports HLS + fmp4, as announced during WWDC2016. HLS.js works directly on top of a standard HTML<video> element. HLS.js is written in ECMAScript6...
    Downloads: 21 This Week
    Last Update:
    See Project
  • 6
    MusicPlayer2

    MusicPlayer2

    Audio player that can play common audio formats

    MusicPlayer2 is a simple music-player application (or prototype) implemented in — presumably — a web or desktop environment, intended to give users a clean, functional interface for managing and playing audio files. The project likely implements basic playlist management, playback controls (play, pause, skip), and possibly UI features to browse or organize music. Because many smaller music-player projects aim for simplicity, MusicPlayer2 may focus on providing a lightweight,...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 7
    SFBAudioEngine

    SFBAudioEngine

    A powerhouse of audio functionality for macOS, iOS, and tvOS

    SFBAudioEngine is an advanced audio engine designed for macOS and iOS, focusing on high-quality playback, precise audio control, and support for a wide range of audio formats. Built for modern Apple platforms, it provides developers with a robust tool for integrating sophisticated audio functionalities into their applications. It emphasizes extensibility, performance, and clean API design.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    AudioCraft

    AudioCraft

    Audiocraft is a library for audio processing and generation

    AudioCraft is a PyTorch library for text-to-audio and text-to-music generation, packaging research models and tooling for training and inference. It includes MusicGen for music generation conditioned on text (and optionally melody) and AudioGen for text-conditioned sound effects and environmental audio. Both models operate over discrete audio tokens produced by a neural codec (EnCodec), which acts like a tokenizer for waveforms and enables efficient sequence modeling. The repo provides...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 9
    OpenAI.fm

    OpenAI.fm

    Code for openai.fm, a demo for the OpenAI Speech API

    OpenAI.fm is an official interactive demo application built to showcase the OpenAI Speech API and its advanced text-to-speech capabilities, providing developers and creators with a hands-on web interface to convert text into high-quality, customizable audio using state-of-the-art TTS models. Developed using Next.js and the OpenAI Speech API, this demo illustrates how the latest neural voice models can produce natural, expressive speech with adjustable styles and voices, highlighting features...
    Downloads: 11 This Week
    Last Update:
    See Project
  • Reach Your Audience with Rise Vision, the #1 Cloud Digital Signage Software Solution Icon
    Reach Your Audience with Rise Vision, the #1 Cloud Digital Signage Software Solution

    K-12 Schools, Higher Education, Businesses, Restaurants

    Rise Vision is the #1 digital signage company, offering easy-to-use cloud digital signage software compatible with any player across multiple screens. Forget about static displays. Save time and boost sales with 500+ customizable content templates for your screens. If you ever need help, get free training and exceptionally fast support.
    Learn More
  • 10
    AudioNotes

    AudioNotes

    Extract audio and video content and organize it into a Markdown note

    AudioNotes is an application (or proof-of-concept) that likely combines audio recording or playback with note-taking or annotation functionality — enabling users to record voice or audio and attach textual or timestamped notes, making it ideal for lectures, interviews, meetings, or personal memos. Such a tool offers a more expressive and flexible way to capture and revisit information: instead of just typed notes or raw audio, users get both audio context and structured notes. As an...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    SoniTranslate

    SoniTranslate

    Synchronized Translation for Videos

    SoniTranslate is a video translation and dubbing system that produces synchronized target-language audio tracks for existing video content. It provides a web UI built with Gradio, allowing users to upload a video, choose source and target languages, and then run a pipeline that handles transcription, translation and re-synthesis of speech. Under the hood, it uses advanced speech and diarization models to separate speakers, align audio with timecodes and respect subtitle timing, which lets...
    Downloads: 44 This Week
    Last Update:
    See Project
  • 12
    PersonaPlex

    PersonaPlex

    PersonaPlex code

    PersonaPlex is an open-source real-time conversational speech AI model that goes beyond traditional text chat by providing full-duplex speech-to-speech interaction, meaning it can listen and talk at the same time instead of waiting for you to finish speaking before responding. This architectural approach eliminates awkward pauses and makes conversations feel much more human-like, with natural behaviors such as overlapping speech, interruptions, and fluent turn-taking, traits that traditional...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 13
    SFML

    SFML

    Simple and Fast Multimedia Library

    SFML provides a simple interface to the various components of your PC, to ease the development of games and multimedia applications. It is composed of five modules: system, window, graphics, audio and network. Discover their features more in detail in the tutorials and the API documentation. With SFML, your application can compile and run out of the box on the most common operating systems: Windows, Linux, macOS and soon Android & iOS. Pre-compiled SDKs for your favorite OS are available on...
    Downloads: 52 This Week
    Last Update:
    See Project
  • 14
    bfxr

    bfxr

    Flash + AIR sound effects generator. Based on Sfxr.

    The bfxr project by increpare is a sound-effects generator tool originally built using Flash + AIR, based on the earlier Sfxr project. Its purpose is to enable users, especially game developers and sound designers, to quickly generate retro, 8-bit/“chiptune” style sound effects (“bleeps”, “booms”, “zaps”, etc.) without deep knowledge of audio signal processing. It offers an interactive GUI through which you can tweak many parameters (oscillators, envelopes, filters, etc.) to sculpt custom...
    Downloads: 23 This Week
    Last Update:
    See Project
  • 15
    WhisperLive

    WhisperLive

    A nearly-live implementation of OpenAI's Whisper

    WhisperLive is a “nearly live” implementation of OpenAI’s Whisper model focused on real-time transcription. It runs as a server–client system in which the server hosts a Whisper backend and clients stream audio to be transcribed with very low delay. The project supports multiple inference backends, including Faster-Whisper, NVIDIA TensorRT, and OpenVINO, allowing you to target GPUs and different CPU architectures efficiently. It can handle microphone input, pre-recorded audio files, and...
    Downloads: 11 This Week
    Last Update:
    See Project
  • 16
    Amphion

    Amphion

    Toolkit for audio, music, and speech generation

    Amphion is a toolkit from OpenMMLab dedicated to audio, music, and speech generation, aimed at both reproducible research and helping newcomers get started in generative audio. It provides standardized implementations and recipes for classic and state-of-the-art generative models in audio, including TTS, music generation, and voice conversion. A distinctive feature of Amphion is its emphasis on visualization: it offers interactive visualizations of model architectures and generation...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    WavTokenizer

    WavTokenizer

    SOTA discrete acoustic codec models with 40/75 tokens per second

    WavTokenizer is a state-of-the-art discrete acoustic codec designed specifically for audio language modeling, capable of compressing 24 kHz audio into just 40 or 75 tokens per second while preserving high perceptual quality. It is built to represent speech, music, and general audio with extremely low bitrate, making it ideal as a front-end for large audio language models like GPT-4o and similar architectures. The model uses a single-quantizer design together with temporal compression to...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    Oboe

    Oboe

    Oboe is a C++ library that makes it easy to build high-performance

    oboe is a C++ library for building high-performance audio apps on Android, providing a unified, low-latency API over AAudio and OpenSL ES. It abstracts device and API-version differences so developers can focus on audio processing instead of platform quirks. The library emphasizes minimal latency and glitch-free playback/recording via tuned buffer strategies and callback-driven I/O. It supports features like floating-point audio, channel configuration, sample-rate negotiation, and stream...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    HunyuanVideo-Avatar

    HunyuanVideo-Avatar

    Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model

    HunyuanVideo-Avatar is a multimodal diffusion transformer (MM-DiT) model by Tencent Hunyuan for animating static avatar images into dynamic, emotion-controllable, and multi-character dialogue videos, conditioned on audio. It addresses challenges of motion realism, identity consistency, and emotional alignment. Innovations include a character image injection module, an Audio Emotion Module for transferring emotion cues, and a Face-Aware Audio Adapter to isolate audio effects on faces,...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    SerenityOS

    SerenityOS

    The Serenity Operating System

    SerenityOS is an open source Unix-like operating system project with its own custom kernel, graphical user interface, system libraries, and userland tools. It combines a nostalgic “90s UI aesthetic” with modern system capabilities: a preemptive, multi-threaded kernel, own browsers, network stack, file systems, IPC, security features, and a suite of graphical / developer applications. The project is both a hobbyist OS and a polished engineering sandbox.
    Downloads: 33 This Week
    Last Update:
    See Project
  • 21
    HunyuanVideo-Foley

    HunyuanVideo-Foley

    Multimodal Diffusion with Representation Alignment

    HunyuanVideo-Foley is a multimodal diffusion model from Tencent Hunyuan for high-fidelity Foley (sound effects) audio generation synchronized to video scenes. It is designed to generate audio that matches both visual content and textual semantic cues, for use in video production, film, advertising, games, etc. The model architecture aligns audio, video, and text representations to produce realistic synchronized soundtracks. Produces high-quality 48 kHz audio output suitable for professional...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    AudioLM - Pytorch

    AudioLM - Pytorch

    Implementation of AudioLM audio generation model in Pytorch

    Implementation of AudioLM, a Language Modeling Approach to Audio Generation out of Google Research, in Pytorch It also extends the work for conditioning with classifier free guidance with T5. This allows for one to do text-to-audio or TTS, not offered in the paper. Yes, this means VALL-E can be trained from this repository. It is essentially the same. This repository now also contains a MIT licensed version of SoundStream. It is also compatible with EnCodec, however, be aware that it...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 23
    Remotion

    Remotion

    Make videos programmatically with React

    Remotion is a cutting-edge library that lets developers create real videos programmatically using React components, transforming familiar UI paradigms into a flexible, code-driven video production workflow. Instead of traditional timeline editors, Remotion leverages HTML, CSS, and JavaScript to define video frames, animations, and transitions, which means developers can use states, props, loops, and component hierarchies to automate complex motion graphics. Because it integrates with the...
    Downloads: 29 This Week
    Last Update:
    See Project
  • 24
    OpenCorePkg

    OpenCorePkg

    OpenCore bootloader

    OpenCorePkg is an open-source, modular UEFI (Unified Extensible Firmware Interface) bootloader and development framework, primarily designed to enable macOS booting on non-Apple hardware (Hackintosh). It includes Apple-specific UEFI drivers, utilities for macOS installation support, and shared libraries used across Acidanthera projects. Apple disk image loading support. Apple keyboard input aggregation. Apple PE image signature verification. Apple UEFI secure boot supplemental code. Audio...
    Downloads: 132 This Week
    Last Update:
    See Project
  • 25
    VibeVoice

    VibeVoice

    Open-source multi-speaker long-form text-to-speech model

    VibeVoice-1.5B is Microsoft’s frontier open-source text-to-speech (TTS) model designed for generating expressive, long-form, multi-speaker conversational audio such as podcasts. Unlike traditional TTS systems, it excels in scalability, speaker consistency, and natural turn-taking for up to 90 minutes of continuous speech with as many as four distinct speakers. A key innovation is its use of continuous acoustic and semantic speech tokenizers operating at an ultra-low frame rate of 7.5 Hz,...
    Downloads: 3 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next