Showing 1086 open source projects for "dvd-audio"

  • 1
    Filt8- v3.6 alerts for WSJT-X FT8

    Alert and filter QSO for WSJT-X FT8 FT4

    This version has a table view for systems that do not support large graphic displays, such as low-end laptops. Check out ft8mapper for better graphics and grid tracking. Filt8 v3.6 release. Requires Python 3.x (3.10 or higher on Mac). Scrollable maps for small screens. State data as of Feb 25, 2024. Map saves settings. Map features: filter by band, click to look up. Offline maps show station grid locations as you decode them in real time (Check out js8mapper project on ...
    Downloads: 1 This Week
  • 2
    VALL-E X

    Open source implementation of Microsoft's VALL-E X zero-shot TTS model

    ...It is capable of synthesizing speech in English, Chinese, and Japanese from text while mimicking the voice characteristics of a speaker given only a short 3–10 second prompt. The model attempts to match not just timbre, but also tone, pitch, emotion, and prosody of the reference audio, resulting in highly personalized output. VALL-E-X supports zero-shot cross-lingual synthesis, meaning a monolingual speaker’s voice can be used to speak other languages without additional training. It also preserves aspects of the acoustic environment, such as background noise or reverb, making the generated audio feel more like it came from the same setting as the prompt. ...
    Downloads: 1 This Week
  • 3
    DeepSearch5Plus

    Recursively search all files, text inside files, and bookmarks

    ...It allows you to rename a single file, or to change upper/lower case and join spaces in bulk mode, with multiple levels of undo and redo. You can also copy selected files to another location, either as single elements or together with their original parent folder, which is useful for audio and video files. You can launch the program associated with a file or open the file in its containing folder. For audio, video, and text files, you can configure which program to use for each type and which options to pass to it. For example, you could use Notepad++ and pass its "-n<number>" option to open a file at a certain line.
    Downloads: 2 This Week
  • 4
    vits_chinese

    Best practice TTS based on BERT and VITS

    ...By customizing or porting VITS for Chinese, this project aims to produce high-quality TTS outputs in a language that can be challenging due to tones, pronunciation variability, and prosody. The repository offers full training and inference pipelines: preprocessing, mel-spectrogram generation, training scripts, and audio synthesis. For users who don’t train their own models, the project provides pre-trained checkpoints (or instructions) and expects integration with a vocoder during speech synthesis.
    Downloads: 2 This Week
  • 5
    Parallel WaveGAN

    Unofficial Parallel WaveGAN

    Parallel WaveGAN is an unofficial PyTorch implementation of several state-of-the-art non-autoregressive neural vocoders, centered on Parallel WaveGAN but also including MelGAN, Multiband-MelGAN, HiFi-GAN, and StyleMelGAN. Its main goal is to provide a real-time neural vocoder that can turn mel spectrograms into high-quality speech audio efficiently. The repository is designed to work hand-in-hand with ESPnet-TTS and NVIDIA Tacotron2-style front ends, so you can build complete TTS or singing voice synthesis pipelines. It includes a large collection of “Kaldi-style” recipes for many datasets such as LJSpeech, LibriTTS, VCTK, JSUT, CMU Arctic, and multiple singing voice corpora in Japanese, Mandarin, Korean, and more. ...
    Downloads: 0 This Week
  • 6
    Asteroid

    The PyTorch-based audio source separation toolkit for researchers

    Asteroid is a PyTorch-based audio source separation toolkit for researchers that enables fast experimentation on common datasets. It comes with source code that supports a large range of datasets and architectures, and a set of recipes to reproduce some important papers. Building blocks are designed to be seamlessly plugged together.
    Downloads: 0 This Week
  • 7
    Mehrauli

    A surrealist narrative-driven walking simulator

    Mehrauli is a surrealist narrative-driven walking simulator set on a deserted, fictitious island of the same name. The player has stumbled upon the island. On the way, the player finds a trail of cassette tapes lying around that tell the story of a previous inhabitant on the island. He was a man who spent his childhood in the Mehrauli neighbourhood of South West Delhi. The man had stolen some of the neglected monuments from Mehrauli and brought them to the island, with the intention to turn...
    Downloads: 1 This Week
  • 8
    SoftVC VITS Singing Voice Conversion

    SoftVC VITS Singing Voice Conversion

    ...The project leverages neural network architectures derived from VITS and SoftVC research to achieve high-quality voice transformation. It is commonly used in creative audio workflows, especially in communities experimenting with synthetic singing and character voices. The repository includes training and inference pipelines that enable users to build and apply custom voice models. Overall, so-vits-svc serves as a specialized toolkit for neural singing voice conversion and audio synthesis research.
    Downloads: 1 This Week
  • 9
    Demucs

    Code for the paper Hybrid Spectrogram and Waveform Source Separation

    ...Demucs supports GPU-accelerated inference and can process multi-channel audio with chunked streaming for real-time or batch operation. It also provides training scripts and utilities to fine-tune on custom datasets, along with remixing and enhancement tools.
    Downloads: 111 This Week
  • 10
    Audio Webui

    A webui for different audio related Neural Networks

    Audio Webui is a Gradio-based web user interface that unifies a wide range of audio-related neural networks under a single, accessible front end. It is designed as an “all-in-one” environment where users can experiment with text-to-speech, voice cloning, generative music, and other neural audio models without writing boilerplate code.
    Downloads: 0 This Week
  • 11
    MahaKurawa.My.ID MP4 VA Extract

    MahaKurawa.My.ID MP4 VA Extract is a tool to extract mp4 file content

    MahaKurawa.My.ID MP4 VA Extract is a tool to extract the video and audio content of MP4 files. It can also extract MKV files and single SSA subtitle files. This software does not convert any video or audio from the MP4 file; it extracts the content as it is. The tool is made for that specific purpose. "MahaKurawa.My.ID MP4 VA Extract v.1.0.3.1" can be obtained for free at https://www.mahakurawa.my.id.
    Downloads: 0 This Week
  • 12
    Fansly Downloader

    Easy to use fansly.com content downloading tool

    Fansly Downloader is an app for bulk media downloading. It downloads photos, videos, audio, or any other media from Fansly. Instead of downloading each piece of media individually, you can download all of them, or just some, with a few clicks.
    Downloads: 20 This Week
  • 13
    Text to Waveform

    Create synth presets from words

    Convert words to waveforms you can load into a synthesizer oscillator to create synth presets. Have fun turning your name, your friends' names, your city name, your pet's name, your team's name into synth presets you can use to produce a track.
    Downloads: 0 This Week
  • 14
    DarkAudacity

    A customized version of Audacity

    A free sound editor, DarkAudacity is the well-known Audacity sound editor with a darker, more modern theme and a few small tweaks. The audio engine underneath is the same, powered by the same code. Like Audacity, it is completely free; it is not a cut-down trial version. You can record and play sounds, edit sounds, apply audio effects, and save what you create for ringtones, podcasts, and more. DarkAudacity is open source, free for you to download and use on your PC. ...
    Downloads: 65 This Week
  • 15
    MusicLM - Pytorch

    Implementation of MusicLM music generation model in Pytorch

    Implementation of MusicLM, Google's new SOTA model for music generation using attention networks, in Pytorch. They are basically using text-conditioned AudioLM, but surprisingly with the embeddings from a text-audio contrastive learned model named MuLan. MuLan is what will be built out in this repository, with AudioLM modified from the other repository to support the music generation needs here.
    Downloads: 0 This Week
  • 16
    StoryTeller

    Multimodal AI Story Teller, built with Stable Diffusion, GPT, etc.

    ...Given a prompt as an opening line of a story, GPT writes the rest of the plot; Stable Diffusion draws an image for each sentence; a TTS model narrates each line, resulting in a fully animated video of a short story, replete with audio and visuals. To develop locally, install dev dependencies and install pre-commit hooks. This will automatically trigger linting and code quality checks before each commit. The final video will be saved as /out/out.mp4, alongside other intermediate images, audio files, and subtitles. For more advanced use cases, you can also directly interface with Story Teller in Python code.
    Downloads: 1 This Week
  • 17
    find-similar

    User-friendly library to find similar objects

    The mission of the FindSimilar project is to provide a powerful and versatile open source library that empowers developers to efficiently find similar objects and perform comparisons across a variety of data types. Whether dealing with texts, images, audio, or more, our project aims to simplify the process of identifying similarities and enhancing decision-making. Links: GitHub repo (https://github.com/findsimilar/find-similar), demo project and tutorial (http://demo.findsimilar.org/), documentation (https://docs.findsimilar.org/).
    Downloads: 0 This Week
  • 18

    PMS for REGZA

    A DLNA-compliant UPnP Media Server

    PMS for REGZA is a DLNA-compliant media server. A fork of the well-known "PS3 Media Server", it aims especially to improve functionality on TOSHIBA REGZA TVs while preserving compatibility with other renderers. Home page: http://www32.atwiki.jp/pms_regza
    Downloads: 5 This Week
  • 19
    auto-subtitle

    Automatically generate and overlay subtitles for any video

    auto-subtitle is a Python-based command-line tool that automatically generates and overlays subtitles on video files using AI-driven speech recognition. It combines FFmpeg with OpenAI’s Whisper model to transcribe spoken audio into text and synchronize it with video playback. The tool processes video input, extracts audio, and produces subtitle files that can be either exported separately or burned directly into the final video output. It supports multiple transcription models with varying accuracy and performance, allowing users to balance speed and quality depending on their needs. ...
    Downloads: 3 This Week
  • 20
    MahaKurawa MP4V-A Extractor

    This software is a tool to extract video and audio file that contained

    This software is a tool to extract the video and audio content contained in an .mp4 file. It does not convert any video or audio from your .mp4 file; it extracts the content as it is. The tool is made for that specific purpose. "MahaKurawa MP4 V-A Extractor V.10" can be obtained for free at https://www.mahakurawa.my.id.
    Downloads: 0 This Week
  • 21
    Pyst consists of a set of interfaces and libraries to allow programming of Asterisk from Python. The library currently supports AGI, AMI, and the parsing of Asterisk configuration files. The library also includes debugging facilities for AGI. 2014-04-17: Moved version control to Git. To check out, see the "Code" tab. Note that the whole history, including ancient CVS, then some time in monotone, then Subversion, was united into one Git repository thanks to ESR's...
    Downloads: 3 This Week
  • 22
    audio-diffusion-pytorch

    Audio generation using diffusion models, in PyTorch

    A fully featured audio diffusion library for PyTorch. Includes models for unconditional audio generation, text-conditional audio generation, diffusion autoencoding, upsampling, and vocoding. The provided models are waveform-based; however, the U-Net (built using a-unet), DiffusionModel, diffusion method, and diffusion samplers are generic to any dimension and highly customizable to work on other formats.
    Downloads: 0 This Week
  • 23
    Video 2 Audio The Converter [I.S.A]

    Video 2 Audio The Converter [Improved.Simplified.Alternative]

    'Video 2 Audio: The Converter' is a desktop application developed using Python 3.6.8 and other add-on libraries. It converts video files into audio files and has two modes: 1) Single file: converts one video file into an audio file. 2) Multiple files: converts multiple video files from a folder (directory) into audio files.
    Downloads: 1 This Week
  • 24
    VALL-E

    PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)

    We introduce a language modeling approach for text to speech synthesis (TTS). Specifically, we train a neural codec language model (called VALL-E) using discrete codes derived from an off-the-shelf neural audio codec model, and regard TTS as a conditional language modeling task rather than continuous signal regression as in previous work. During the pre-training stage, we scale up the TTS training data to 60K hours of English speech which is hundreds of times larger than existing systems. VALL-E emerges in-context learning capabilities and can be used to synthesize high-quality personalized speech with only a 3-second enrolled recording of an unseen speaker as an acoustic prompt. ...
    Downloads: 1 This Week
  • 25
    PyExe - YT DL I.S.A

    PyExe - YT DL [Improved.Simplified.Alternative]

    'PyExe - YT DL' is a desktop application developed using Python 3.6.8 and other add-on libraries. It can download YouTube video and audio. 'PyExe - YT DL' has two parts: 1) Download Video: downloads a YouTube video (.mp4). 2) Download Audio: downloads the audio of a YouTube video (.mp3). Compatible only with Windows.
    Downloads: 0 This Week