Di♪♪Rhythm: Blazingly Fast & Simple End-to-End Song Generation
Open source implementation of Microsoft's VALL-E X zero-shot TTS model
Unofficial Parallel WaveGAN
The PyTorch-based audio source separation toolkit for researchers
A webui for different audio related Neural Networks
Code for the paper Hybrid Spectrogram and Waveform Source Separation
Implementation of MusicLM music generation model in PyTorch
Multimodal AI Story Teller, built with Stable Diffusion, GPT, etc.
User-friendly library to find similar objects
Audio generation using diffusion models, in PyTorch
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)
No-code tool for creating a neural search solution in minutes
A walk along memory lane
Implementation of NÜWA, attention network for text-to-video synthesis
Audio generation using diffusion models
WaveRNN Vocoder + TTS
Data augmentation for NLP
A version of the AI art creation software, based on Disco Diffusion
Implementation of NWT, audio-to-video generation, in PyTorch
Task of transcribing piano recordings into MIDI files
Separate audio recordings into individual sources
General Speech Restoration
Clone a voice in 5 seconds to generate arbitrary speech in real-time
Open source embedded speech-to-text engine
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020)