Robust Speech Recognition via Large-Scale Weak Supervision
Provides code for running inference with the Segment Anything Model
TorchMultimodal is a PyTorch library
A Conversational Speech Generation Model
Code release for "Masked-attention Mask Transformer
PyTorch implementation of MAE
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020)
A general-purpose encoder-decoder framework for TensorFlow
toneDetect