Robust Speech Recognition via Large-Scale Weak Supervision
Provides code for running inference with the Segment Anything Model
A Family of Open Foundation Models for Code Intelligence
Accurate × Fast × Comprehensive
Industrial-level controllable zero-shot text-to-speech system
End-to-end speech processing toolkit
Pretrained time-series foundation model developed by Google Research
Multimodal model achieving SOTA performance
A Conversational Speech Generation Model
DeepSeek LLM: Let there be answers
Transformer-related optimization, including BERT and GPT
A High Performance Library for Sequence Processing and Generation
Singing Voice Synthesis via Shallow Diffusion Mechanism
Code release for "Masked-attention Mask Transformer"
PyTorch implementation of MAE
Facebook AI Research Sequence-to-Sequence Toolkit
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020)
Adversarial Latent Autoencoders
Toolkit for efficient experimentation with Speech Recognition
Flexible text-to-text transformer model for multilingual NLP tasks
Summarization model fine-tuned on CNN/DailyMail articles