Robust Speech Recognition via Large-Scale Weak Supervision
Accurate × Fast × Comprehensive
Industrial-level controllable zero-shot text-to-speech system
Extremely fast compression algorithm
Swift async text-to-image for SwiftUI apps using OpenAI
Multimodal model achieving SOTA performance
End-to-end speech processing toolkit
LLM training code for MosaicML foundation models
Open-source industrial-grade ASR models
TorchMultimodal is a PyTorch library
Create C structures from USB HID Report Descriptors
A Conversational Speech Generation Model
110+ developer tools as native macOS, Linux & Windows desktop apps.
Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis
Basaran, an open-source alternative to the OpenAI text completion API
A Swiss Army knife for developers
Implementation of NÜWA, an attention network for text-to-video synthesis
Text-conditional image generation model based on OpenAI's unCLIP
CPT: A Pre-Trained Unbalanced Transformer
Singing Voice Synthesis via Shallow Diffusion Mechanism
in progress...
File splitter and rejoiner
ALIbaba's Collection of Encoder-decoders from MinD
Multipurpose Encoder/decoder
Toolkit for Machine Learning, Natural Language Processing