Robust Speech Recognition via Large-Scale Weak Supervision
Provides code for running inference with the Segment Anything Model
A Family of Open Foundation Models for Code Intelligence
A standalone, portable generic Ada package for decoding images
Accurate × Fast × Comprehensive
An HEVC/H.265 Web Player
Industrial-level controllable zero-shot text-to-speech system
A Foundation Model for the Language of Financial Markets
End-to-end speech processing toolkit
Boilerplate-free Kotlin config library for loading configuration files
An incredibly fast, pure Elixir JSON library
Pretrained time-series foundation model developed by Google Research
Multimodal model achieving SOTA performance
TorchMultimodal is a PyTorch library
AV1 Image File Format Specification - ISO-BMFF/HEIF derivative
Audio codecs extracted from Android Open Source Project
A Conversational Speech Generation Model
DeepSeek LLM: Let there be answers
Blazing fast and correct x86/x64 disassembler, assembler, decoder, etc.
An Ada 2012 library for reading and writing PNG image files
Transformer-related optimization, including BERT, GPT
A High Performance Library for Sequence Processing and Generation
Singing Voice Synthesis via Shallow Diffusion Mechanism