Dataset of GPT-2 outputs for research in detection, biases, and more
A Conversational Speech Generation Model
Open Multilingual Multimodal Chat LMs
800,000 step-level correctness labels on LLM solutions to MATH problems
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)
Learning to Act by Watching Unlabeled Online Videos
PyTorch implementation of MAE
An implementation of model parallel GPT-2 and GPT-3-style models
A mix of GAN implementations including progressive growing
Learning Continuous Signed Distance Functions for Shape Representation
Generate embeddings from large-scale graph-structured data
A library for Multilingual Unsupervised or Supervised word Embeddings
Code for reproducing key results in the paper
Dual LSTM Encoder for Dialog Response Generation
JetBrains’ 4B parameter code model for completions
CTC-based forced aligner for audio-text in 158 languages
High-compute ultra-reasoning model surpassing GPT-5