Code release for the ConvNeXt V2 model
Learning to Act by Watching Unlabeled Online Videos
Code release for "Masked-attention Mask Transformer
PyTorch implementation of MAE
The official PyTorch implementation of our paper
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020)
Generate embeddings from large-scale graph-structured data
A library for Multilingual Unsupervised or Supervised word Embeddings
Code for reproducing key results in the paper
Code for "Image Generation from Scene Graphs", Johnson et al, CVPR 201
Dia-1.6B generates lifelike English dialogue and vocal expressions
JetBrains’ 4B parameter code model for completions
OpenAI’s compact 20B open model for fast, agentic, and local use
Tencent’s 36-language state-of-the-art translation model
CTC-based forced aligner for audio-text in 158 languages
Vision-language-action model for robot control via images and text