4M: Massively Multimodal Masked Modeling
ICLR 2024 Spotlight: curation/training code, metadata, distribution
High-Fidelity and Controllable Generation of Textured 3D Assets
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
A SOTA open-source image editing model
Large language model & vision-language model based on Linear Attention
Code for the paper Hybrid Spectrogram and Waveform Source Separation
GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)
Repo for external large-scale work
A method to increase the speed and lower the memory footprint
Implementation of model parallel autoregressive transformers on GPUs
Code release for ConvNeXt V2 model
A minimal PyTorch re-implementation of the OpenAI GPT
PyTorch implementation of MAE
Per-Pixel Classification is Not All You Need for Semantic Segmentation
A mix of GAN implementations including progressive growing
Learning Continuous Signed Distance Functions for Shape Representation
A library for Multilingual Unsupervised or Supervised word Embeddings
High-compute ultra-reasoning model surpassing GPT-5
High-efficiency reasoning and agentic intelligence model
Tencent’s 36-language state-of-the-art translation model