Collection of Gemma 3 variants that are trained for performance
Implementation of "MobileCLIP" CVPR 2024
Large Multimodal Models for Video Understanding and Editing
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Pretrained time-series foundation model developed by Google Research
Ling-V2 is a MoE LLM provided and open-sourced by InclusionAI
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
The ChatGPT Retrieval Plugin lets you easily find personal documents
FlashMLA: Efficient Multi-head Latent Attention Kernels
Encoder of greater-than-word length text trained on a variety of data
Open Multilingual Multimodal Chat LMs
Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project
Dataset of GPT-2 outputs for research in detection, biases, and more
Official repo for consistency models
Repo for external large-scale work
800,000 step-level correctness labels on LLM solutions to MATH problems
Learning to Act by Watching Unlabeled Online Videos
PyTorch implementation of MAE
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020)
Generate embeddings from large-scale graph-structured data
A library for Multilingual Unsupervised or Supervised word Embeddings
Dual LSTM Encoder for Dialog Response Generation
LLM providing reasoning and conversational capabilities
Open language model developed by NVIDIA as part of the Nemotron-3 family
CTC-based forced aligner for audio-text in 158 languages