Reproduction of Poetiq's record-breaking submission to the ARC-AGI-1
Unified Multimodal Understanding and Generation Models
Language modeling in a sentence representation space
Large Multimodal Models for Video Understanding and Editing
MiniMax-M2, a model built for Max coding & agentic workflows
Towards Real-World Vision-Language Understanding
Qwen2.5-Coder is the code version of Qwen2.5, the large language model
Pushing the Limits of Mathematical Reasoning in Open Language Models
The ChatGPT Retrieval Plugin lets you easily find personal documents
Open-source, high-performance Mixture-of-Experts large language model
Open Multilingual Multimodal Chat LMs
Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project
Towards Ultimate Expert Specialization in Mixture-of-Experts Language
Dataset of GPT-2 outputs for research in detection, biases, and more
GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)
Repo for external large-scale work
Official PyTorch Implementation of "Scalable Diffusion Models"
Implementation of model parallel autoregressive transformers on GPUs
A latent text-to-image diffusion model
Open-source code agent designed for Lean 4
LLM providing reasoning and conversational capabilities
Open language model developed by NVIDIA as part of the Nemotron-3 family
Model that fuses instruct, reasoning and agentic skills
JetBrains’ 4B-parameter code model for completions
Vision-language-action model for robot control via images and text