Diversity-driven optimization and large-model reasoning ability
Generate Any 3D Scene in Seconds
An experimental version of the DeepSeek model
Collection of Gemma 3 variants that are trained for performance
Inference code for scalable emulation of protein equilibrium ensembles
Repo for SeedVR2 & SeedVR
Reproduction of Poetiq's record-breaking submission to the ARC-AGI-1
OCR expert VLM powered by Hunyuan's native multimodal architecture
PyTorch code and models for the DINOv2 self-supervised learning
A Systematic Framework for Interactive World Modeling
Ling is a MoE LLM provided and open-sourced by InclusionAI
CLIP: predict the most relevant text snippet given an image
Global weather forecasting model using graph neural networks and JAX
A Powerful Native Multimodal Model for Image Generation
4M: Massively Multimodal Masked Modeling
Large Multimodal Models for Video Understanding and Editing
GLM-4 series: Open Multilingual Multimodal Chat LMs
Official implementation of DreamCraft3D
The official PyTorch implementation of Google's Gemma models
Repo for the Qwen2-Audio chat & pretrained large audio language model
Towards Ultimate Expert Specialization in Mixture-of-Experts Language
High-Fidelity and Controllable Generation of Textured 3D Assets
The ChatGPT Retrieval Plugin lets you easily find personal documents
Inference script for Oasis 500M
Implementation of the Surya Foundation Model for Heliophysics