NVIDIA Isaac GR00T N1.5 is the world's first open foundation model
Stable Diffusion WebUI Forge is a platform on top of Stable Diffusion
DeepMind model for tracking arbitrary points across videos & robotics
Tooling for the Common Objects In 3D dataset
Code for Mesh R-CNN, ICCV 2019
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
Language modeling in a sentence representation space
GPT4V-level open-source multi-modal model based on Llama3-8B
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
Dataset of GPT-2 outputs for research in detection, biases, and more
The ChatGPT Retrieval Plugin lets you easily find personal documents
Designed for text embedding and ranking tasks
Implementation of the Surya Foundation Model for Heliophysics
Inference framework for 1-bit LLMs
Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project
Chinese and English multimodal conversational language model
GLM-4 series: Open Multilingual Multimodal Chat LMs
Code for the paper Hybrid Spectrogram and Waveform Source Separation
High-Resolution Image Synthesis with Latent Diffusion Models
Let us control diffusion models
Di♪♪Rhythm: Blazingly Fast & Simple End-to-End Song Generation
Suite with Real-ESRGAN, BSRGAN, IRCNN, GFPGAN & RIFE. v4.3
A Conversational Speech Generation Model
Open-Source Financial Large Language Models!
Qwen2.5-Coder is the code version of Qwen2.5, the large language model