Models for object and human mesh reconstruction
Tooling for the Common Objects In 3D dataset
Large Multimodal Models for Video Understanding and Editing
Code for running inference and finetuning with the SAM 3 model
Uncommon Objects in 3D dataset
Code for running inference with the SAM 3D Body model (3DB)
Provides convenient access to the Anthropic REST API from any Python 3
Official implementation of Watermark Anything with Localized Messages
Chat & pretrained large vision language model
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
Qwen2.5-VL is the multimodal large language model series
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
Code for Mesh R-CNN, ICCV 2019
A SOTA open-source image editing model
Official code for Style Aligned Image Generation via Shared Attention
Code for "Image Generation from Scene Graphs", Johnson et al, CVPR 201