VGGSfM: Visual Geometry Grounded Deep Structure From Motion
An AI-powered security review GitHub Action using Claude
Inference script for Oasis 500M
FAIR Sequence Modeling Toolkit 2
A PyTorch library for implementing flow matching algorithms
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
OCR expert VLM powered by Hunyuan's native multimodal architecture
Chat & pretrained large audio language model proposed by Alibaba Cloud
A state-of-the-art open visual language model
Chinese and English multimodal conversational language model
Implementation of "MobileCLIP" (CVPR 2024)
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
Math-specific large language models from our Qwen2 series
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
GLM-4-Voice | End-to-End Chinese-English Conversational Model
Official implementation of DreamCraft3D
Open-source large language model family from Tencent Hunyuan
Inference code for scalable emulation of protein equilibrium ensembles
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training
The official PyTorch implementation of Google's Gemma models
DeepMind model for tracking arbitrary points across videos & robotics
Sharp Monocular Metric Depth in Less Than a Second
Code for Mesh R-CNN, ICCV 2019