CogView4, CogView3-Plus and CogView3 (ECCV 2024)
Open-source large language model family from Tencent Hunyuan
Official implementation of Watermark Anything with Localized Messages
CLIP: Predict the most relevant text snippet given an image
The official PyTorch implementation of Google's Gemma models
AlphaFold 3 inference pipeline
Release for Improved Denoising Diffusion Probabilistic Models
Designed for text embedding and ranking tasks
Ring is a reasoning MoE LLM provided and open-sourced by InclusionAI
VMZ: Model Zoo for Video Modeling
Towards Real-World Vision-Language Understanding
CodeGeeX2: A More Powerful Multilingual Code Generation Model
Phi-3.5 for Mac: Locally-run Vision and Language Models
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
A Powerful Native Multimodal Model for Image Generation
GLM-4-Voice | End-to-End Chinese-English Conversational Model
Chat & pretrained large audio language model proposed by Alibaba Cloud
Repo of Qwen2-Audio chat & pretrained large audio language model
The Clay Foundation Model - An open source AI model and interface
Sharp Monocular Metric Depth in Less Than a Second
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
Math-specific large language models in our Qwen2 series
Implementation of the Surya Foundation Model for Heliophysics
Inference framework for 1-bit LLMs