GLM-4 series: Open Multilingual Multimodal Chat LMs
Repo for SeedVR2 & SeedVR
Multimodal-Driven Architecture for Customized Video Generation
A Multi-Modal World Model for Reconstructing, Generating, and Simulating
Pokee Deep Research Model Open Source Repo
Sharp Monocular Metric Depth in Less Than a Second
Provides convenient access to the Anthropic REST API from any Python 3
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
A Powerful Native Multimodal Model for Image Generation
RGBD video generation model conditioned on camera input
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
An Efficient Agentic Model for Computer Use
Phi-3.5 for Mac: Locally-run Vision and Language Models
Qwen3-ASR is an open-source series of ASR models
Fast-stable-diffusion + DreamBooth
A Pragmatic VLA Foundation Model
Block Diffusion for Ultra-Fast Speculative Decoding
Access to Anthropic's safety-first language model APIs
Multimodal Diffusion with Representation Alignment
Bidirectional token-classification model for identifiable info
Project Lyra: Open Generative 3D World Models
Pretrained time-series foundation model developed by Google Research
Fast, Sharp & Reliable Agentic Intelligence
GLM-Image: Auto-regressive for Dense-knowledge and High-fidelity Image