Generate Any 3D Scene in Seconds
LLM-based reinforcement learning audio editing model
An Efficient Agentic Model for Computer Use
New family of code large language models (LLMs)
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
Tiny vision language model
Inference code for scalable emulation of protein equilibrium ensembles
Chinese and English multimodal conversational language model
26M function-calling model that runs on incredibly small devices
Fast-stable-diffusion + DreamBooth
Multimodal embedding and reranking models built on Qwen3-VL
Official implementation of Watermark Anything with Localized Messages
High-resolution models for human tasks
Tool for exploring and debugging transformer model behaviors
CLIP: predict the most relevant text snippet given an image
GLM-4.6V/4.5V/4.1V-Thinking: Towards Versatile Multimodal Reasoning
Project Lyra: Open Generative 3D World Models
Pretrained time-series foundation model developed by Google Research
Inference script for Oasis 500M
Fast and Universal 3D reconstruction model for versatile tasks
4M: Massively Multimodal Masked Modeling
Official implementation of FastVLM
A Production-ready Reinforcement Learning AI Agent Library
A PyTorch library for implementing flow matching algorithms
Memory-efficient and performant finetuning of Mistral's models