Multimodal embedding and reranking models built on Qwen3-VL
Z80-μLM is a 2-bit quantized language model
Collection of Gemma 3 variants that are trained for performance
VMZ: Model Zoo for Video Modeling
Official implementation of Watermark Anything with Localized Messages
High-resolution models for human tasks
Tool for exploring and debugging transformer model behaviors
CLIP: Predict the most relevant text snippet given an image
Ling is a MoE LLM provided and open-sourced by InclusionAI
A Unified Framework for Text-to-3D and Image-to-3D Generation
Personalize Any Characters with a Scalable Diffusion Transformer
Pretrained time-series foundation model developed by Google Research
Ling-V2 is a MoE LLM provided and open-sourced by InclusionAI
Inference script for Oasis 500M
Generate Any 3D Scene in Seconds
Fast and Universal 3D reconstruction model for versatile tasks
4M: Massively Multimodal Masked Modeling
This repository contains the official implementation of FastVLM
Foundation Models for Time Series
FAIR Sequence Modeling Toolkit 2
ICLR 2024 Spotlight: curation/training code, metadata, distribution
A Production-ready Reinforcement Learning AI Agent Library
A PyTorch library for implementing flow matching algorithms
Hackable and optimized Transformers building blocks
GLM-4-Voice | End-to-End Chinese-English Conversational Model