Multimodal embedding and reranking models built on Qwen3-VL
Z80-μLM is a 2-bit quantized language model
Implementation of "MobileCLIP" CVPR 2024
VMZ: Model Zoo for Video Modeling
Official implementation of Watermark Anything with Localized Messages
High-resolution models for human tasks
Video understanding codebase from FAIR for reproducing video models
CLIP, Predict the most relevant text snippet given an image
Ling is a MoE LLM provided and open-sourced by InclusionAI
A Unified Framework for Text-to-3D and Image-to-3D Generation
Personalize Any Characters with a Scalable Diffusion Transformer
Genome modeling and design across all domains of life
Achieving 3+ generation speedup on reasoning tasks
Ultra-Efficient LLMs on End Device
Pretrained time-series foundation model developed by Google Research
Ling-V2 is a MoE LLM provided and open-sourced by InclusionAI
Inference script for Oasis 500M
Generate Any 3D Scene in Seconds
Fast and Universal 3D reconstruction model for versatile tasks
This repository contains the official implementation of FastVLM
FAIR Sequence Modeling Toolkit 2
ICLR2024 Spotlight: curation/training code, metadata, distribution
A Production-ready Reinforcement Learning AI Agent Library
A PyTorch library for implementing flow matching algorithms
Hackable and optimized Transformers building blocks