Implementation of "MobileCLIP" (CVPR 2024)
Official implementation of Watermark Anything with Localized Messages
Video understanding codebase from FAIR for reproducing video models
Open-weight, large-scale hybrid-attention reasoning model
Qwen3-Omni is a natively end-to-end, omni-modal LLM
Bidirectional token-classification model for identifiable info
Genome modeling and design across all domains of life
Ultra-Efficient LLMs on End Devices
Pretrained time-series foundation model developed by Google Research
General-purpose image editing model that delivers high-fidelity
Ling-V2 is a MoE LLM provided and open-sourced by InclusionAI
Generate Any 3D Scene in Seconds
Fast and Universal 3D reconstruction model for versatile tasks
4M: Massively Multimodal Masked Modeling
This repository contains the official implementation of FastVLM
Foundation Models for Time Series
A Production-ready Reinforcement Learning AI Agent Library
A PyTorch library for implementing flow matching algorithms
PyTorch code and models for the DINOv2 self-supervised learning
Memory-efficient and performant finetuning of Mistral's models
Official implementation of DreamCraft3D
tiktoken is a fast BPE tokeniser for use with OpenAI's models
Diffusion Transformer with Fine-Grained Chinese Understanding
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming