Large Audio Language Model built for natural interactions
95% token savings. 155x faster queries. 16 languages.
Chinese XLNet pre-trained model
Framework for building neural networks
StreamSpeech is a seamless model for offline speech recognition
Advanced techniques for RAG systems
Fast and Universal 3D reconstruction model for versatile tasks
A secure sandbox environment for malware developers and red teamers
A Model Context Protocol server for searching and analyzing arXiv
4M: Massively Multimodal Masked Modeling
This repository contains the official implementation of FastVLM
Refer and Ground Anything Anywhere at Any Granularity
Set of tools to assess and improve LLM security
Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass
MobileLLM: Optimizing Sub-billion Parameter Language Models
A Production-ready Reinforcement Learning AI Agent Library
A PyTorch library for implementing flow matching algorithms
An implementation of a deep learning recommendation model (DLRM)
Self-supervised visual learning using momentum contrast in PyTorch
ImageBind: One Embedding Space to Bind Them All
[CVPR 2025 Best Paper Award] VGGT
Memory-efficient and performant finetuning of Mistral's models
Diffusion Transformer with Fine-Grained Chinese Understanding
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model
The Memory layer for AI Agents