PyTorch code and models for the DINOv2 self-supervised learning
Block Diffusion for Ultra-Fast Speculative Decoding
Tongyi Deep Research, the Leading Open-source Deep Research Agent
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
Open-source large language model family from Tencent Hunyuan
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model
CLIP: Predict the most relevant text snippet given an image
An Efficient Agentic Model for Computer Use
Phi-3.5 for Mac: Locally-run Vision and Language Models
Generating Immersive, Explorable, and Interactive 3D Worlds
Tiny vision language model
Official implementation of DreamCraft3D
Tooling for the Common Objects In 3D dataset
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
GLM-4 series: Open Multilingual Multimodal Chat LMs
Open-weight, large-scale hybrid-attention reasoning model
FAIR Sequence Modeling Toolkit 2
A Production-ready Reinforcement Learning AI Agent Library
Official DeiT repository
Diffusion Transformer with Fine-Grained Chinese Understanding
Large language model and vision-language model based on Linear Attention
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training
FlashMLA: Efficient Multi-head Latent Attention Kernels
Example Discord bot written in Python that uses the completions API
Towards Ultimate Expert Specialization in Mixture-of-Experts Language