DeepSeek Coder: Let the Code Write Itself
CLIP: Predict the most relevant text snippet given an image
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
High-Resolution Image Synthesis with Latent Diffusion Models
Open-source large language model family from Tencent Hunyuan
An Efficient Agentic Model for Computer Use
Phi-3.5 for Mac: Locally-run Vision and Language Models
Tiny vision language model
Generating Immersive, Explorable, and Interactive 3D Worlds
Tongyi Deep Research, the Leading Open-source Deep Research Agent
Official implementation of DreamCraft3D
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model
Tooling for the Common Objects In 3D dataset
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
GLM-4 series: Open Multilingual Multimodal Chat LMs
Open-weight, large-scale hybrid-attention reasoning model
FAIR Sequence Modeling Toolkit 2
A Production-ready Reinforcement Learning AI Agent Library
Official DeiT repository
Diffusion Transformer with Fine-Grained Chinese Understanding
Large-language-model & vision-language-model based on Linear Attention
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training
FlashMLA: Efficient Multi-head Latent Attention Kernels
Example Discord bot written in Python that uses the completions API
Towards Ultimate Expert Specialization in Mixture-of-Experts Language