A Unified Framework for Text-to-3D and Image-to-3D Generation
Implementation of "MobileCLIP" CVPR 2024
Video understanding codebase from FAIR for reproducing video models
PyTorch code and models for the DINOv2 self-supervised learning method
An AI-powered security review GitHub Action using Claude
DeepSeek Coder: Let the Code Write Itself
Implementation of the Surya Foundation Model for Heliophysics
Production-tested AI infrastructure tools
New set of lightweight state-of-the-art, open foundation models
ICLR 2024 Spotlight: curation/training code, metadata, distribution
DeepSeek LLM: Let there be answers
A Customizable Image-to-Video Model based on HunyuanVideo
Open-source large language model family from Tencent Hunyuan
Multimodal-Driven Architecture for Customized Video Generation
Multimodal Diffusion with Representation Alignment
A Family of Open Foundation Models for Code Intelligence
High-resolution models for human tasks
Stable Diffusion WebUI Forge is a platform on top of Stable Diffusion
Official code for Style Aligned Image Generation via Shared Attention
The official PyTorch implementation of Google's Gemma models
4M: Massively Multimodal Masked Modeling
Sharp Monocular Metric Depth in Less Than a Second
This repository contains the official implementation of FastVLM
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
Language modeling in a sentence representation space