Models for object and human mesh reconstruction
Open-source deep-learning framework
Video Object and Interaction Deletion
Recovering the Visual Space from Any Views
Diversity-driven optimization and large-model reasoning ability
PyTorch code and models for the DINOv2 self-supervised learning method
tiktoken is a fast BPE tokeniser for use with OpenAI's models
Designed for text embedding and ranking tasks
Large Multimodal Models for Video Understanding and Editing
Fast-stable-diffusion + DreamBooth
Video understanding codebase from FAIR for reproducing video models
CLIP: predict the most relevant text snippet given an image
Stable Virtual Camera: Generative View Synthesis with Diffusion Models
Ultra-Efficient LLMs on End Devices
HY-Motion model for 3D character animation generation
A PyTorch library for implementing flow matching algorithms
Official implementation of DreamCraft3D
Tiny vision language model
The official PyTorch implementation of Google's Gemma models
Inference code for scalable emulation of protein equilibrium ensembles
Programmatic access to the AlphaGenome model
Sharp Monocular Metric Depth in Less Than a Second
gpt-oss-120b and gpt-oss-20b are two open-weight language models
Repository for the Qwen2-Audio chat and pretrained large audio language models
Large language model and vision-language model based on linear attention