ChatGPT interface with better UI
Let's make video diffusion practical
Code for running inference with the SAM 3D Body model (3DB)
gpt-oss-120b and gpt-oss-20b are two open-weight language models
Official inference repo for FLUX.2 models
An experimental version of the DeepSeek model
Visual Causal Flow
Release for Improved Denoising Diffusion Probabilistic Models
PyTorch code and models for DINOv2 self-supervised learning
Models for object and human mesh reconstruction
CLIP: predict the most relevant text snippet given an image
Large Multimodal Models for Video Understanding and Editing
LTX-Video Support for ComfyUI
Diversity-driven optimization and large-model reasoning ability
Designed for text embedding and ranking tasks
A SOTA open-source image editing model
A Powerful Native Multimodal Model for Image Generation
Easy Docker setup for Stable Diffusion with user-friendly UI
Ling is a MoE LLM provided and open-sourced by InclusionAI
A Customizable Image-to-Video Model based on HunyuanVideo
4M: Massively Multimodal Masked Modeling
Accurate × Fast × Comprehensive
Recovering the Visual Space from Any Views
Block Diffusion for Ultra-Fast Speculative Decoding
Z80-μLM is a 2-bit quantized language model