2^x Image Super-Resolution
High-Resolution 3D Assets Generation with Large Scale Diffusion Models
Official Python inference and LoRA trainer package
Official inference repo for FLUX.2 models
High-performance code intelligence MCP server
Synthesizing and manipulating 2048x1024 images with conditional GANs
PyTorch implementation of JiT
This repository contains the official implementation of FastVLM
Recovering the Visual Space from Any Views
Repo for SeedVR2 & SeedVR
OCRmyPDF adds an OCR text layer to scanned PDF files
Reverse engineering Gemini's SynthID detection
High-Resolution Image Synthesis with Latent Diffusion Models
Give Claude the ability to watch and understand videos
Generate high-definition short story videos with one click using AI
Qwen2.5-VL is a multimodal large language model series
Knowledge Graph Generation from Any Text
Wan2.1: Open and Advanced Large-Scale Video Generative Model
Stable Diffusion web UI
A Customizable Image-to-Video Model based on HunyuanVideo
Official repository for LTX-Video
Reference PyTorch implementation and models for DINOv3
Native and Compact Structured Latents for 3D Generation
GPT4V-level open-source multi-modal model based on Llama3-8B
PyTorch extensions for fast R&D prototyping and Kaggle farming