A Unified Framework for Text-to-3D and Image-to-3D Generation
A Powerful Native Multimodal Model for Image Generation
Node.js example app from the OpenAI API quickstart tutorial
Official implementation of DreamCraft3D
The no-nonsense RAG chunking library
Discover pretrained models for deep learning in MATLAB
Recovering the Visual Space from Any Views
Contexts Optical Compression
A Unified Framework for Image Customization
Official repository for LTX-Video
Reference PyTorch implementation and models for DINOv3
Document Image Parsing via Heterogeneous Anchor Prompting
A Customizable Image-to-Video Model based on HunyuanVideo
Multimodal model achieving SOTA performance
Flexible Photo Recrafting While Preserving Your Identity
Multimodal-Driven Architecture for Customized Video Generation
Diffusion Transformer with Fine-Grained Chinese Understanding
This repo contains the code for a 1D tokenizer and generator
LTX-Video Support for ComfyUI
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Openclaw as your girlfriend
RGBD video generation model conditioned on camera input
A Systematic Framework for Interactive World Modeling
Sharp Monocular Metric Depth in Less Than a Second
Virtual AI anchor that combines state-of-the-art technology