Unified Multimodal Understanding and Generation Models
Official implementation of DreamCraft3D
AI assistant based on large models that can actively think and plan
RGBD video generation model conditioned on camera input
Project Lyra: Open Generative 3D World Models
A Universal Customization Method for Single and Multi Conditioning
Open source personal AI Assistant for Linux, Windows and Mac
Stable Virtual Camera: Generative View Synthesis with Diffusion Models
An extensive node suite that enables ComfyUI to process 3D inputs
LISA: Reasoning Segmentation via Large Language Model
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
Simplest working implementation of StyleGAN2
Jittor is a high-performance deep learning framework
Stable Diffusion with Core ML on Apple Silicon
Plug-and-play module that turns text-to-image models into animation generators
AI Suite for upscaling, interpolating & restoring images/videos
Multi-Voice and Prompt-Controlled TTS Engine
Implementation of DreamBooth
Fast ODE Solver for Diffusion Probabilistic Model Sampling
Generate 3D objects conditioned on text or images
Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis model
Alfred workflow using ChatGPT, DALL·E 2 and other models for chatting
Text-to-Image generation. The repo for NeurIPS 2021 paper
An open-source framework for training large multimodal models
Official repo for consistency models