Native and Compact Structured Latents for 3D Generation
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
Towards Real-World Vision-Language Understanding
A Unified Framework for Text-to-3D and Image-to-3D Generation
Multimodal-Driven Architecture for Customized Video Generation
Seamlessly extend your preferred base images to be Lambda compatible
Automatically find issues in image datasets
Implementation of a U-net complete with efficient attention
State-of-the-art diffusion models for image and audio generation
Offline inference engine for art, real-time voice conversations
Let's make video diffusion practical
Gracefully face hCaptcha challenges with multimodal LLMs
An open-source photo thumbnail service by globo.com
AI-powered code assistant for Vim. OpenAI and ChatGPT plugin for Vim
Official MiniMax Model Context Protocol (MCP) server
Reference PyTorch implementation and models for DINOv3
Official implementation of Watermark Anything with Localized Messages
Download and manage Bilibili Manga chapters with a GUI downloader
High-Resolution 3D Assets Generation with Large Scale Diffusion Models
AutoGluon: AutoML for Image, Text, and Tabular Data
Stable Diffusion with Core ML on Apple Silicon
Command-line program to download image galleries and collections
Python data, Leaflet.js maps
The electronic structure package for quantum computers
Fast image augmentation library and an easy-to-use wrapper