Tencent Hunyuan multimodal diffusion transformer (MM-DiT) model
Open-Sora: Democratizing Efficient Video Production for All
A Unified Framework for Image Customization
Chinese and English multimodal conversational language model
Tensor search for humans
Implementation of 'lightweight' GAN, proposed at ICLR 2021
A set of Docker images for training and serving models in TensorFlow
"Big Model": trains a 26M-parameter vision-language model (VLM)
Simplifies the local serving of AI models from any source
Implementation of Denoising Diffusion Probabilistic Model in PyTorch
MII makes low-latency and high-throughput inference possible
Capable of understanding text, audio, vision, video
A Universal Customization Method for Single and Multi Conditioning
The data structure for multimodal data
Advancing Open-source World Models
Easy Docker setup for Stable Diffusion with user-friendly UI
Geometric deep learning extension library for PyTorch
RGBD video generation model conditioned on camera input
A Python library for self-supervised learning on images
Gemma open-weight LLM library, from Google DeepMind
YOLOv5 is the world's most loved vision AI
A state-of-the-art open visual language model
A Customizable Image-to-Video Model based on HunyuanVideo
AI Suite for upscaling, interpolating & restoring images/videos
Run GGUF models easily with a UI or API. One File. Zero Install.