Reference PyTorch implementation and models for DINOv3
2D and 3D face alignment library built using PyTorch
InvokeAI is a leading creative engine for Stable Diffusion models
An open source library for GPU-accelerated robot learning
Text and image to video generation: CogVideoX and CogVideo
High-Resolution Image Synthesis with Latent Diffusion Models
Nexa SDK is a comprehensive toolkit for supporting ONNX and GGML
Simplest working implementation of StyleGAN2
Effortless data labeling with AI support from Segment Anything
Diffusion Transformer with Fine-Grained Chinese Understanding
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
Open-Sora: Democratizing Efficient Video Production for All
A Unified Framework for Image Customization
Chinese and English multimodal conversational language model
Tensor search for humans
Implementation of 'lightweight' GAN, proposed in ICLR 2021
A set of Docker images for training and serving models in TensorFlow
"Big Model" trains a visual multimodal VLM with 26M parameters
Simplifies the local serving of AI models from any source
Implementation of Denoising Diffusion Probabilistic Model in PyTorch
MII makes low-latency and high-throughput inference possible
Capable of understanding text, audio, vision, video
Ready-to-run Docker images containing Jupyter applications
A Universal Customization Method for Single and Multi Conditioning
The data structure for multimodal data