File and Image Management Application for Django
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
State-of-the-art diffusion models for image and audio generation
A Python tool for downloading manga from Toonily
Multimodal-Driven Architecture for Customized Video Generation
Reference PyTorch implementation and models for DINOv3
GPT4V-level open-source multi-modal model based on Llama3-8B
High-Resolution 3D Assets Generation with Large Scale Diffusion Models
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
An open-source photo thumbnail service by globo.com
Official Python inference and LoRA trainer package
Diffusion Transformer with Fine-Grained Chinese Understanding
Let's make video diffusion practical
Chinese and English multimodal conversational language model
Code for running inference with the SAM 3D Body Model 3DB
Train machine learning models within Docker containers
Blender add-ons that bridge Blender and geographic data
Generating Immersive, Explorable, and Interactive 3D Worlds
An unsupervised and free tool for image and video dataset analysis
21 Lessons, Get Started Building with Generative AI
Towards Real-World Vision-Language Understanding
A Multi-Modal World Model for Reconstructing, Generating, and Simulating
Automatically find issues in image datasets
Implementation of Imagen, Google's Text-to-Image Neural Network
Fast image augmentation library and an easy-to-use wrapper