Wan2.1: Open and Advanced Large-Scale Video Generative Model
Synthesizing and manipulating 2048x1024 images with conditional GANs
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
Wan2.2: Open and Advanced Large-Scale Video Generative Model
Modular AI image and video generation web UI with extensible tools
High-Resolution 3D Assets Generation with Large Scale Diffusion Models
A Powerful Native Multimodal Model for Image Generation
ImageBind: One Embedding Space to Bind Them All
Director, Screenwriter, Producer, and Video Generator All-in-One
HunyuanVideo: A Systematic Framework For Large Video Generation Model
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
Tooling for the Common Objects In 3D dataset
Text and image to video generation: CogVideoX and CogVideo
Generating Immersive, Explorable, and Interactive 3D Worlds
Capable of understanding text, audio, vision, video
Sharp Monocular Metric Depth in Less Than a Second
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
An unsupervised and free tool for image and video dataset analysis
Language modeling in a sentence representation space
RGBD video generation model conditioned on camera input
Easily compute clip embeddings and build a clip retrieval system
Tensor search for humans
A lightweight vision library for performing large object detection
Implementation of "MobileCLIP" (CVPR 2024)
A Pioneering Open-Source Alternative to GPT-4o