Weaving the Digital Agent Galaxy
Agent-ready RPA suite with a visual workflow automation engine
A Python visual Flow Based Programming library
Detects phishing and lookalike domains using DNS fuzzing techniques
Generate audiobooks from e-books
A Pioneering Open-Source Alternative to GPT-4o
Generating Immersive, Explorable, and Interactive 3D Worlds
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Multimodal Diffusion with Representation Alignment
Driving with Graph Visual Question Answering
Autoregressive Model Beats Diffusion
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
This repository contains the official implementation of FastVLM
Refer and Ground Anything Anywhere at Any Granularity
Self-supervised visual learning using momentum contrast in PyTorch
Wan2.1: Open and Advanced Large-Scale Video Generative Model
Book 5: Statistics Made Simple
GPT Image 2 prompt gallery, image prompt library, agentic skill
Windrecorder is a memory search app that records everything
Recovering the Visual Space from Any Views
Reference PyTorch implementation and models for DINOv3
The Iris Book: Addition, Subtraction, Multiplication, and Division
From Addition, Subtraction, Multiplication, and Division to ML
A neural network that transforms a design mock-up into a static website
SAPIEN Manipulation Skill Framework