Diffusion Transformer with Fine-Grained Chinese Understanding
Contexts Optical Compression
Node.js example app from the OpenAI API quickstart tutorial
Discover pretrained models for deep learning in MATLAB
Multimodal model achieving SOTA performance
Official implementation of DreamCraft3D
Sharp Monocular Metric Depth in Less Than a Second
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Document Image Parsing via Heterogeneous Anchor Prompting
Python SDK for the Computer Use model Lux, developed by OpenAGI
Large-language-model & vision-language-model based on Linear Attention
ADAMS is a workflow engine for building complex knowledge workflows.
A Customizable Image-to-Video Model based on HunyuanVideo
Detect faces in an image
A Strong and Easy-to-use Single View 3D Hand+Body Pose Estimator
A Python module for hyperspectral image processing
Tools to train Image Operators automatically from a set of samples.
Edit the OCR text layer of DjVu documents in a web browser
Computer vision and image processing library for Qt.