ComfyUI wrapper nodes for HunyuanVideo
A new kind of Progress Bar, with real-time throughput, ETA
A beautiful, powerful, self-hosted ROM manager and player
24 Lessons, 12 Weeks, Get Started as a Web Developer
Jupyter magics and kernels for working with remote Spark clusters
Create HTML profiling reports from pandas DataFrame objects
Multilingual Document Layout Parsing in a Single Vision-Language Model
An on-premises, OCR-free unstructured data extraction
Handwritten Text Recognition (HTR) system implemented with TensorFlow
RAG-Anything: All-in-One RAG Framework
Marrying Grounding DINO with Segment Anything & Stable Diffusion
Motion-controllable Video Generation via Latent Trajectory Guidance
Multimodal embedding and reranking models built on Qwen3-VL
"Big Model" trains a visual multimodal VLM with 26M parameters
PaddlePaddle End-to-End Development Toolkit
Modular quant framework
A theme for Sublime Text 3 by Mattia Astorino
Cross-platform API testing client for humans
OCR expert VLM powered by Hunyuan's native multimodal architecture
Open multimodal web agent built by Ai2
Learning agent trained in a diffusion world model
Fast, powerful, git-native ticket tracking in a single bash script
An AI-powered data science team of agents
A Python library for extracting structured information
TorchMultimodal is a PyTorch library