Declarative engine for generating AI-powered infographic visuals
Go package for computer vision using OpenCV 4 and beyond
Phi-3.5 for Mac: Locally-run Vision and Language Models
Vision AI browser agent for automation, testing, and extraction
Gemma open-weight LLM library, from Google DeepMind
Behavior tree AI for Godot Engine
Free online AI resume editor
From Addition, Subtraction, Multiplication, and Division to ML
OpenUI lets you describe UI using your imagination
A computer vision closed-loop learning platform
Open data: more than 50 financial datasets
Python package for AutoML on Tabular Data with Feature Engineering
Streaming markdown renderer for AI apps with smooth updates
Multilingual Document Layout Parsing in a Single Vision-Language Model
An on-premises, OCR-free unstructured data extraction
Handwritten Text Recognition (HTR) system implemented with TensorFlow
Repository containing notebooks of my posts on Medium
Based on the LangChain/LangGraph framework
A frontier, first-principles handbook
Marrying Grounding DINO with Segment Anything & Stable Diffusion
Motion-controllable Video Generation via Latent Trajectory Guidance
Multimodal embedding and reranking models built on Qwen3-VL
"Big Model" trains a visual multimodal VLM with 26M parameters
AI Product Design Agent
Claude MCP, MCP Servers, MCP Clients