Analyzing Hacker News discussions from a decade ago in hindsight
Marrying Grounding DINO with Segment Anything & Stable Diffusion
Ultimate meta-skill for generating best-in-class Claude Code skills
Motion-controllable Video Generation via Latent Trajectory Guidance
A tool to use the Ai2 Open Coding Agents Soft-Verified Agents
Persistent context and multi-instance coordination
Multimodal embedding and reranking models built on Qwen3-VL
A New Axis of Sparsity for Large Language Models
A pretty sweet vulnerability scanner
Collection of reference environments for offline reinforcement learning
Simple and easily configurable grid world environments
Spanish-language course repository that teaches fundamentals of SQL
Implementation of "MobileCLIP" (CVPR 2024)
Code release for Cut and Learn for Unsupervised Object Detection
CoreNet: A library for training deep neural networks
High-resolution models for human tasks
CLIP: predict the most relevant text snippet given an image
12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all
A Unified Framework for Text-to-3D and Image-to-3D Generation
Multimodal-Driven Architecture for Customized Video Generation
Multimodal Diffusion with Representation Alignment
Powerful and highly extensible command-line based document
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Large Multimodal Models for Video Understanding and Editing
Google AI Studio Starter Apps