Multimodal embedding and reranking models built on Qwen3-VL
"Big Model" trains a visual multimodal VLM with 26M parameters
Implementation of "MobileCLIP" (CVPR 2024)
Official implementation of Watermark Anything with Localized Messages
Video understanding codebase from FAIR for reproducing video models
CLIP: predict the most relevant text snippet given an image
Ling is a MoE LLM provided and open-sourced by InclusionAI
Pymunk is an easy-to-use pythonic 2D physics library
Conditional GAN for generating synthetic tabular data
Multi-Joint dynamics with Contact. A general purpose physics simulator
Ct.js is a desktop game engine that makes learning programming fun
PS2 Covers Collection
Automatically translates the text of a video based on a subtitle file
MCP server that integrates Confluence and Jira
OpenDILab Decision AI Engine
Data and tools for generating and inspecting OLMo pre-training data
Efficient Retrieval Augmentation and Generation Framework
A fresh & lightweight JavaScript game engine
Welcome to the Bot Framework SDK for JavaScript repository
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
GenAI Processors is a lightweight Python library
The Ultimate Collection of 700+ Agentic Skills for Claude Code
Convert codebases into structured prompts optimized for LLM analysis
A Multi-Modal World Model for Reconstruction, Generation, and Simulation
Stable Diffusion WebUI Forge is a platform on top of Stable Diffusion