Fast-stable-diffusion + DreamBooth
Ultimate meta-skill for generating best-in-class Claude Code skills
Motion-controllable Video Generation via Latent Trajectory Guidance
Hunyuan Translation Model Version 1.5
Persistent context and multi-instance coordination
Block Diffusion for Ultra-Fast Speculative Decoding
Multimodal embedding and reranking models built on Qwen3-VL
"Big Model" trains a multimodal vision-language model (VLM) with 26M parameters
Automatically translates the text of a video based on a subtitle file
Fast and accurate AI-powered file content type detection
Implementation of "MobileCLIP" (CVPR 2024)
Official implementation of Watermark Anything with Localized Messages
Video understanding codebase from FAIR for reproducing video models
CLIP: predicts the most relevant text snippet given an image
Ling is a MoE LLM provided and open-sourced by InclusionAI
Conditional GAN for generating synthetic tabular data
Operating LLMs in production
TokenSpeed is a speed-of-light LLM inference engine
MCP server that integrates Confluence and Jira
Benchmarking Multimodal Agents for Open-Ended Tasks
OpenDILab Decision AI Engine
Data and tools for generating and inspecting OLMo pre-training data
Efficient Retrieval Augmentation and Generation Framework
A general fine-tuning kit geared toward image/video/audio diffusion
ETL framework for indexing data for AI use cases such as RAG