Python inference and LoRA trainer package for the LTX-2 audio–video
PS2 Covers Collection
Official Repo For "Sa2VA: Marrying SAM2 with LLaVA
A full-featured, hackable tiling window manager written in Python
Let's make video diffusion practical
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
No-code in the front, Python in the back. An open-source framework
Master the fundamentals of machine learning, deep learning
Open-source evaluation toolkit of large multi-modality models (LMMs)
Full-stack AI Red Teaming platform
The most powerful Android RPA agent framework
Official implementation of Watermark Anything with Localized Messages
Label Studio is a multi-type data labeling and annotation tool
Azure command-line interface
An open phone agent model & framework
Agent S: an open agentic framework that uses computers like a human
3D Engine with Blender Integration
Create beautiful slides on the web using Claude's frontend skills
AI tool that converts GitHub repositories into interactive diagrams
Extension of Google Research’s PaperBanana
Multimodal Agents as Smartphone Users, an LLM-based multimodal agent
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
Agent Skill for generating 2D sprite sheets and map, transparent PNG
Open-Source Python3 tool for recognizing layouts, tables, and math
A frontier, first-principles handbook