Medical imaging toolkit for deep learning
Autonomous Agents (LLMs) research papers. Updated Daily
Refer and Ground Anything Anywhere at Any Granularity
Qwen3-VL, the multimodal large language model series by Alibaba Cloud
Project Lyra: Open Generative 3D World Models
Unsupervised Learning for Image Registration
A high performance implementation of HDBSCAN clustering
Models for object and human mesh reconstruction
Visual Causal Flow
Code for Mesh R-CNN, ICCV 2019
A Systematic Framework for Interactive World Modeling
HeavyDB (formerly MapD/OmniSciDB)
Qwen2.5-VL is the multimodal large language model series
Video understanding codebase from FAIR for reproducing video models
Open-source 2D IDE for managing AI agents in native CLIs
Claw3D is an open source 3D engine built on OpenClaw
Build your own AI application system for free
Foundational Models for State-of-the-Art Speech and Text Translation
Unifying 3D Mesh Generation with Language Models
Gracefully face hCaptcha challenges with multimodal LLMs
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
The world's only naturally intelligent knowledge technology
Learning multi-scale deep model correcting over- and under-exposed
Let us control diffusion models
Navigation mesh generation and pathfinding toolkit for game AI systems