The most powerful Android RPA agent framework
A fast, powerful, and simple hierarchical vision transformer
Official Python implementation of UTCP. UTCP is an open standard
Superduper: Integrate AI models and machine learning workflows
Implementation of Make-A-Video, new SOTA text-to-video generator
Transformers4Rec is a flexible and efficient library
Phi-3.5 for Mac: Locally-run Vision and Language Models
Official inference library for Mistral models
DeepSeek Coder: Let the Code Write Itself
Sharp Monocular Metric Depth in Less Than a Second
A dev-first open source autonomous AI agent framework
Python framework for building scalable multi-agent systems
Video understanding codebase from FAIR for reproducing video models
The NVIDIA AgentIQ toolkit is an open-source library
Meta Agents Research Environments is a comprehensive platform
Code for the paper "Language Models are Unsupervised Multitask Learners"
SwarmZero's SDK for building AI agents, swarms of agents and much more
Benchmarking Multimodal Agents for Open-Ended Tasks
An API standard for single-agent reinforcement learning environments
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
A solution to build and deploy MCP agents and applications
A Customizable Image-to-Video Model based on HunyuanVideo
Research code artifacts for Code World Model (CWM)
Multimodal-Driven Architecture for Customized Video Generation
⚡ Building applications with LLMs through composability ⚡