Agent S: an open agentic framework that uses computers like a human
Master the fundamentals of machine learning, deep learning
Open-source evaluation toolkit of large multi-modality models (LMMs)
[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences
Automate native Android apps with AI using accessibility APIs
Gemma open-weight LLM library, from Google DeepMind
A Pioneering Open-Source Alternative to GPT-4o
3D plotting and mesh analysis through a streamlined interface
All-in-one AI productivity platform with agents, workflows, and IM
Browse the web directly from Cursor and similar tools
PDF to Markdown with vision models
Label Studio is a multi-type data labeling and annotation tool
Official Repo For Sa2VA: Marrying SAM2 with LLaVA
Open-source and free to self-host
Extension of Google Research’s PaperBanana
Multimodal Agents as Smartphone Users, an LLM-based multimodal agent
Doom-based AI research platform for reinforcement learning
GPT Image 2 prompt gallery, image prompt library, agentic skill
A frontier, first-principles handbook
Modular quant framework
3D Engine with Blender Integration
A beautiful, powerful, self-hosted ROM manager and player
Expressive Portrait Image Animation for Live Streaming
24 Lessons, 12 Weeks, Get Started as a Web Developer
Taming Stable Diffusion for Lip Sync