rLLM

Democratizing Reinforcement Learning for LLMs

rLLM is an open-source framework for post-training language agents with reinforcement learning, that is, using reinforcement signals to fine-tune or adapt large language models (LLMs) into customizable agents for real-world tasks. With rLLM, developers can define custom “agents” and “environments” and then train those agents through reinforcement learning workflows, potentially surpassing what plain fine-tuning or supervised learning can provide. The project is designed to...

Downloads: 0 This Week

Last Update: 2026-04-30

Habitat-Lab

A modular high-level library to train embodied AI agents

...Providing algorithms for single and multi-agent training (via imitation or reinforcement learning, or no learning at all as in SensePlanAct pipelines), as well as tools to benchmark their performance on the defined tasks using standard metrics.

Downloads: 0 This Week

Last Update: 2026-05-07

Agent S

Agent S: an open agentic framework that uses computers like a human

...Built to operate graphical user interfaces like a human, it allows AI agents to perceive screens, reason about tasks, and execute actions across macOS, Windows, and Linux systems. The latest version, Agent S3, surpasses human-level performance on the OSWorld benchmark, demonstrating state-of-the-art results in complex multi-step computer tasks. Agent S combines powerful foundation models (such as GPT-5) with grounding models like UI-TARS to translate visual inputs into precise executable actions. It supports flexible deployment via CLI, SDK, or cloud, and integrates with multiple model providers including OpenAI, Anthropic, Gemini, Azure, and Hugging Face endpoints. ...

Downloads: 1 This Week

Last Update: 2025-12-16

OWL

Optimized Workforce Learning for General Multi-Agent Assistance

OWL (Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation) is a framework designed to enhance multi-agent collaboration and improve task automation across various domains. By using dynamic agent interactions, OWL aims to streamline complex workflows, making AI collaboration more natural, efficient, and adaptable. It is built on...

1 Review

Downloads: 0 This Week

Last Update: 2025-03-13

Search Results for "benchmark"

Showing 4 open source projects for "benchmark"

rLLM

Habitat-Lab

Agent S

OWL
