Search Results for "reinforcement learning" - Page 3


Showing 172 open source projects for "reinforcement learning"

Active filters: Artificial Intelligence, Python

1

robosuite

A Modular Simulation Framework and Benchmark for Robot Learning

Robosuite is a modular and extensible simulation framework for robotic manipulation tasks, built on top of MuJoCo. Developed by the ARISE Initiative, Robosuite offers a set of standardized benchmarks and customizable environments designed to advance research in robotic manipulation, control, and imitation learning. It emphasizes realistic simulations and ease of use for both single-task and multi-task learning.

Downloads: 0 This Week

Last Update: 2025-12-23
See Project
2

verl-agent

Designed for training LLM/VLM agents via RL

verl-agent is an open-source reinforcement learning framework designed to train large language model agents and vision-language model agents for complex interactive environments. Built as an extension of the veRL reinforcement learning infrastructure, the project focuses on enabling scalable training for agents that perform multi-step reasoning and decision-making tasks.

Downloads: 0 This Week

Last Update: 2026-03-10
See Project
3

Humanoid-Gym

Reinforcement Learning for Humanoid Robot with Zero-Shot Sim2Real

Humanoid-Gym is a reinforcement learning framework designed to train locomotion and control policies for humanoid robots using high-performance simulation environments. The system is built on top of NVIDIA Isaac Gym, which allows large-scale parallel simulation of robotic environments directly on GPU hardware. Its primary goal is to train humanoid robots efficiently in simulation and have the resulting policies transfer to real-world hardware without additional training. ...

Downloads: 0 This Week

Last Update: 2026-03-15
See Project
4

Diffusion for World Modeling

Learning agent trained in a diffusion world model

Diffusion for World Modeling is an experimental reinforcement learning system that trains intelligent agents inside a simulated environment generated by a diffusion-based world model. The project introduces the idea of using diffusion models, commonly used for image generation, to simulate the dynamics of an environment and predict future states based on previous observations and actions.

Downloads: 0 This Week

Last Update: 2026-03-12
See Project
5

RLHF-Reward-Modeling

Recipes to train reward model for RLHF

RLHF-Reward-Modeling is an open-source research framework focused on training reward models used in reinforcement learning from human feedback for large language models. In RLHF pipelines, reward models are responsible for evaluating generated responses and assigning scores that guide the model toward outputs that better match human preferences. The repository provides training recipes and implementations for building reward and preference models using modern machine learning frameworks. ...

Downloads: 1 This Week

Last Update: 2026-03-06
See Project
6

CUDA Agent

Large-Scale Agentic RL for High-Performance CUDA Kernel Generation

CUDA Agent is a research-driven agentic reinforcement learning system designed to automatically generate and optimize high-performance CUDA kernels for GPU workloads. The project addresses the long-standing challenge that efficient CUDA programming typically requires deep hardware expertise by training an autonomous coding agent capable of iterative improvement through execution feedback.

Downloads: 1 This Week

Last Update: 2026-03-03
See Project
7

ReCall

Learning to Reason with Search for LLMs via Reinforcement Learning

...Instead of relying purely on static knowledge stored inside the model, ReCall allows the language model to dynamically decide when it should retrieve information or invoke external capabilities during the reasoning process. The framework uses reinforcement learning to train models to perform these tool calls effectively while solving multi-step reasoning tasks.

Downloads: 0 This Week

Last Update: 2026-03-09
See Project
8

Reco-papers

Classic papers and resources on recommendation

Reco-papers is a curated repository that collects influential research papers, technical resources, and industry materials related to recommender systems and recommendation algorithms. The project organizes a large body of literature into thematic sections such as classic recommender systems, exploration-exploitation strategies, deep learning–based recommendation models, and cold-start mitigation techniques. It serves as a reference library for researchers and engineers who want to explore foundational and cutting-edge work in recommendation technologies. The repository includes papers from academic institutions and industry organizations and groups them according to topics such as retrieval and reranking, reinforcement learning in recommendation, and feature engineering infrastructure. ...

Downloads: 0 This Week

Last Update: 2026-03-11
See Project
9

highway-env

A minimalist environment for decision-making in autonomous driving

HighwayEnv is an OpenAI Gym-compatible environment focused on autonomous driving scenarios. It provides flexible simulations for testing decision-making algorithms in highway, intersection, and merging traffic situations.

Downloads: 0 This Week

Last Update: 2025-10-18
See Project
10

Sapiens

High-resolution models for human tasks

...It includes simulation environments, datasets, and benchmarks for testing grounded understanding, imitation learning, and decision-making. The system’s modular pipeline supports both imitation-based and reinforcement-based training strategies, allowing flexible experimentation with different embodiments and tasks.

Downloads: 1 This Week

Last Update: 2025-10-07
See Project
11

Tongyi DeepResearch

Tongyi Deep Research, the Leading Open-source Deep Research Agent

...The model has about 30.5 billion parameters, of which only ~3.3B are active for any given token. It is trained with a mix of synthetic data generation, fine-tuning, and reinforcement learning; supports benchmarks covering web search, document understanding, question answering, and “agentic” tasks; and provides inference tools, evaluation scripts, and “web agent” style interfaces. The aim is to enable more autonomous, agentic models that can perform sustained knowledge gathering, reasoning, and synthesis across multiple modalities (web, files, etc.).

Downloads: 6 This Week

Last Update: 2026-02-27
See Project
12

R1-V

Witness the aha moment of VLM with less than $3

R1-V is an initiative aimed at enhancing the generalization capabilities of Vision-Language Models (VLMs) through Reinforcement Learning with Verifiable Rewards (RLVR). The project focuses on building a comprehensive framework that emphasizes algorithm enhancement, efficiency optimization, and task diversity to achieve general vision-language intelligence and visual/GUI agents. The team's long-term goal is to contribute impactful open-source research in this domain.

Downloads: 0 This Week

Last Update: 2025-03-19
See Project
13

MetaClaw

Just talk to your agent

MetaClaw is an AI or agent-oriented system that appears to focus on advanced control, coordination, or training of autonomous agents, potentially within reinforcement learning or tool-using environments. The project likely emphasizes meta-level reasoning, where agents are not only executing tasks but also adapting their strategies based on feedback and performance signals. It may incorporate mechanisms for learning from interactions, improving decision-making over time, and generalizing across different domains. ...

Downloads: 0 This Week

Last Update: 2026-04-11
See Project
14

PKU Beaver

Constrained Value Alignment via Safe Reinforcement Learning

PKU Beaver is an open-source research project focused on improving the safety alignment of large language models through reinforcement learning from human feedback under explicit safety constraints. The framework introduces techniques that separate helpfulness and harmlessness signals during training, allowing models to optimize for useful responses while minimizing harmful behavior. To support this process, the project provides datasets containing human-labeled examples that encode both performance preferences and safety constraints across multiple dimensions. ...

Downloads: 0 This Week

Last Update: 2026-03-06
See Project
15

Ring

Ring is a reasoning MoE LLM provided and open-sourced by InclusionAI

...Its design emphasizes reasoning, efficiency, and modular expert activation. In its “flash” variant (Ring-flash-2.0), it optimizes inference by activating only a subset of experts. It applies reinforcement learning and reasoning optimization techniques, and its architecture and training approach are tuned for efficient, capable reasoning. Features include a reasoning-optimized model with reinforcement learning enhancements and an efficient architecture and memory design for large-scale reasoning. If you are located in mainland China, the model is also available on ModelScope.cn to speed up the download process.

Downloads: 0 This Week

Last Update: 2025-09-30
See Project
16

Youtu-Agent

A simple yet powerful agent framework that delivers with models

...The framework supports automated generation of agent components, enabling the system to synthesize prompts, tool interfaces, and workflow configurations automatically. Youtu-Agent also incorporates hybrid learning strategies that combine experience accumulation with reinforcement learning to improve agent performance over time. These learning mechanisms allow agents to refine their reasoning, coding, and search capabilities as they interact with environments and tasks.

Downloads: 0 This Week

Last Update: 2026-03-10
See Project
17

LiteMultiAgent

The Library for LLM-based multi-agent applications

LiteMultiAgent is a lightweight and extensible multi-agent reinforcement learning (MARL) platform designed for rapid experimentation. It allows researchers to design and test coordination, competition, and collaboration scenarios in simulated environments.

Downloads: 0 This Week

Last Update: 2025-03-13
See Project
18

Minigrid

Simple and easily configurable grid world environments

Minigrid is a lightweight, minimalistic grid-world environment library for reinforcement learning (RL) research. It provides a suite of simple 2D grid-based tasks (e.g., navigating mazes, unlocking doors, carrying keys) where an agent moves in discrete steps and interacts with objects. The design emphasizes speed (agents can run thousands of steps per second), low dependency overhead, and high customizability — making it easy to define new maps, new tasks, or wrappers. ...

Downloads: 0 This Week

Last Update: 2025-11-25
See Project
19

PilottAI

Python framework for building scalable multi-agent systems

pilottai is an AI-based autonomous drone navigation system utilizing reinforcement learning for real-time decision-making. It is designed for simulating and training drones to fly safely through dynamic environments using AI-based controllers.

Downloads: 0 This Week

Last Update: 2025-12-01
See Project
20

AgentEvolver

Towards Efficient Self-Evolving Agent System

...The system focuses on improving the efficiency and scalability of training autonomous agents by allowing them to generate tasks, explore environments, and refine strategies without heavy reliance on manually curated datasets. Its architecture combines reinforcement learning with LLM-driven reasoning mechanisms to guide exploration and learning. The framework introduces several key mechanisms, including self-questioning to create new learning tasks, self-navigating to improve exploration through experience reuse, and self-attributing to assign rewards based on the usefulness of actions. ...

Downloads: 0 This Week

Last Update: 2026-03-28
See Project
21

Agent Behavior Monitoring

The open source post-building layer for agents

...Developers can use the framework to observe agent actions in both online production environments and offline evaluation settings, making it useful for debugging and performance analysis. Judgeval transforms agent interaction trajectories into structured evaluation datasets that can be used for reinforcement learning, supervised fine-tuning, or other forms of post-training improvement. The framework includes tools that analyze agent behavior patterns and group interaction trajectories by behavior type or topic, allowing researchers to detect weaknesses or unexpected behaviors.

Downloads: 9 This Week

Last Update: 2026-04-09
See Project
22

Unsloth Studio

Unified web UI for training and running open models locally

...Built on top of the Unsloth framework, it focuses on high-performance training with reduced VRAM usage and faster speeds compared to traditional methods. The platform supports fine-tuning, pretraining, and reinforcement learning workflows, making it suitable for both experimentation and production use. Users can interact with models through chat, upload files like PDFs or images, and execute code within the environment to improve outputs. By combining powerful optimization techniques with an intuitive UI, Unsloth Studio simplifies the process of building and customizing AI models locally.

Downloads: 13 This Week

Last Update: 2026-04-23
See Project
23

Pearl

A Production-ready Reinforcement Learning AI Agent Library

Pearl is a production-ready reinforcement learning and contextual bandit agent library built for real-world sequential decision making. It is organized around modular components—policy learners, replay buffers, exploration strategies, safety modules, and history summarizers—that snap together to form reliable agents with clear boundaries and strong defaults. The library implements classic and modern algorithms across two regimes: contextual bandits (e.g., LinUCB, LinTS, SquareCB, neural bandits) and fully sequential RL (e.g., DQN, PPO-style policy optimization), with attention to practical concerns like nonstationarity and dynamic action spaces. ...

Downloads: 0 This Week

Last Update: 2026-04-23
See Project
24

MiniOneRec

Minimal reproduction of OneRec

...The framework provides an end-to-end pipeline for building generative recommender systems, including semantic identifier construction, supervised fine-tuning, and reinforcement learning-based optimization. Semantic IDs are created using techniques such as quantized variational autoencoders to convert item features into token sequences that can be modeled by transformer architectures. Developers can train and evaluate recommendation models using different backbone language models while benefiting from the generative framework’s parameter efficiency and scalability.

Downloads: 0 This Week

Last Update: 2026-03-31
See Project
25

AgentScope

Build and run agents you can see, understand and trust

...With built-in support for ReAct agents, memory, planning, human-in-the-loop control, and real-time voice interaction, developers can create powerful agents in minutes. AgentScope integrates seamlessly with tools, long-term memory systems, MCP, A2A (Agent-to-Agent) protocols, and observability frameworks. It also supports reinforcement learning workflows for tuning agents and improving performance across complex tasks. Deployable locally, serverless in the cloud, or on Kubernetes with OpenTelemetry support, AgentScope is built for both experimentation and production environments.

Downloads: 3 This Week

Last Update: 5 days ago
See Project

© 2026 Slashdot Media. All Rights Reserved.

Terms Privacy Opt Out Advertise