Page 4 | Best Open Source Linux Reinforcement Learning Algorithms 2025

Transformer Reinforcement Learning X

A repo for distributed training of language models with Reinforcement

trlX is a distributed training framework designed from the ground up to focus on fine-tuning large language models with reinforcement learning using either a provided reward function or a reward-labeled dataset. Training support for Hugging Face models is provided by Accelerate-backed trainers, allowing users to fine-tune causal and T5-based language models of up to 20B parameters, such as facebook/opt-6.7b, EleutherAI/gpt-neox-20b, and google/flan-t5-xxl. For models beyond 20B parameters, trlX provides NVIDIA NeMo-backed trainers that leverage efficient parallelism techniques to scale effectively.
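As a rough illustration of the "provided reward function" mode described above, the sketch below defines a toy reward function. The `reward_fn(samples, **kwargs)` shape and the commented `trlx.train` call follow the pattern shown in the project README as recalled here. Treat both as assumptions and check them against the installed version. The target word and sample strings are made up for the example.

```python
def reward_fn(samples, **kwargs):
    """Toy reward: count how often a target word appears in each sample."""
    # One float per generated sample; higher means "more rewarded".
    return [float(sample.lower().count("linux")) for sample in samples]


# Score two hand-written strings standing in for model generations.
scores = reward_fn(["Linux runs on Linux servers.", "No match here."])
print(scores)

# With trlX installed, the README-style entry point would be roughly:
#   import trlx
#   trainer = trlx.train("gpt2", reward_fn=reward_fn)
# (assumed from the README; not executed in this sketch)
```

The reward-labeled dataset mode replaces the function with parallel lists of samples and rewards.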

Downloads: 0 This Week

Last Update: 2024-08-03

Trax

Deep learning with clear code and speed

Trax is an end-to-end library for deep learning that focuses on clear code and speed. It is actively used and maintained by the Google Brain team. With Trax you can run a pre-trained Transformer or create a translator in a few lines of code. The project documentation covers features and resources, API docs, where to reach the developers, how to open an issue, a walkthrough of how Trax works, and how to make new models and train them on your own data. Trax includes basic models (like ResNet, LSTM, Transformer) and RL algorithms (like REINFORCE, A2C, PPO). It is also used for research and includes new models like the Reformer and new RL algorithms like AWR. Trax has bindings to a large number of deep learning datasets, including Tensor2Tensor and TensorFlow datasets. You can use Trax either as a library from your own Python scripts and notebooks or as a binary from the shell, which can be more convenient for training large models. It runs without any changes on CPUs, GPUs and TPUs.

Downloads: 0 This Week

Last Update: 2021-10-26

Verve: General Purpose Agents

General purpose agents using reinforcement learning. Combines radial basis functions, temporal difference learning, planning, uncertainty estimations, and curiosity. Intended to be an out-of-the-box solution for roboticists and game developers.
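To make two of the listed ingredients concrete, here is a minimal sketch of temporal difference learning over radial basis function features. It is not Verve's code. The RBF centers, width, step size, discount, and the toy "walk right" task are all assumptions chosen for illustration.

```python
import math

CENTERS = [i / 4 for i in range(5)]  # assumed RBF centers on [0, 1]
WIDTH = 0.125                        # assumed shared RBF width


def rbf_features(state):
    """Gaussian radial basis activations for a scalar state."""
    return [math.exp(-((state - c) ** 2) / (2 * WIDTH ** 2)) for c in CENTERS]


def value(weights, state):
    """Linear value estimate: dot product of weights and RBF features."""
    return sum(w * f for w, f in zip(weights, rbf_features(state)))


def td0_update(weights, state, reward, next_state, done, alpha=0.1, gamma=0.95):
    """One semi-gradient TD(0) step for the linear value function."""
    target = reward if done else reward + gamma * value(weights, next_state)
    td_error = target - value(weights, state)
    feats = rbf_features(state)
    return [w + alpha * td_error * f for w, f in zip(weights, feats)]


# Toy task: start at 0.0, step right by 0.25, reward 1 on reaching 1.0.
weights = [0.0] * len(CENTERS)
for _ in range(500):
    state = 0.0
    while state < 1.0:
        next_state = state + 0.25
        done = next_state >= 1.0
        reward = 1.0 if done else 0.0
        weights = td0_update(weights, state, reward, next_state, done)
        state = next_state
```

After training, `value(weights, 0.75)` should sit near 1 and `value(weights, 0.0)` near 0.95 cubed, since the reward is three discounted steps further away.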

1 Review

Downloads: 0 This Week

Last Update: 2013-04-24

Vowpal Wabbit

Machine learning system which pushes the frontier of machine learning

Vowpal Wabbit is a machine learning system that pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning. There is a specific focus on reinforcement learning, with several contextual bandit algorithms implemented and the system's online nature lending itself well to the problem. Vowpal Wabbit is a destination for implementing and maturing state-of-the-art algorithms with performance in mind. The input format for the learning algorithm is substantially more flexible than might be expected. Examples can have features consisting of free-form text, which is interpreted in a bag-of-words way. There can even be multiple sets of free-form text in different namespaces. Vowpal Wabbit is similar to the few other online algorithm implementations out there. Several optimization algorithms are available, with the baseline being sparse gradient descent (GD) on a loss function.
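The combination of namespaced free-form text, hashing, and sparse online gradient descent can be sketched in plain Python. This is an illustration of the general technique only, not Vowpal Wabbit's implementation or its input format. The table size, the CRC32 hash, the `namespace^token` key, and the example text are assumptions.

```python
import zlib
from collections import defaultdict

NUM_BITS = 18                  # assumed table size: 2**18 weight slots
MASK = (1 << NUM_BITS) - 1


def hash_features(namespaces):
    """Map {namespace: free-form text} to a sparse {index: count} vector."""
    features = defaultdict(float)
    for namespace, text in namespaces.items():
        for token in text.lower().split():
            # Prefix with the namespace so the same word in two namespaces
            # lands in different slots (barring hash collisions).
            key = f"{namespace}^{token}".encode("utf-8")
            features[zlib.crc32(key) & MASK] += 1.0
    return dict(features)


def predict(weights, features):
    """Sparse dot product over the nonzero features only."""
    return sum(weights.get(i, 0.0) * v for i, v in features.items())


def sgd_update(weights, features, label, learning_rate=0.1):
    """One online step on squared loss; returns the pre-update error."""
    error = predict(weights, features) - label
    for i, v in features.items():
        weights[i] = weights.get(i, 0.0) - learning_rate * error * v
    return error


weights = {}
example = {"title": "cheap flights to paris", "body": "book cheap flights today"}
x = hash_features(example)
first_error = sgd_update(weights, x, label=1.0)
second_error = sgd_update(weights, x, label=1.0)
```

Seeing the same example twice should shrink the error, and only the handful of slots touched by the example's tokens ever hold a weight.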

Downloads: 0 This Week

Last Update: 2024-08-01

cerrla

The CERRLA algorithm, developed by Sam Sarjant

This project contains the files required to run the Cross-Entropy Relational Reinforcement Learning Agent (CERRLA) algorithm. Note that a copy of the JESS rules engine will also be required.

Downloads: 0 This Week

Last Update: 2013-05-30

dm_control

DeepMind's software stack for physics-based simulation

DeepMind's software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo physics. The MuJoCo Python bindings support three different OpenGL rendering backends: EGL (headless, hardware-accelerated), GLFW (windowed, hardware-accelerated), and OSMesa (purely software-based). At least one of these three backends must be available in order to render through dm_control. Hardware rendering with a windowing system is supported via GLFW and GLEW. On Linux these can be installed using your distribution's package manager. "Headless" hardware rendering (i.e. without a windowing system such as X11) requires EXT_platform_device support in the EGL driver. While dm_control has been largely updated to use the pybind11-based bindings provided via the mujoco package, at this time it still relies on some legacy components that are automatically generated.
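A small sketch of choosing among the three backends follows. The `MUJOCO_GL` environment variable with values `glfw`, `egl`, and `osmesa` is how the dm_control README describes forcing a backend, as best recalled here. The `choose_backend` helper and its two inputs are hypothetical and only encode the descriptions given above.

```python
import os

# The three backends named in the description above.
BACKENDS = ("glfw", "egl", "osmesa")


def choose_backend(has_display, has_gpu):
    """Hypothetical helper: pick a backend from two coarse facts about the host."""
    if has_display:
        return "glfw"    # windowed, hardware-accelerated
    if has_gpu:
        return "egl"     # headless, hardware-accelerated
    return "osmesa"      # purely software-based


# Assumption: a set DISPLAY variable means a windowing system is available.
backend = choose_backend(has_display=bool(os.environ.get("DISPLAY")), has_gpu=False)

# The variable has to be set before dm_control is imported.
os.environ["MUJOCO_GL"] = backend

# from dm_control import suite  # import only after MUJOCO_GL is set
```

On a headless machine with a supported EGL driver, passing `has_gpu=True` would select `egl` instead of the software fallback.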

Downloads: 0 This Week

Last Update: 2025-09-18

tic tac toe AI

The simplest AI program for the tic-tac-toe game

This is a tic-tac-toe program, originally released as version 1.0. It is an AI program that plays tic-tac-toe using knowledge drawn from my previous analysis of, and experience with, the game. Right now it is playable by human players, but I can also make it support AI vs AI, AI vs player, and player vs player modes, selected through a settings option. I think this program has enough IQ to defeat a normal person. This is update 1.1 of the game. My roadmap for the program is:
v 1.0.1 --> bug fixes
v 1.1 --> (added) click interaction
v 1.2 --> addition of reinforcement learning (cache data different for each computer, unlike v1.3)
v 1.3 --> addition of cloud reinforcement learning (optional; chosen from settings)
... & more

Downloads: 0 This Week

Last Update: 2024-11-14
