feedback free download

PaLM + RLHF - Pytorch

Implementation of RLHF (Reinforcement Learning with Human Feedback)

PaLM-rlhf-pytorch is a PyTorch implementation of Pathways Language Model (PaLM) with Reinforcement Learning from Human Feedback (RLHF). It is designed for fine-tuning large-scale language models with human preference alignment, similar to OpenAI’s approach for training models like ChatGPT.

Downloads: 0 This Week

Last Update: 2025-09-19


verl

Volcano Engine Reinforcement Learning for LLMs

...It ships with reference implementations of popular alignment algorithms and clear examples that make it straightforward to reproduce baselines before customizing. Data pipelines treat human feedback, simulated environments, and synthetic preferences as interchangeable sources, which helps with rapid experimentation. VERL is meant for both research and production hardening: logging, checkpointing, and evaluation suites are built in so you can track learning dynamics and regressions over time.

Downloads: 0 This Week

Last Update: 2026-06-01


OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework

OpenRLHF is an easy-to-use, scalable, and high-performance framework for Reinforcement Learning with Human Feedback (RLHF). It supports various training techniques and model architectures.

Downloads: 0 This Week

Last Update: 2026-06-08


Atropos

Language Model Reinforcement Learning Environments framework

...It provides foundational tooling for asynchronous RL loops where environment services communicate with trainers and inference engines, enabling complex workflow orchestration in distributed and parallel setups. This framework facilitates experimentation with RLHF (Reinforcement Learning from Human Feedback), RLAIF, or multi-turn training approaches by abstracting environment logic, scoring, and logging into reusable components.

Downloads: 0 This Week

Last Update: 2026-03-10


MedicalGPT

MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training

MedicalGPT trains large medical language models with a ChatGPT-style training pipeline. It implements secondary (continued) pre-training, supervised fine-tuning, reward modeling, and reinforcement learning.

Downloads: 0 This Week

Last Update: 2026-04-20


Astrape

Optical-packet node transceiver frequency allocation

...Then the local transceiver laser frequencies of the controlled pluggable devices are configured to enable connectivity across the ROADM network. The agent also records and reports telemetry data (feedback), which PacketCTL's resource-allocation mechanism uses to improve efficiency within the network topology.

Downloads: 0 This Week

Last Update: 2025-03-14


Search Results for "feedback"

Showing 6 open source projects for "feedback"

PaLM + RLHF - Pytorch

verl

OpenRLHF

Atropos

MedicalGPT

Astrape
