| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| README.md | 2026-02-02 | 1.8 kB | |
| SLM-Lab v5.0.0 - Gymnasium Migration _ Complete Benchmark Suite source code.tar.gz | 2026-02-02 | 6.5 MB | |
| SLM-Lab v5.0.0 - Gymnasium Migration _ Complete Benchmark Suite source code.zip | 2026-02-02 | 6.6 MB | |
| Totals: 3 Items | | 13.1 MB | 0 |
Major modernization release that updates SLM-Lab from OpenAI Gym to Gymnasium, migrates to modern Python tooling (uv), and validates all algorithms across 70+ environments.
Key Changes
- Gymnasium migration with correct `terminated`/`truncated` handling
- Modern toolchain: `uv` + `pyproject.toml`, Python 3.12+, PyTorch 2.8+
- Simplified specs: No more `body` section or array wrappers
- Complete benchmark validation: 7 algorithms × 4 environment categories
- Cloud training support via dstack + HuggingFace
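To illustrate why the `terminated`/`truncated` distinction matters, here is a minimal sketch of the usual convention for one-step TD targets. It is not taken from the SLM-Lab source; the function names `td_target` and `episode_done` and the default `gamma` are assumptions made for this example. The idea is that a true terminal state stops bootstrapping, while a time-limit truncation does not.

```python
def td_target(reward, next_value, terminated, truncated, gamma=0.99):
    """One-step TD target that bootstraps unless the episode truly terminated.

    `truncated` is accepted only to make the contrast explicit: a time-limit
    cutoff does not change the target, because the next state still has value.
    """
    bootstrap = 0.0 if terminated else next_value
    return reward + gamma * bootstrap


def episode_done(terminated, truncated):
    """The environment needs a reset in either case."""
    return terminated or truncated
```

Under the old Gym API a single `done` flag covered both cases, so a time-limit cutoff was easy to mistake for a terminal state and zero out the bootstrap term.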
Benchmark Results
| Algorithm | Classic | Box2D | MuJoCo | Atari |
|---|---|---|---|---|
| REINFORCE | ✅ | — | — | — |
| SARSA | ✅ | — | — | — |
| DQN | ✅ | ✅ | — | — |
| DDQN+PER | ✅ | ✅ | — | — |
| A2C | ✅ | ⚠️ | ⚠️ | ✅ 54 games |
| PPO | ✅ | ✅ | ✅ 11 envs | ✅ 54 games |
| SAC | ✅ | ✅ | ✅ 11 envs | — |
Atari benchmarks use ALE v5 with sticky actions (`repeat_action_probability=0.25`), following the evaluation practices recommended by Machado et al. (2018).
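As a rough illustration of what sticky actions do, the sketch below repeats the previously executed action with a given probability instead of the newly requested one. It is a simplified per-step model written for this note, not ALE's or SLM-Lab's implementation; the class name `StickyActionFilter` and its `seed` argument are hypothetical.

```python
import random


class StickyActionFilter:
    """With probability p, execute the previous action instead of the new one."""

    def __init__(self, repeat_action_probability=0.25, seed=None):
        if not 0.0 <= repeat_action_probability <= 1.0:
            raise ValueError("repeat_action_probability must be in [0, 1]")
        self.p = repeat_action_probability
        self.rng = random.Random(seed)
        self.prev_action = None  # nothing to repeat before the first step

    def reset(self):
        """Forget the previous action at the start of an episode."""
        self.prev_action = None

    def __call__(self, action):
        # Repeat the last executed action with probability p.
        if self.prev_action is not None and self.rng.random() < self.p:
            action = self.prev_action
        self.prev_action = action
        return action


if __name__ == "__main__":
    sticky = StickyActionFilter(repeat_action_probability=0.25, seed=0)
    requested = [0, 1, 2, 3, 4, 5]
    print([sticky(a) for a in requested])
```

The injected randomness keeps an agent from succeeding by memorizing a fixed action sequence in an otherwise deterministic emulator.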
Breaking Changes
- Environment names: `CartPole-v0` → `CartPole-v1`, `PongNoFrameskip-v4` → `ALE/Pong-v5`
- Spec format simplified: `agent: [{...}]` → `agent: {...}`
- `body` section removed, attributes moved to `agent`
- Roboschool → MuJoCo (`RoboschoolHopper-v1` → `Hopper-v5`)
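The breaking changes above could be applied to an old spec with a small helper along these lines. This is a sketch under assumptions, not a migration tool shipped with SLM-Lab: `migrate_spec`, `rename_env`, and `ENV_RENAMES` are hypothetical names, the rename table holds only the three examples listed above, and the helper assumes that every `body` attribute moves to `agent` without overwriting existing keys.

```python
import copy

# Only the renames named in the release notes; extend as needed.
ENV_RENAMES = {
    "CartPole-v0": "CartPole-v1",
    "PongNoFrameskip-v4": "ALE/Pong-v5",
    "RoboschoolHopper-v1": "Hopper-v5",
}


def rename_env(name):
    """Map an old environment name to its new one, if a rename is known."""
    return ENV_RENAMES.get(name, name)


def migrate_spec(old_spec):
    """Return a copy of old_spec with `agent` unwrapped and `body` merged in."""
    spec = copy.deepcopy(old_spec)  # leave the caller's spec untouched

    agent = spec.get("agent")
    if isinstance(agent, list):  # old format: agent: [{...}]
        if len(agent) != 1:
            raise ValueError("expected exactly one agent entry")
        agent = agent[0]
    if not isinstance(agent, dict):
        raise ValueError("spec has no usable 'agent' section")

    # Assumption: body attributes move to agent; existing agent keys win.
    for key, value in (spec.pop("body", None) or {}).items():
        agent.setdefault(key, value)

    spec["agent"] = agent  # new format: agent: {...}
    return spec
```

Treat the output as a starting point and check it against the v5 example specs, since the notes do not spell out exactly which `body` attributes carry over.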
Quick Start
    :::bash
    # Install
    uv sync && uv tool install --editable .
    # Run
    slm-lab run spec.json spec_name train
Book Readers
For exact code from Foundations of Deep Reinforcement Learning, use:
    :::bash
    git checkout v4.1.1
See CHANGELOG.md on GitHub for full details.