# Oumi v0.6.0 Changelog
We’re excited to announce Oumi v0.6.0! This release brings Python 3.13 support, a powerful new CLI for dataset analysis, the TRL GOLD trainer for preference learning, and first-class Kubernetes deployment support.
## Highlights
### Python 3.13 Support
Oumi now officially supports Python 3.13, letting you take advantage of the latest Python performance improvements and features.
(#2092)
### New `oumi analyze` CLI Command
Understanding your training data just got easier. The new oumi analyze command lets you inspect and analyze datasets directly from the command line—no code required.
:::bash
# Analyze a local dataset
oumi analyze -c configs/examples/analyze/analyze.yaml
:::bash
# Export results in different formats
oumi analyze -c configs/examples/analyze/analyze.yaml --format parquet --output ./my_results
Create a simple config to analyze any HuggingFace dataset:
:::yaml
# hf_analyze.yaml
dataset_name: argilla/databricks-dolly-15k-curated-en
split: train
sample_count: 1000
analyzers:
- id: length
Check out the analyze documentation for more details.
(#2069, #2071)
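To give a feel for what the `length` analyzer in the config above measures, here is a toy Python sketch that computes per-message character and whitespace-token counts with simple aggregates. This is an illustration only, not Oumi's actual analyzer code; the function name `length_stats` is hypothetical.

```python
# Toy sketch of what a "length" analyzer computes per message:
# character and whitespace-token counts, plus dataset-level aggregates.
# Hypothetical helper for illustration; not Oumi's implementation.

def length_stats(messages):
    """Return per-message lengths and dataset-level summary statistics."""
    rows = [
        {"chars": len(m), "tokens": len(m.split())}
        for m in messages
    ]
    n = len(rows)
    summary = {
        "mean_chars": sum(r["chars"] for r in rows) / n,
        "mean_tokens": sum(r["tokens"] for r in rows) / n,
        "max_tokens": max(r["tokens"] for r in rows),
    }
    return rows, summary

rows, summary = length_stats(["Hello world", "A longer example message here"])
```

The real command additionally handles dataset loading, sampling (`sample_count`), and export formats such as parquet.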
### TRL GOLD Trainer
We’ve added support for the GOLD (Generalized Online Learning from Demonstrations) trainer from TRL. GOLD is an online preference learning algorithm that improves upon DPO by generating responses on-the-fly during training, leading to better alignment with less distribution shift.
:::bash
# Run GOLD training with the example config
oumi train -c configs/examples/gold/train.yaml
Or configure it in your own training config:
:::yaml
training:
  trainer_type: "TRL_GOLD"
  gold:
    teacher_model_name_or_path: "HuggingFaceTB/SmolLM2-360M-Instruct"
    temperature: 0.9
    max_completion_length: 512
    lmbda: 0.5  # 50% on-policy, 50% off-policy
This requires TRL 0.26+, which is now the default.
(#2095, #2097)
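The `lmbda` setting above blends an on-policy term (computed on freshly generated responses) with an off-policy term (computed on the demonstration data). The actual GOLD objective in TRL is more involved; the sketch below only illustrates the weighting, with a hypothetical `mixed_loss` helper.

```python
# Illustrative sketch of lmbda-weighted loss mixing, not TRL's GOLD code.
# lmbda = 1.0 -> fully on-policy; lmbda = 0.0 -> fully off-policy.

def mixed_loss(on_policy_loss, off_policy_loss, lmbda):
    """Blend on-policy and off-policy loss terms with coefficient lmbda."""
    return lmbda * on_policy_loss + (1.0 - lmbda) * off_policy_loss
```

With `lmbda: 0.5` as in the example config, both terms contribute equally to each training step.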
### Code Evaluation Judges
New LLM-as-judge evaluators specifically designed for assessing code quality. These judges can evaluate generated code for correctness, style, security, and other software engineering best practices—perfect for evaluating coding assistants and code generation models.
Thanks to @N-45div for this contribution!
(#2087)
### Kubernetes Deployment
You can now deploy Oumi training jobs on Kubernetes clusters.
**Option 1: Using SkyPilot (new in this release)**
:::yaml
# k8s_job.yaml
name: my-training-job
resources:
  cloud: k8s
  accelerators: "A100:1"
run: |
  oumi train -c configs/recipes/llama3_1/sft/8b_lora/train.yaml
:::bash
oumi launch up -c k8s_job.yaml --cluster my-k8s-cluster
**Option 2: Direct kubectl deployment**
For existing K8s clusters, you can deploy Oumi directly using kubectl. See the Kubernetes deployment guide for detailed instructions including platform-specific examples for EKS, GKE, and AKS.
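As a starting point, a direct deployment might use a minimal Kubernetes Job manifest like the sketch below. The container image name is a placeholder and the GPU resource request assumes an NVIDIA device plugin; see the Kubernetes deployment guide for the supported image and cluster-specific settings.

```yaml
# k8s-oumi-job.yaml -- illustrative sketch only; image name is a placeholder
apiVersion: batch/v1
kind: Job
metadata:
  name: oumi-train
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: oumi
          image: <your-oumi-image>  # placeholder; use the image from the guide
          command: ["oumi", "train", "-c", "configs/recipes/llama3_1/sft/8b_lora/train.yaml"]
          resources:
            limits:
              nvidia.com/gpu: 1  # assumes the NVIDIA device plugin is installed
```

Apply it with `kubectl apply -f k8s-oumi-job.yaml` and follow progress with `kubectl logs`.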
Thanks to @min-oumi!
(#2054, #2068)
### Custom Master Port for Distributed Training
Running multiple distributed training jobs on the same node? You can now specify a custom master port to avoid conflicts.
Thanks to @monnetb!
(#2021)
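The exact Oumi option is described in #2021. Under the hood, PyTorch distributed rendezvous reads the standard `MASTER_ADDR`/`MASTER_PORT` environment variables, so the collision-avoidance mechanism can be sketched as follows; `pick_free_port` is a hypothetical helper, not Oumi code.

```python
# Sketch: pick a free TCP port and export it as MASTER_PORT so a second
# torch.distributed job on the same node won't collide with the first.
# This illustrates the standard PyTorch rendezvous env vars, not Oumi's CLI.
import os
import socket

def pick_free_port():
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]

port = pick_free_port()
os.environ["MASTER_PORT"] = str(port)  # read by torch.distributed at init
```

Each job launched this way rendezvouses on its own port instead of the default, so multiple runs can share a node.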
### ARM Docker Images for Mac
Apple Silicon users rejoice! We now publish ARM64 Docker images, so you can run Oumi containers natively on M1/M2/M3 Macs without emulation overhead. (#2049)
## Bug Fixes
- Fix Docker release action (#2023)
- Fix length analyzer column naming and add comprehensive message summary tests (#2057)
- Fix "too many files open" error when processing large datasets (#2060)
- Fix lm_eval multi-GPU integration for distributed evaluation (#2064)
- Fix mutable default argument in conversation handling (#2048)
## Documentation
- Add news item on OpenEnv notebook (#2022)
- Add docs for missing inference params and how to serve LoRA adapters (#2047)
- Add local Docker guide (#2058)
## Deprecations
- Cambrian model: The experimental Cambrian model has been deprecated (#2034)
- target_col: Removed mentions of the deprecated target_col field (#2056)
## Dependencies
- TRL upgraded to 0.26 (#2097)
- datasets library upgraded (#2091)
- wandb >=0.21,<0.24 (#2032)
- safetensors >=0.6,<0.8 (#2031)
- bitsandbytes >=0.47,<0.49 (#2038)
- torchao >=0.12,<0.15 (#2079)
- deepspeed >=0.17.0,<0.19.0 (#2080)
- pydantic >=2.11,<2.13 (#2081)
- skypilot >=0.10.2,<0.12 (#2089)
- torchdata is now optional (#2066)
## New Contributors
- @monnetb made their first contribution in #2021
- @dependabot[bot] made their first contribution in #2029
- @min-oumi made their first contribution in #2054
- @N-45div made their first contribution in #2087
**Full Changelog**: https://github.com/oumi-ai/oumi/compare/v0.5.0...v0.6.0