Oumi v0.6.0 Changelog

We’re excited to announce Oumi v0.6.0! This release brings Python 3.13 support, a powerful new CLI for dataset analysis, the TRL GOLD trainer for preference learning, and first-class Kubernetes deployment support.


Highlights

Python 3.13 Support

Oumi now officially supports Python 3.13, letting you take advantage of the latest Python performance improvements and features.
(#2092)
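
To try it out, a minimal sketch (assuming Python 3.13 is already installed on your system):

:::bash
# Create a Python 3.13 virtual environment and install Oumi into it
python3.13 -m venv .venv
source .venv/bin/activate
pip install oumi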


New oumi analyze CLI Command

Understanding your training data just got easier. The new oumi analyze command lets you inspect and analyze datasets directly from the command line—no code required.

:::bash
# Analyze a local dataset
oumi analyze -c configs/examples/analyze/analyze.yaml

:::bash
# Export results in different formats
oumi analyze -c configs/examples/analyze/analyze.yaml --format parquet --output ./my_results

Create a simple config to analyze any HuggingFace dataset:

:::yaml
# hf_analyze.yaml
dataset_name: argilla/databricks-dolly-15k-curated-en
split: train
sample_count: 1000
analyzers:
  - id: length

Check out the analyze documentation for more details.
(#2069, #2071)


TRL GOLD Trainer

We’ve added support for the GOLD (Generalized Online Learning from Demonstrations) trainer from TRL. GOLD is an online preference-learning algorithm that improves upon DPO by generating responses on-the-fly during training, leading to better alignment with less distribution shift.

:::bash
# Run GOLD training with the example config
oumi train -c configs/examples/gold/train.yaml

Or configure it in your own training config:

:::yaml
training:
  trainer_type: "TRL_GOLD"
  gold:
    teacher_model_name_or_path: "HuggingFaceTB/SmolLM2-360M-Instruct"
    temperature: 0.9
    max_completion_length: 512
    lmbda: 0.5  # 50% on-policy, 50% off-policy

This requires TRL 0.26+, which is now the default.
(#2095, #2097)


Code Evaluation Judges

New LLM-as-judge evaluators specifically designed for assessing code quality. These judges can evaluate generated code for correctness, style, security, and other software engineering best practices—perfect for evaluating coding assistants and code generation models.

Thanks to @N-45div for this contribution!
(#2087)


Kubernetes Deployment

You can now deploy Oumi training jobs on Kubernetes clusters.

Option 1: Using SkyPilot (new in this release)

:::yaml
# k8s_job.yaml
name: my-training-job
resources:
  cloud: k8s
  accelerators: "A100:1"
run: |
  oumi train -c configs/recipes/llama3_1/sft/8b_lora/train.yaml

:::bash
oumi launch up -c k8s_job.yaml --cluster my-k8s-cluster

Option 2: Direct kubectl deployment

For existing K8s clusters, you can deploy Oumi directly using kubectl. See the Kubernetes deployment guide for detailed instructions including platform-specific examples for EKS, GKE, and AKS.
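
For reference, a minimal Job manifest could look like the sketch below. This is illustrative only: the image name, config path, and GPU request are placeholders, so substitute the values from the deployment guide.

:::yaml
# oumi-train-job.yaml -- illustrative sketch; image and config path are placeholders
apiVersion: batch/v1
kind: Job
metadata:
  name: oumi-sft-lora
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: oumi
          image: <your-oumi-image>  # e.g. an image built following the local Docker guide
          command: ["oumi", "train", "-c", "configs/recipes/llama3_1/sft/8b_lora/train.yaml"]
          resources:
            limits:
              nvidia.com/gpu: 1

Apply it and follow the logs with standard kubectl commands:

:::bash
kubectl apply -f oumi-train-job.yaml
kubectl logs -f job/oumi-sft-lora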

Thanks to @min-oumi!
(#2054, #2068)


Custom Master Port for Distributed Training

Running multiple distributed training jobs on the same node? You can now specify a custom master port to avoid conflicts.
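
As a rough, hypothetical sketch using the standard PyTorch rendezvous environment variable (the exact Oumi-side option is described in #2021):

:::bash
# Hypothetical sketch: MASTER_PORT is the standard torch.distributed rendezvous
# variable, assumed to be respected here; see #2021 for the exact Oumi option.
# A second job on the same node would use a different port (e.g. 29501).
MASTER_PORT=29500 oumi train -c configs/recipes/llama3_1/sft/8b_lora/train.yaml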

Thanks to @monnetb!
(#2021)


ARM Docker Images for Mac

Apple Silicon users rejoice! We now publish ARM64 Docker images, so you can run Oumi containers natively on M1/M2/M3 Macs without emulation overhead. (#2049)
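
For example, on an Apple Silicon machine you could pull and run the ARM64 image along these lines (the image name and tag are placeholders; see the local Docker guide for the published names):

:::bash
# Pull the native ARM64 image and check the CLI inside the container
docker pull --platform linux/arm64 <oumi-image>:latest
docker run --rm -it --platform linux/arm64 <oumi-image>:latest oumi --help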


Bug Fixes

  • Fix Docker release action (#2023)
  • Fix length analyzer column naming and add comprehensive message summary tests (#2057)
  • Fix "too many files open" error when processing large datasets (#2060)
  • Fix lm_eval multi-GPU integration for distributed evaluation (#2064)
  • Fix mutable default argument in conversation handling (#2048)

Documentation

  • Add news item on OpenEnv notebook (#2022)
  • Add docs for missing inference params and how to serve LoRA adapters (#2047)
  • Add local Docker guide (#2058)

Deprecations

  • Cambrian model: The experimental Cambrian model has been deprecated (#2034)
  • target_col: Removed mentions of the deprecated target_col field (#2056)

Dependencies

  • TRL upgraded to 0.26 (#2097)
  • datasets library upgraded (#2091)
  • wandb >=0.21,<0.24 (#2032)
  • safetensors >=0.6,<0.8 (#2031)
  • bitsandbytes >=0.47,<0.49 (#2038)
  • torchao >=0.12,<0.15 (#2079)
  • deepspeed >=0.17.0,<0.19.0 (#2080)
  • pydantic >=2.11,<2.13 (#2081)
  • skypilot >=0.10.2,<0.12 (#2089)
  • torchdata is now optional (#2066)

New Contributors

  • @monnetb made their first contribution in #2021
  • @dependabot[bot] made their first contribution in #2029
  • @min-oumi made their first contribution in #2054
  • @N-45div made their first contribution in #2087

Full Changelog:
https://github.com/oumi-ai/oumi/compare/v0.5.0...v0.6.0
