0.15.0

Important new stuff:

MetaPlane

See meta-plane.md for motivation

  • Introduced MetaPlane/TableMeta/TransformMeta interfaces to decouple metadata management from the compute plane
  • Added SQL reference implementation (SQLMetaPlane, SQLTableMeta, SQLTransformMeta) and rewired DataStore, DataTable, and batch transform steps to consume the new meta plane API
  • Added meta-plane design doc and removed legacy MetaTable plumbing in lints, migrations, and tests
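
The decoupling described above can be illustrated with a small sketch. This is not Datapipe's actual interface: `TableMetaSketch`, `InMemoryTableMeta`, `keys_to_process`, and all method names are hypothetical, chosen only to show compute code depending on a metadata interface rather than on a concrete SQL table.

```python
from typing import Protocol


class TableMetaSketch(Protocol):
    """Hypothetical per-table metadata interface (not Datapipe's real API)."""

    def mark_changed(self, key: str, now: float) -> None: ...

    def changed_since(self, ts: float) -> list[str]: ...


class InMemoryTableMeta:
    """Toy backend; a SQL-backed class could satisfy the same protocol."""

    def __init__(self) -> None:
        self._updated_at: dict[str, float] = {}

    def mark_changed(self, key: str, now: float) -> None:
        # Record the time at which this key was last modified.
        self._updated_at[key] = now

    def changed_since(self, ts: float) -> list[str]:
        # Return keys modified strictly after `ts`, in sorted order.
        return sorted(k for k, t in self._updated_at.items() if t > ts)


def keys_to_process(meta: TableMetaSketch, last_run: float) -> list[str]:
    # Compute-side code sees only the protocol, never the storage backend.
    return meta.changed_since(last_run)
```

Under this arrangement, swapping the metadata backend would not require touching the compute-side function, which is presumably the kind of separation the meta plane aims for.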

InputSpec and key mapping

See key-mapping.md for motivation

  • Renamed JoinSpec to InputSpec
  • Added keys parameter to InputSpec and ComputeInput to support joining tables with different key names
  • Added OutputSpec and ComputeOutput.keys to explicitly map transform keys to output table primary keys
  • Fixed batch transform cleanup for aliased output keys and incomplete transform keys
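
To make the idea of key mapping concrete, here is a minimal plain-Python sketch of joining two tables whose key columns have different names. It does not use Datapipe: `to_transform_keys` and `join_on` are hypothetical helpers, and the mapping direction (transform key name to table column name) is an assumption made for the example, so check key-mapping.md for the real convention.

```python
def to_transform_keys(rows, keys):
    """Rename table columns to transform key names.

    `keys` maps transform key -> column name in the table
    (assumed direction, for illustration only).
    """
    col_to_key = {col: key for key, col in keys.items()}
    return [{col_to_key.get(c, c): v for c, v in row.items()} for row in rows]


def join_on(left, right, on):
    """Inner-join two lists of dicts on the shared key names in `on`."""
    index = {tuple(r[k] for k in on): r for r in right}
    joined = []
    for row in left:
        match = index.get(tuple(row[k] for k in on))
        if match is not None:
            joined.append({**row, **match})
    return joined


users = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
orders = [{"order_id": 10, "customer_id": 1}, {"order_id": 11, "customer_id": 3}]

# Both tables expose the same transform key `user_id` under different columns.
users_k = to_transform_keys(users, {"user_id": "id"})
orders_k = to_transform_keys(orders, {"user_id": "customer_id"})
joined = join_on(users_k, orders_k, on=["user_id"])
```

Here only user 1 appears in both tables, so `joined` contains a single row carrying `user_id`, `name`, and `order_id`.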

Step name overrides and uniform hash-based naming

  • Extracted make_mungled_step_name(cls, base_name, input_dts, output_dts) as a public helper in compute.py; it encodes the step class, function name, and table names into a short SHAKE-128 hash suffix appended to the base name (e.g. my_func_9762dd6bae)
  • ComputeStep.name is now a plain stored attribute instead of a computed property, so the name is fixed at construction time and readable without re-hashing
  • All PipelineStep types now accept an optional name: str | None parameter; when provided it overrides the auto-generated hash name, making it easy to pin a stable name for a step independent of its inputs/outputs
  • DatatableTransform and UpdateExternalTable previously used plain names (e.g. update_item); they now use make_mungled_step_name for consistency with the batch step types
  • pipeline_input_to_compute_input() extracted from BatchTransform into a module-level helper in compute.py and reused by DatatableBatchTransform
  • DatatableBatchTransform.inputs now accepts PipelineInput (same as BatchTransform), enabling Required/InputSpec wrappers
  • build_compute() now raises immediately on duplicate step names
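
The naming rules above can be sketched with the standard library alone. This is not the code in compute.py: `sketch_step_name` and `check_unique` are hypothetical stand-ins, the payload format is an assumption, and the resulting hashes will not match the ones Datapipe produces. The sketch only shows a 10-hex-digit SHAKE-128 suffix, an explicit name override, and an early failure on duplicate names.

```python
import hashlib


def sketch_step_name(cls_name, base_name, input_names, output_names, name=None):
    """Return `name` if given, else base_name plus a short SHAKE-128 suffix."""
    if name is not None:
        # An explicit name pins the step name regardless of inputs/outputs.
        return name
    # Assumed payload layout: class, function, inputs, separator, outputs.
    payload = "|".join([cls_name, base_name, *input_names, "->", *output_names])
    # SHAKE-128 has variable-length output; 5 bytes give 10 hex characters.
    suffix = hashlib.shake_128(payload.encode("utf-8")).hexdigest(5)
    return f"{base_name}_{suffix}"


def check_unique(names):
    """Raise ValueError on the first repeated step name."""
    seen = set()
    for step_name in names:
        if step_name in seen:
            raise ValueError(f"Duplicate step name: {step_name}")
        seen.add(step_name)


auto_name = sketch_step_name("BatchTransform", "my_func", ["items"], ["features"])
pinned_name = sketch_step_name(
    "BatchTransform", "my_func", ["items"], ["features"], name="my_stable_step"
)
check_unique([auto_name, pinned_name])
```

Because the auto-generated suffix depends on the input and output table names, rewiring a step changes its name; passing an explicit `name` is the way to keep it stable across such changes.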

Python 3.9 support is deprecated

Improvements and fixes

  • Fixed dtypes mapping for TableStoreExcel, TableStoreJsonLine
  • Fixed meta changes compute logic for Required tables
Source: README.md, updated 2026-06-07