| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| README.md | 2026-04-08 | 3.3 kB | |
| v0.7.8 source code.tar.gz | 2026-04-08 | 8.1 MB | |
| v0.7.8 source code.zip | 2026-04-08 | 9.3 MB | |
| Totals: 3 Items | | 17.4 MB | 5 |
What's Changed 🚀
💥 Breaking Changes
- chore!: Remove unused `max_task_backlog` parameter @srilman (#6591)
✨ Features
- feat: Expose Subscribers through nicer APIs @srilman (#6631)
- feat(dashboard): heatmap coloring for nodes. @universalmind303 (#6628)
- feat(dataframe): add DataFrame.skew() global aggregation method @kerwin-zk (#6619)
- feat(dashboard): add arrows indicating pipeline direction @universalmind303 (#6625)
- feat: wire CheckpointId through Flotilla execution pipeline @rohitkulshreshtha (#6567)
- feat: add image_hash() for image deduplication @chenghuichen (#6485)
- feat: support get function from catalog @gavin9402 (#6524)
- feat(dataframe): add var() method to DataFrame and GroupedDataFrame @kerwin-zk (#6584)
- feat: checkpoint based on distributed key-existence filter @everySympathy (#5931)
- feat(distributed): make flotilla worker actor startup timeout configurable @desmondcheongzx (#6592)
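For readers unfamiliar with the statistics behind the new `skew()` and `var()` aggregations listed above, here is a minimal pure-Python sketch of what such global aggregations typically compute. It does not use Daft's API. The helper names `variance` and `skewness`, the `ddof` parameter, and the choice of the population (biased) skewness formula are assumptions for illustration only; the release notes do not say which estimator Daft uses.

```python
def variance(values, ddof=0):
    """Variance of a sequence; ddof=0 gives population variance, ddof=1 sample variance."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - ddof)


def skewness(values):
    """Population skewness: third central moment divided by the second to the power 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    # Undefined (division by zero) when all values are identical.
    return m3 / m2 ** 1.5


# A long right tail produces positive skewness; symmetric data gives zero.
right_tailed = [1.0, 2.0, 3.0, 4.0, 10.0]
symmetric = [1.0, 2.0, 3.0]
print(variance([1.0, 2.0, 3.0, 4.0]))
print(skewness(symmetric), skewness(right_tailed) > 0)
```

Whether Daft's methods default to the population or sample form is best checked against the linked pull requests (#6619, #6584).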
🐛 Bug Fixes
- fix: Fix nightly Daft version resolution in Ray runtime env @jeevb (#6630)
- fix(dashboard): pass partition sets to repr_json so plan matches execution topology @samstokes (#6576)
- fix(scan): skip getting bytes when range start equals end in daft async reader @gweaverbiodev (#6602)
- fix(io): retry transient errors on initial GET request @desmondcheongzx (#6544)
- fix(dashboard): use smart per-node stats aggregation for distributed execution @samstokes (#6574)
- fix(io): handle schema-evolved Iceberg columns in Parquet predicate pushdown @sankarreddy-atlan (#6551)
- fix(dashboard): prevent flotilla workers from sending spurious lifecycle events @samstokes (#6573)
🚀 Performance
- perf: Optimize GroupBy Map Building & List-Agg @srilman (#6613)
- perf(flight): Read local shuffle data directly from disk @srilman (#6436)
- perf(inline-agg): add min/max accumulator types @BABTUNA (#6604)
- perf: inline vectorized aggregation for grouped count/sum @desmondcheongzx (#6345)
♻️ Refactor
- refactor: migrate PaimonScanOperator to DataSource API @chenghuichen (#6600)
- refactor(subscriber): collapse trait to single sync on_event method @cckellogg (#6593)
- refactor: consolidate BatchManager as single buffer abstraction @universalmind303 (#6566)
- refactor: Use DAFT_REF_NAME and DAFT_SHA env vars for benchmark run metadata @jeevb (#6610)
- refactor: Minor parameterization for benchmarking workflows @jeevb (#6609)
- refactor(distributed): unify repartition exchange write flow across ray and flight @ohbh (#6499)
📖 Documentation
- docs: add custom SQL filter documentation for Lance @Jay-ju (#5916)
👷 CI
- ci(test): reduce Ray resource requests in chained skip_existing tests @desmondcheongzx (#6633)
- ci(test): reduce Ray resource requests in skip_existing tests @desmondcheongzx (#6629)
- ci(deps): fix CI failures from dependabot bump [#6570] @desmondcheongzx (#6596)
🔧 Maintenance
- chore(cargo): inherit edition/version from workspace in root package @yew1eb (#6329)
- chore!: Remove unused `max_task_backlog` parameter @srilman (#6591)
- chore(dashboard): add ?debug query param for SSE event console logging @samstokes (#6577)
Full Changelog: https://github.com/Eventual-Inc/Daft/compare/v0.7.7...v0.7.8