lakeFS - Git-like capabilities for your object storage
Real-time, incremental ETL library for ML with record-level depend
Pentaho offers a comprehensive data integration and analytics platform.
Streaming reactive and dataflow graphs in Python
Design, automate, operate and publish data pipelines at scale
osDQ is dedicated to creating Apache Spark-based data pipelines using JSON