SeaTunnel is a distributed, high-performance data integration platform
A distributed and extensible workflow scheduler platform
Conduit streams data between data stores; a Kafka Connect replacement
Kestra is an infinitely scalable orchestration and scheduling platform
Python module that helps you build complex pipelines of batch jobs
Backstage is an open platform for building developer portals
lakeFS - Git-like capabilities for your object storage
Open-source data observability for analytics engineers
The open standard for data logging
Build, run, and manage data pipelines for integrating data
AutoGluon: AutoML for Image, Text, and Tabular Data
Code review for data in dbt
Streaming reactive and dataflow graphs in Python
BitSail is a distributed high-performance data integration engine
Use SQL to build ELT pipelines on a data lakehouse
Connect processes into powerful data pipelines
osDQ is dedicated to creating Apache Spark-based data pipelines using JSON
Mirror of Apache Kafka