Build, run, and manage data pipelines for integrating data
A ranked list of awesome Python open-source libraries
A distributed and extensible workflow scheduler platform
Conduit streams data between data stores; a Kafka Connect replacement
Python module that helps you build complex pipelines of batch jobs
A lightweight stream processing library for Go
A fast script language for Go
Pythonic tool for running machine-learning/high-performance workflows
Privacy- and security-focused Segment alternative, in Golang
Build data pipelines, the easy way
Streaming reactive and dataflow graphs in Python
Kestra is an infinitely scalable orchestration and scheduling platform
Open Source Data Orchestration for the Cloud
Automated Tool for Optimized Modelling
Making DAG construction easier
Code review for data in dbt
Open-source data observability for analytics engineers
BitSail is a distributed high-performance data integration engine
The open standard for data logging
Next-Generation Event Processing Platform
SeaTunnel is a distributed, high-performance data integration platform
Lightweight, flexible, expressive statistical data testing library
Open source annotation and labeling tool for image and video assets
Backstage is an open platform for building developer portals
lakeFS - Git-like capabilities for your object storage