Kestra is an infinitely scalable orchestration and scheduling platform
A distributed and extensible workflow scheduler platform
SeaTunnel is a distributed, high-performance data integration platform
Backstage is an open platform for building developer portals
StarRocks is a next-gen sub-second MPP database for full analytics
Privacy- and security-focused Segment alternative, in Golang
Python module that helps you build complex pipelines of batch jobs
lakeFS - Git-like capabilities for your object storage
A ranked list of awesome Python open-source libraries
Open-source data observability for analytics engineers
A lightweight stream processing library for Go
The open standard for data logging
Lightweight, flexible, expressive statistical data testing library
Build, run, and manage data pipelines for integrating data
A fast script language for Go
Producer and consumer actors with back-pressure for Elixir
Next-Generation Event Processing Platform
Automated Tool for Optimized Modelling
Making DAG construction easier
AutoGluon: AutoML for Image, Text, and Tabular Data
Conduit streams data between data stores. Kafka Connect replacement
Pentaho offers a comprehensive data integration and analytics platform.
Code review for data in dbt
Real-time, incremental ETL library for ML with record-level depend
Open source annotation and labeling tool for image and video assets