Conduit streams data between data stores. Kafka Connect replacement
StarRocks is a next-gen sub-second MPP database for full analytics
Kestra is an infinitely scalable orchestration and scheduling platform
Producer and consumer actors with back-pressure for Elixir
A distributed and extensible workflow scheduler platform
lakeFS - Git-like capabilities for your object storage
Pythonic tool for running machine-learning/high-performance workflows
A lightweight stream processing library for Go
A fast script language for Go
Open-source data observability for analytics engineers
The open standard for data logging
Next-Generation Event Processing Platform
SeaTunnel is a distributed, high-performance data integration platform
Backstage is an open platform for building developer portals
A ranked list of awesome Python open-source libraries
Automated Tool for Optimized Modelling
Making DAG construction easier
Light-weight, flexible, expressive statistical data testing library
Privacy- and security-focused Segment alternative, in Golang
Build, run, and manage data pipelines for integrating data
AutoGluon: AutoML for Image, Text, and Tabular Data
Python module that helps you build complex pipelines of batch jobs
Open Source Data Orchestration for the Cloud
Pentaho offers a comprehensive data integration and analytics platform.
Real-time, incremental ETL library for ML with record-level depend