data processing free download

Dolphin Scheduler

A distributed and extensible workflow scheduler platform

Apache DolphinScheduler is a distributed and extensible workflow scheduler platform with powerful DAG visual interfaces. It is dedicated to solving complex job dependencies in data pipelines and provides a variety of job types out of the box. The architecture is decentralized, with multiple masters and multiple workers; high availability is supported natively, along with overload processing. All process definition operations are visualized, so the key information about a process is visible at a glance, and deployment is one-click. ...

Downloads: 5 This Week

Last Update: 2026-03-01
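
The job-dependency problem mentioned in the description above can be illustrated with a small sketch. This is not DolphinScheduler code or its API; it only shows, using Python's standard `graphlib` module, how a DAG of tasks yields a valid run order. The task names and the `execution_order` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "clean": {"extract"},
    "aggregate": {"clean"},
    "report": {"aggregate", "clean"},
}


def execution_order(deps):
    """Return one valid run order in which every task follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

A real scheduler would also have to handle retries, parallel execution across workers, and failure recovery, none of which this sketch attempts.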


Kestra

Kestra is an infinitely scalable orchestration and scheduling platform

Build reliable workflows, blazingly fast, deploy in just a few clicks. Kestra is an open-source, event-driven orchestrator that simplifies data operations and improves collaboration between engineers and business users. By bringing Infrastructure as Code best practices to data pipelines, Kestra allows you to build reliable workflows and manage them with confidence. Thanks to the declarative YAML interface for defining orchestration logic, everyone who benefits from analytics can participate...

Downloads: 2 This Week

Last Update: 20 hours ago


BitSail

BitSail is a distributed high-performance data integration engine

BitSail is ByteDance's open source data integration engine, built on a distributed architecture for high performance. It supports data synchronization between multiple heterogeneous data sources and provides global data integration solutions for batch, streaming, and incremental scenarios. At present, it serves almost all business lines in ByteDance, such as Douyin and Toutiao, and synchronizes hundreds of trillions of data every day. BitSail has been widely used and...

Downloads: 0 This Week

Last Update: 2023-06-12


apache spark data pipeline osDQ

osDQ sub-project dedicated to creating Apache Spark based data pipelines using JSON

This is an offshoot of the open source data quality (osDQ) project, https://sourceforge.net/projects/dataquality/. This sub-project creates Apache Spark based data pipelines in which JSON-based metadata (a file) drives data processing, data pipeline, data quality, data preparation, and data modeling features for big data. It uses the Java API of Apache Spark.

Downloads: 0 This Week

Last Update: 2019-01-20


Apache Kafka

Mirror of Apache Kafka

...Producers append records to partitions, brokers replicate them for durability, and consumer groups read them at their own pace while balancing work across instances. The commit/offset model and retention policies support patterns from real-time processing to event sourcing and audit trails. Exactly-once processing semantics, idempotent producers, and transactions help prevent duplicates across complex dataflows. Kafka Streams and Kafka Connect extend the core: Streams provides a library for stateful stream processing within applications, while Connect standardizes integration with external systems. ...

Downloads: 8 This Week

Last Update: 2025-09-05
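
The partition and offset model described above can be sketched with a toy in-memory log. This is an illustration only, not the Kafka client API: the `ToyLog` class and its methods are hypothetical, the CRC32 key hash is a stand-in for Kafka's own partitioner, and replication, retention, and transactions are not modeled.

```python
from collections import defaultdict
from zlib import crc32


class ToyLog:
    """In-memory stand-in for a partitioned, append-only log (not the Kafka API)."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]
        # Committed offset per (group, partition): index of the next record to read.
        self.committed = defaultdict(int)

    def append(self, key, value):
        """Append a record to the partition chosen by hashing the key."""
        p = crc32(key.encode()) % len(self.partitions)
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1  # (partition, offset)

    def poll(self, group, partition, max_records=10):
        """Read from the group's committed offset, then advance that offset."""
        start = self.committed[(group, partition)]
        batch = self.partitions[partition][start:start + max_records]
        self.committed[(group, partition)] = start + len(batch)
        return batch


log = ToyLog(num_partitions=2)
for i in range(4):
    log.append("user-a", f"event-{i}")

# Records with the same key land in the same partition, preserving their order.
p, _ = log.append("user-a", "event-4")

# Each consumer group tracks its own position, so groups read at their own pace.
first = log.poll("analytics", p, max_records=2)
second = log.poll("analytics", p, max_records=2)
audit = log.poll("audit", p, max_records=10)
```

Because offsets are stored per group, the hypothetical `audit` group still sees every record from the start even after `analytics` has consumed part of the partition.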


Data Pipeline

A graphical data manipulation and processing system including data import, numerical analysis and visualisation. The software is written in Java and built upon the Netbeans platform to provide a modular desktop data manipulation application.

Downloads: 0 This Week

Last Update: 2014-07-13


Search Results for "data processing"

Showing 6 open source projects for "data processing"


