Python scripts for ETL (extract, transform and load) jobs for Ethereum
Pandas on AWS, easy integration with Athena, Glue, Redshift, etc.
Centralize, transform and stash your data
Library providing end-to-end GPU-accelerated recommender systems
ETL framework to index data for AI, such as RAG
Python ETL framework for stream processing, real-time analytics, LLM
Concurrent Python made simple
lakeFS - Git-like capabilities for your object storage
Easy integration with Athena, Glue, Redshift, Timestream, Neptune
Real-time, incremental ETL library for ML with record-level depend
A free, open-source, and cross-platform big data analytics framework
Pentaho offers a comprehensive data integration and analytics platform.
Search and replace in files or piped input
Text-based user interface to query data on Oracle DB in a smart way
Generalized Interoperability and Strong AI
A lightweight opinionated ETL framework, halfway between plain scripts
ETL engine based on Groovy
NBi is a testing framework (add-on to NUnit)
Streaming reactive and dataflow graphs in Python
Lightweight library to write, orchestrate and test your SQL ETL
Sync data between persistence engines, like ETL, only not stodgy
Simple message-based, web-based ETL integration
Design, automate, operate and publish data pipelines at scale
osDQ, dedicated to creating Apache Spark-based data pipelines using JSON