Profile and monitor your ML data pipeline end-to-end
This is a Java implementation of WhyLogs, with Apache Spark integration for large-scale datasets. Understanding the properties of data as it moves through applications is essential to keeping your ML/AI pipeline stable and improving your user experience, whether the pipeline is built for production or experimentation. WhyLogs is an open source statistical logging library that lets data science and ML teams profile ML/AI pipelines and applications, producing log...
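To make the idea of statistical profiling concrete, here is a minimal sketch of a running per-column summary in plain Java. It does not use the WhyLogs API; the `ColumnProfile` class and its methods are hypothetical names chosen for illustration, and a real profile would likely track far more than these few statistics.

```java
import java.util.Arrays;

// Hypothetical illustration only; NOT part of the WhyLogs API.
public class ColumnProfile {
    private long count;       // non-null values seen
    private long nullCount;   // null values seen
    private double min = Double.POSITIVE_INFINITY;
    private double max = Double.NEGATIVE_INFINITY;
    private double mean;      // running mean of non-null values

    // Fold one observed value into the summary without storing the value itself.
    public void track(Double value) {
        if (value == null) {
            nullCount++;
            return;
        }
        count++;
        min = Math.min(min, value);
        max = Math.max(max, value);
        mean += (value - mean) / count; // incremental mean update
    }

    public long getCount() { return count; }
    public long getNullCount() { return nullCount; }
    public double getMin() { return min; }
    public double getMax() { return max; }
    public double getMean() { return mean; }

    public static void main(String[] args) {
        ColumnProfile profile = new ColumnProfile();
        for (Double v : Arrays.asList(3.0, null, 5.0, 1.0)) {
            profile.track(v);
        }
        System.out.println("count=" + profile.getCount()
                + " nulls=" + profile.getNullCount()
                + " min=" + profile.getMin()
                + " max=" + profile.getMax()
                + " mean=" + profile.getMean());
    }
}
```

Because the summary is updated one value at a time and keeps only a handful of numbers, the same pattern can be applied to a stream of records without holding the dataset in memory.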
Open source Extract Transform Load engine written in Java
ETL Framework is a standalone Extract Transform Load engine written in Java. It includes executables for all major platforms and can be easily integrated into other applications.
Key Features:
* embeddable, open source and free
* fast and scalable
* uses target database features to do transformations and loads
* manual and automatic data mapping
* data streaming
* bulk data loads
* data quality features using SQL, JavaScript and regex
* data transformations
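As an illustration of the regex-based data quality feature listed above, the following is a minimal sketch in plain Java. It does not use ETL Framework's own API; the `RegexQualityCheck` class, the `invalidRows` method, and the simplified email pattern are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; NOT part of ETL Framework's API.
public class RegexQualityCheck {
    // Assumed rule: a deliberately simple "looks like an email" pattern.
    public static final Pattern EMAIL =
            Pattern.compile("^[^@\\s]+@[^@\\s]+\\.[^@\\s]+$");

    // Return the zero-based indexes of values that are null or fail the rule.
    public static List<Integer> invalidRows(List<String> values, Pattern rule) {
        List<Integer> bad = new ArrayList<>();
        for (int i = 0; i < values.size(); i++) {
            String v = values.get(i);
            if (v == null || !rule.matcher(v).matches()) {
                bad.add(i);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        List<String> column = Arrays.asList(
                "a@example.com", "not-an-email", null, "b@example.org");
        System.out.println("Invalid rows: " + invalidRows(column, EMAIL));
    }
}
```

In a real pipeline the rejected rows might be logged, routed to an error table, or cause the load to fail; which of these applies would depend on how the quality rule is configured.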
Requirements
*...