This is an offshoot of the open source data quality (osDQ) project: https://sourceforge.net/projects/dataquality/
This subproject creates an Apache Spark based data pipeline in which JSON-based metadata (a file) is used to run the data processing, data pipeline, data quality, data preparation, and data modeling features for big data. It uses the Java API of Apache Spark.
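
The idea of a metadata-driven pipeline can be illustrated with a minimal sketch. It is not the project's actual code or metadata format. It assumes the JSON file has already been parsed into a list of step descriptors, and it uses an in-memory list of rows as a stand-in for a Spark Dataset so that it runs without any dependencies. The class name MetadataPipelineSketch, the keys "type" and "column", and the step types "dropNull" and "trim" are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MetadataPipelineSketch {

    // Hypothetical step descriptor. In a real pipeline these entries would be
    // read from the JSON metadata file rather than built in code.
    static Map<String, String> step(String type, String column) {
        Map<String, String> s = new HashMap<>();
        s.put("type", type);
        s.put("column", column);
        return s;
    }

    // Applies each step, in order, to an in-memory list of rows.
    // The list of rows is only a stand-in for a Spark Dataset.
    static List<Map<String, String>> run(List<Map<String, String>> steps,
                                         List<Map<String, String>> rows) {
        List<Map<String, String>> current = new ArrayList<>(rows);
        for (Map<String, String> s : steps) {
            String type = s.get("type");
            String column = s.get("column");
            // Reject step types this sketch does not know about.
            if (!"dropNull".equals(type) && !"trim".equals(type)) {
                throw new IllegalArgumentException("Unknown step type: " + type);
            }
            List<Map<String, String>> next = new ArrayList<>();
            for (Map<String, String> row : current) {
                String value = row.get(column);
                if ("dropNull".equals(type)) {
                    // Data quality step: keep only rows where the column has a value.
                    if (value != null) {
                        next.add(row);
                    }
                } else {
                    // Data preparation step: trim whitespace, leaving the input row untouched.
                    Map<String, String> copy = new HashMap<>(row);
                    if (value != null) {
                        copy.put(column, value.trim());
                    }
                    next.add(copy);
                }
            }
            current = next;
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, String> first = new HashMap<>();
        first.put("name", "  Ann ");
        Map<String, String> second = new HashMap<>();
        second.put("name", null);

        List<Map<String, String>> steps =
                Arrays.asList(step("dropNull", "name"), step("trim", "name"));
        List<Map<String, String>> result = run(steps, Arrays.asList(first, second));

        System.out.println("Rows kept: " + result.size());
        System.out.println("First name: [" + result.get(0).get("name") + "]");
    }
}
```

The point of the design is that the list of steps is data, not code: changing the metadata changes what the pipeline does without recompiling. In a Spark version of this sketch, each step type would presumably map to a Dataset transformation instead of a loop over rows.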