osDQ: Apache Spark based data pipelines driven by JSON
This is an offshoot of the open source data quality (osDQ) project: https://sourceforge.net/projects/dataquality/
This sub-project creates Apache Spark based data pipelines in which a JSON metadata file drives the data processing, data pipeline, data quality, data preparation, and data modeling features for big data. It uses the Java API of Apache Spark.
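The metadata-driven idea can be illustrated with a minimal sketch in plain Java. It deliberately uses neither Spark nor a JSON parser so that it runs standalone. The class name `PipelineSketch` and the step names `trim`, `drop_empty`, and `dedupe` are hypothetical and are not osDQ's actual metadata format; in the real project the step list would presumably be read from the JSON file and each step would operate on a Spark dataset rather than an in-memory list.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;

public class PipelineSketch {

    // Hypothetical registry: maps a step name found in the metadata to an operation.
    private static final Map<String, UnaryOperator<List<String>>> STEPS = Map.of(
        "trim", rows -> rows.stream().map(String::trim).collect(Collectors.toList()),
        "drop_empty", rows -> rows.stream().filter(r -> !r.isEmpty()).collect(Collectors.toList()),
        "dedupe", rows -> new ArrayList<>(new LinkedHashSet<>(rows))
    );

    // Applies the named steps to the input rows, in the order the metadata lists them.
    public static List<String> run(List<String> stepNames, List<String> input) {
        List<String> rows = input;
        for (String name : stepNames) {
            UnaryOperator<List<String>> step = STEPS.get(name);
            if (step == null) {
                throw new IllegalArgumentException("Unknown step: " + name);
            }
            rows = step.apply(rows);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Stand-in for parsed metadata such as {"steps": ["trim", "drop_empty", "dedupe"]} (assumed shape).
        List<String> steps = List.of("trim", "drop_empty", "dedupe");
        List<String> input = List.of(" a ", "", "b", "a");
        System.out.println(run(steps, input));
    }
}
```

The point of the design is that changing the pipeline means editing the metadata (the list of step names), not the Java code; new capabilities are added by registering another named step.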
giServer: the easy-to-use and extensible batch and integration server
The giServer is an easy-to-use integration server for process automation and event-driven or scheduled execution of batch jobs.
Instead of relying on complex XML configuration files, it includes an elaborate GUI for batch job management.
Some possible usage scenarios are:
- Automatic processing of incoming data files
- Big Data applications
- Process automation
- Data Mining/Aggregation applications
- Automatic Reporting
- Processing and analysis of database records