osDQ sub-project for building Apache Spark based data pipelines using JSON
This is an offshoot of the open source data quality (osDQ) project: https://sourceforge.net/projects/dataquality/
This sub-project creates an Apache Spark based data pipeline in which JSON-based metadata (a file) drives the data processing, data pipeline, data quality, data preparation, and data modeling features for big data. It uses the Java API of Apache Spark and can also run in local mode.
JSON examples are available at https://github.com/arrahtech/osdq-spark
How to run
Unzip the zip file
Windows: java -cp .\lib\*;osdq-spark-0.0.1.jar org.arrah.framework.spark.run.TransformRunner -c ....
SplitPDF (SplitPDF.jar) is a command-line driven Java program that splits a PDF file by its bookmarks into separate PDFs. Each bookmark is used as the title of the newly created PDF. Extremely useful and fast in a batch-processing environment.
A generic SQL-driven data audit tool for detecting differences between any JDBC-accessible database tables and other data sources. Platform independent. It is a Unix-like diff for databases. Produces key values with the differing column name and data.
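To illustrate the idea of a keyed diff that reports the differing column name and data, here is a minimal Java sketch. It is not the tool's actual implementation: the class name `TableDiff`, the method `diff`, and the report format are all hypothetical, and the rows are assumed to be already loaded into in-memory maps (the JDBC reading step is left out).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical illustration only; names and output format are assumptions.
public class TableDiff {

    /**
     * Compares two tables, each given as key value -> (column name -> value).
     * Reports keys missing on either side and, for shared keys, every column
     * of the left row whose value differs on the right.
     * Columns that exist only in the right row are not examined in this sketch.
     */
    public static List<String> diff(Map<String, Map<String, Object>> left,
                                    Map<String, Map<String, Object>> right) {
        List<String> report = new ArrayList<>();
        for (Map.Entry<String, Map<String, Object>> row : left.entrySet()) {
            String key = row.getKey();
            Map<String, Object> other = right.get(key);
            if (other == null) {
                report.add(key + ": missing on right");
                continue;
            }
            for (Map.Entry<String, Object> col : row.getValue().entrySet()) {
                Object rightValue = other.get(col.getKey());
                if (!Objects.equals(col.getValue(), rightValue)) {
                    report.add(key + ": " + col.getKey() + " "
                            + col.getValue() + " != " + rightValue);
                }
            }
        }
        // Keys that appear only in the right table.
        for (String key : right.keySet()) {
            if (!left.containsKey(key)) {
                report.add(key + ": missing on left");
            }
        }
        return report;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Object>> a = new LinkedHashMap<>();
        Map<String, Map<String, Object>> b = new LinkedHashMap<>();
        a.put("1", Map.of("name", "Ann"));
        a.put("2", Map.of("name", "Bob"));
        b.put("1", Map.of("name", "Ann"));
        b.put("2", Map.of("name", "Rob"));
        b.put("3", Map.of("name", "Cy"));
        // Prints one line per difference found.
        diff(a, b).forEach(System.out::println);
    }
}
```

Keying both sides by the same value is what makes the comparison independent of row order, which is the main difference from a line-based text diff.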