version_2

Name                                                    Modified     Size      Downloads / Week
dwh_datavault_mappings_1.xls                            2014-02-06   69.6 kB   1
docu_generated_by_cookbook.zip                          2014-02-01   4.5 MB    1
pdi_dv_framework_code.zip                               2014-02-01   6.8 MB    1
pdi_dv_framework_databases.sql.7z                       2014-02-01   1.4 MB    1
pdi_data_vault_framework_docu.pdf                       2014-02-01   712.2 kB  1
readme.txt                                              2014-02-01   2.2 kB    1
presentation_pdi_data_vault_framework_meetup2012.pdf    2012-10-01   4.1 MB    1
Totals: 7 items                                                      17.6 MB   7
Description of the project

The presentation is the one I gave at the Pentaho Benedutch event in 2011. It describes the functionality of the 'Pentaho Data Integration Datavault Framework' I developed. At the moment the version for MySQL includes the latest developments. The documentation PDF is very recent and complete, because I needed to hand it all over to new colleagues.

Based on a designed Data Vault (hub, link and satellite tables are present) and an Excel sheet with the mappings, no Data Vault ETL development is needed for adding hubs, links, etc. Kasper de Graaf played a big part in the specifications for the tool set: he is a Data Vault expert, I am an ETL designer/developer.

The Virtual Machine (VMware) is a 64-bit Ubuntu 12.04 Server with Percona Server as the database: a MySQL replacement with an improved InnoDB storage engine. The code is the latest and greatest, including link-(group) validity satellite functionality and generic staging of tables and files.

PDI version: Pentaho Data Integration 5.0.1 CE
User: percona
Password: percona
MySQL: root/percona

NB: entries to add/modify in my.cnf:
max_connections = 2048
table_definition_cache = 1200

Starting Kettle

Run the launcher on the Desktop: PDI_kff_launcher.sh
Select the file-based repository (appears as default): pdi_file_repository_dv_demo_kff
The job that 'does it all' (metadata + all Data Vault objects): job_data_vault_all_incl_md

Running a complete batch including staging and the Data Vault:
percona@ubuntu:~$ nohup ./run_job_complete_batch_data_warehouse.sh &

----Attention----
After editing the metadata Excel sheet, be sure to refresh the column 'source_concat' in the sheet 'source_tables'. For some reason this column is sometimes seen as 'null' by Kettle, destroying the joins in the metadata queries that obtain the 'record_source_id'. If you refresh this column by copying the value in the first row to all others, you'll be fine.
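The failure mode described above can be sketched in a few lines of Python. This is an illustration only: the row and lookup data below are hypothetical, not the actual framework tables, and the real join happens inside Kettle's metadata queries. It just shows why a 'source_concat' cell that comes through as null can never match a lookup key, so the affected table silently loses its 'record_source_id'.

```python
# Hypothetical metadata rows as Kettle might read them from the
# 'source_tables' sheet; the second row's source_concat cell came through
# as null instead of its concatenated value.
source_tables = [
    {"source_concat": "crm.customers", "table": "customers"},
    {"source_concat": None,            "table": "orders"},
]

# Hypothetical lookup keyed on source_concat -> record_source_id.
record_sources = {"crm.customers": 1, "crm.orders": 2}

resolved = {}
for row in source_tables:
    # A None key never matches any lookup key, so the join yields nothing.
    resolved[row["table"]] = record_sources.get(row["source_concat"])

print(resolved)  # 'orders' resolves to None instead of 2
```

Copying the first row's value down the column, as the readme advises, simply ensures every cell holds a real (non-null) key again so the lookup can match.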
----Attention number 2----
If you discover errors/bugs in my code, please inform me at eacweber@gmail.com so I can use your collective brains to improve it.

Greetings,
Edwin Weber, owner of the one-man army Weber Solutions.
Source: readme.txt, updated 2014-02-01
