Description of the project
The presentation is the one I gave at the Pentaho Benedutch event in 2011.
It describes the functionality of the 'Pentaho Data Integration Datavault Framework' I developed.
At the moment the version for MySQL includes the latest developments.
The documentation PDF is very recent and complete, because I needed to hand it all over to new colleagues.
Given an already designed Data Vault (hub, link and satellite tables are present) and an Excel sheet with the mappings, no Data Vault ETL development is needed for adding hubs, links, etc.
Kasper de Graaf played a big part in the specifications for the tool set: he is the Data Vault expert, I am the ETL designer/developer.
The Virtual Machine (VMware) is a 64-bit Ubuntu 12.04 Server with Percona Server as the database, a MySQL replacement with an improved InnoDB storage engine.
The code is the latest and greatest, including link-(group) validity satellite functionality and generic staging of tables and files.
PDI version: Pentaho Data Integration 5.0.1 CE
NB: entries to add/modify in my.cnf
max_connections = 2048
table_definition_cache = 1200
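Both settings are server options, so they normally go in the [mysqld] section of my.cnf and are read when the server starts. As a rough sketch, the helper below reports whether both entries are present in a given file. The function name check_mycnf is made up for illustration, and the path to my.cnf is passed as an argument because its location differs per installation. It only matches the underscore spelling used above.

```shell
# Hypothetical helper: report whether both settings are present in a my.cnf file.
# Usage: check_mycnf /path/to/my.cnf
check_mycnf() {
    cnf_file="$1"
    missing=0
    for setting in max_connections table_definition_cache; do
        # Match "name = value" lines; commented-out lines (starting with #) do not match.
        if grep -Eq "^[[:space:]]*${setting}[[:space:]]*=" "$cnf_file"; then
            echo "found:   $(grep -E "^[[:space:]]*${setting}[[:space:]]*=" "$cnf_file" | tail -n 1)"
        else
            echo "missing: ${setting}"
            missing=1
        fi
    done
    # Exit status 0 when both settings were found, 1 otherwise.
    return "$missing"
}
```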
Run the launcher on the Desktop: PDI_kff_launcher.sh
Select the file-based repository (appears as default): pdi_file_repository_dv_demo_kff
The job that 'does it all' (metadata + all Data Vault objects): job_data_vault_all_incl_md
Running a complete batch including staging and the Data Vault:
percona@ubuntu:~$ nohup ./run_job_complete_batch_data_warehouse.sh &
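Started this way, without redirection, nohup normally appends the output of the batch to nohup.out in the current directory. If you would rather have a log file per run, a small wrapper along these lines could be used. This is only a sketch: the function name start_batch and the log file name are made up for illustration and are not part of the framework.

```shell
# Hypothetical wrapper: start a batch script in the background and log its output.
# Usage: start_batch ./run_job_complete_batch_data_warehouse.sh batch.log
start_batch() {
    script="$1"
    logfile="$2"
    # Redirect stdout and stderr explicitly so nothing is written to nohup.out.
    nohup "$script" > "$logfile" 2>&1 &
    # $! holds the process ID of the job that was just started in the background.
    echo "$!"
}
```

The printed process ID can be kept to check on the run later, and tail -f on the chosen log file follows the batch while it is running.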
After editing the metadata Excel sheet, be sure to refresh the column 'source_concat' in the sheet 'source_tables'.
For some reason this column is sometimes seen as 'null' by Kettle, which breaks the joins in the metadata queries that obtain the 'record_source_id'.
If you refresh this column by copying the value in the first row to all others, you'll be fine.
----Attention number 2----
If you discover errors/bugs in my code, please inform me at firstname.lastname@example.org, so I can use your collective brains to improve it.
Edwin Weber, owner of the one-man army Weber Solutions.