GeoKettle is a powerful, metadata-driven spatial ETL (Extract, Transform and Load) tool dedicated to the integration of different data sources for building and updating geospatial databases, data warehouses and services.
Rapid data integration framework
Java-based data integration framework; can be used to transform, map and manipulate data in various formats (CSV, FIXLEN, XML, XBASE, COBOL, LOTUS, etc.); can be used standalone or embedded (as a library). Connects to RDBMS/JMS/SOAP/LDAP/S3/HTTP/FTP/ZIP/TAR.
PDI Portable is a portable version of Pentaho Data Integration.
PDI Portable is the open source Pentaho Data Integration tool packaged as a portable app, so you can run the full Pentaho Data Integration on your iPod, USB flash drive, portable hard drive, etc. It has all the same features as Pentaho Data Integration, and it leaves no personal information behind on the machine you run it on, so you can take it with you wherever you go.
Free open source ESB tool to connect applications and data resources.
Quick start your SOA project with a free open source ESB tool to connect applications and data resources. Based on extensible open source technology, Open Studio for ESB enables you to service-enable applications and legacy systems to build a powerful service-oriented architecture (SOA). An Eclipse-based tooling environment with pre-built connectors and components, and built-in enterprise integration patterns simplifies integrating business applications, SaaS, web services, and APIs. Open Studio for ESB is fully open source, so you can see the code and extend it. Thousands of developers use Talend Open Studio to integrate easily with any application, database, API, or web services. Embed existing Java code libraries or leverage community components and code to extend your project. With millions of downloads and a full range of robust, open source integration software tools, Talend is an open source leader in cloud and big data integration.
ETL engine based on Groovy
Note: the repository has migrated to https://github.com/ascrus/getl . You can download the jar file from this site or from Maven. GETL is a Groovy-based package that automates loading and transforming data. Its name is an acronym for «Groovy ETL». GETL is a set of libraries of pre-built classes and objects that can be used to extract, transform and load data in programs written in Groovy or Java, as well as from any software that can work with Java classes. GETL was developed with the following ideas and requirements in mind: 1. The simpler the class hierarchy, the easier the solution; 2. Data structures tend to change over time or may not be known in advance, and working with them must still be supported; 3. All routine ETL work should be automated wherever possible; 4. Compiling code on the fly provides speed and leaves room for optimization; 5. A well-designed class hierarchy guarantees easy connection of other open source solutions.
KETL(tm) is a production-ready ETL platform. The engine is built upon an open, multi-threaded, XML-based architecture. KETL is designed to assist in the development and deployment of data integration efforts that require ETL and scheduling.
End-to-end big data in a massively scalable supercomputing platform.
HPCC Systems® (www.hpccsystems.com) from LexisNexis® Risk Solutions is a proven, open source solution for Big Data insights that can be implemented by businesses of all sizes. With HPCC Systems, developers can design applications with Big Data at their core, enabling businesses to better analyze and understand data at scale, improving business time to results and decisions. HPCC Systems offers a consistent data-centric programming language, two processing platforms and a single, complete end-to-end architecture for efficient processing. Read our blog (http://hpccsystems.com/blog ), or connect with us on Twitter (@hpccsystems), Facebook (https://www.facebook.com/hpccsystems ) and LinkedIn (http://www.linkedin.com/company/hpcc-systems). HPCC Systems is available on AWS and can be configured through the Instant Cloud Solution. The download here is a VM.
Utility that performs bulk user import to Active Directory from selected data sources. It can perform data mapping and generate required fields from existing information (for example, generate userPrincipalName from the user's name, surname and patronymic). This is still a beta release, so some things may not work well yet.
Parse, analyze and -- most importantly -- use COBOL data definitions. This gives you access to COBOL data from Python programs. Write data analyzers, one-time data conversion utilities and Python programs that are part of COBOL systems. Really.
ETL based on Perl with a web interface
EplSite ETL is a tool that makes data migrations easy, performing extraction, transformation, validation and load very quickly. It was built by people involved in data migrations, so it contains what is needed to carry out a migration (extraction, transformation, validation and load) and do it well.
Merge/concatenate PDF files into one PDF file
Merge your PDF files for upload to a reporting engine or other needs. Command line, Win32. Written in Python. Compiled with PyInstaller.
This is a Pentaho Data Integration plugin for CiviCRM.
This is a Pentaho Data Integration plugin for CiviCRM. It allows you to take advantage of the power of Pentaho Data Integration tools and use it with your CiviCRM instance.
Migrate/Copy your data between Oracle database and 13 major DBs.
Command line data Copy/Migration tool for Oracle. Supports Oracle 7.3, Oracle 8i, Oracle 9i, Oracle 10G, Oracle 11G and 13 major databases. 1. Exadata 2. Sybase ASE 3. Informix Innovator C 4. Sybase SQL Anywhere 5. DB2 UDB 6. CSV 7. SQLServer 8. MariaDB 9. Sybase IQ 10. PostgreSQL 11. MySQL 12. Informix IDS 13. TimesTen
Misc scripts and utilities related to Oracle Warehouse Builder ETL (Tcl scripts, OWB Expert, project samples, etc.)
PanBI is a collection of analytics modules for existing information systems. For each IS, it provides data extraction, transformation and loading logic coupled with an OLAP schema, delivering OLAP functionality to an unprecedented user base.
A command line utility to read a text file containing lines of data, clean up any CR/LF anomalies, and output the lines of text with clean CR/LF terminators to standard output. The binary is a Windows 32 bit console app.
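The kind of clean-up described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the utility's actual code: the function name `normalize_crlf` is hypothetical, and it assumes that "anomalies" means lone LF, lone CR, or doubled CRs before an LF, each of which is treated as a single line break.

```python
import re
import sys

# Assumption: a line break is a lone CR, a lone LF, or any run of CRs
# immediately followed by an LF (e.g. CR CR LF counts as one break).
_LINE_BREAK = re.compile(rb"\r*\n|\r")


def normalize_crlf(data: bytes) -> bytes:
    """Return data with every line terminated by exactly one CR LF."""
    lines = _LINE_BREAK.split(data)
    # A trailing line break leaves one empty piece at the end; drop it
    # so that no extra blank line is appended.
    if lines and lines[-1] == b"":
        lines.pop()
    return b"".join(line + b"\r\n" for line in lines)


if __name__ == "__main__" and len(sys.argv) == 2:
    # Read the file in binary mode so Python does not translate newlines,
    # then write the cleaned bytes to standard output.
    with open(sys.argv[1], "rb") as f:
        sys.stdout.buffer.write(normalize_crlf(f.read()))
```

Working on bytes rather than text avoids any newline translation by the runtime, which matters on Windows, where text-mode output would otherwise turn each LF into CR LF a second time.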
Diffs, patches, and revision control for CSV files, spreadsheets, and databases.
Data Systems and Antibiotic Systems
A collection of Java programs that use the JavaMail API to help automate the process of sending mail.
Simplified CSV turbo loader to Oracle
#FreeUkraine #StopRussia #SaveUkraine #StopPutin #CrimeaIsUkraine #UnitedForUkraine #RussiaInvadedUkraine Tired of writing control files? No problem! CSV*Loader will generate control file for SQL*Loader. Too slow? No problem! CSV*Loader turbo mode may load it 10x faster to your Oracle database than your good old Perl::DBI script.
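To illustrate what generating a control file involves, here is a minimal Python sketch that builds a SQL*Loader control file from the header row of a CSV file. It is not CSV*Loader's implementation: the function `build_control_file` and its parameters are hypothetical, and it assumes the first CSV row holds column names that match the target table's columns.

```python
import csv
import io


def build_control_file(csv_text: str, data_path: str, table: str) -> str:
    """Build SQL*Loader control file text from a CSV header row.

    Assumption: the first row of csv_text contains the column names,
    and they match the column names of the target table.
    """
    header = next(csv.reader(io.StringIO(csv_text)))
    columns = ",\n  ".join(name.strip().upper() for name in header)
    return (
        "OPTIONS (SKIP=1)\n"  # skip the header row when loading
        "LOAD DATA\n"
        f"INFILE '{data_path}'\n"
        "APPEND\n"
        f"INTO TABLE {table}\n"
        "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'\n"
        "TRAILING NULLCOLS\n"
        "(\n  " + columns + "\n)\n"
    )


# Hypothetical sample data and names, for illustration only.
sample = "id,name,hired\n1,Ann,2020-01-15\n"
print(build_control_file(sample, "emp.csv", "EMP"))
```

The resulting text can be saved to a `.ctl` file and passed to SQL*Loader through its `control=` parameter. Date and number columns usually need explicit format masks, which this sketch leaves out.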
DataMigrator for 14 major databases
Touch and go Windows command line data migration tool for 14 databases: 1. Sybase ASE 2. Informix Innovator C 3. Sybase SQL Anywhere 4. DB2 UDB 5. SQLServer 6. MariaDB 7. Sybase IQ 8. PostgreSQL 9. MySQL 10. Informix IDS 11. TimesTen 12. Oracle 13. SQLite 14. Exadata
Simplified turbo spooler for Oracle.
#SaveUkraine #StopRussia #FreeUkraine #StopPutin #CrimeaIsUkraine #UnitedForUkraine #RussiaInvadedUkraine Exports/Spools scalar data on disk for a given Oracle table. Turbo mode spools 5x faster.
TPL (transfer, parse, load) tool for batch files.
This is an enterprise-strength system for batch file processing, e.g. transferring, parsing and loading data using batch (text) files inside and outside the enterprise. The system is controlled through parameters and doesn't require any programming, code generation or code deployment. This is a heavy-duty back-end system with no GUI. Nonetheless it's very easy to use, easier than most GUI-based ETLs, and even easier to install. It currently supports 4 major databases: Oracle, Sybase, MySQL, MSSQL. The free community edition allows processing of about 10 files a day, depending on the setup. For support and licensing go to www.datastreamprocessor.com
Dorra Data Machine
ETL solution based on Oracle