Rapid data integration framework
Java-based data integration framework; can be used to transform, map, and manipulate data in various formats (CSV, FIXLEN, XML, XBASE, COBOL, LOTUS, etc.); can be used standalone or embedded (as a library). Connects to RDBMS/JMS/SOAP/LDAP/S3/HTTP/FTP/ZIP/TAR.
GeoKettle is a powerful, metadata-driven spatial ETL (Extract, Transform and Load) tool dedicated to the integration of different data sources for building and updating geospatial databases, data warehouses and services.
Free open source ESB tool to connect applications and data resources.
Quick start your SOA project with a free open source ESB tool to connect applications and data resources. Based on extensible open source technology, Open Studio for ESB enables you to service-enable applications and legacy systems to build a powerful service-oriented architecture (SOA). An Eclipse-based tooling environment with pre-built connectors and components, and built-in enterprise integration patterns simplifies integrating business applications, SaaS, web services, and APIs. Open Studio for ESB is fully open source, so you can see the code and extend it. Thousands of developers use Talend Open Studio to integrate easily with any application, database, API, or web services. Embed existing Java code libraries or leverage community components and code to extend your project. With millions of downloads and a full range of robust, open source integration software tools, Talend is an open source leader in cloud and big data integration.
KETL(tm) is a production-ready ETL platform. The engine is built upon an open, multi-threaded, XML-based architecture. KETL is designed to assist in the development and deployment of data integration efforts that require ETL and scheduling.
ETL engine based on Groovy
Note: the repository has moved to https://github.com/ascrus/getl. The JAR file can be downloaded from that site or from Maven. GETL is a Groovy-based package that automates the work of loading and transforming data. Its name is an acronym for «Groovy ETL». GETL is a set of libraries of pre-built classes and objects that can be used to extract, transform and load data in programs written in Groovy or Java, as well as from any software that can work with Java classes. GETL was developed with the following ideas and requirements in mind: 1. The simpler the class hierarchy, the easier the solution; 2. Data structures tend to change over time or may not be known in advance, and working with them must still be supported; 3. All routine ETL work should be automated wherever possible; 4. Compiling code on the fly ensures speed and leaves room for optimization; 5. A well-thought-out class hierarchy guarantees easy integration with other open source solutions.
A utility that uses the Informatica Operations API
A Java utility that uses the Informatica Operations API to accept parameter inputs, trap suspended workflows, and send an email on failure. This utility extends the functionality of the pmcmd startworkflow and starttask commands. If you pass in both a parameter file and individual parameters on the command line, a temporary parameter file is created that contains the values from the parameter file followed by the individual parameters. The email is sent in HTML format using tables and looks similar to the workflow monitor output.
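The parameter-file merge described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the utility's actual code: the class name `ParamFileMerge`, the method `merge`, the temp-file prefix and suffix, and the one-parameter-per-line layout are all hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: not part of the utility described above.
public class ParamFileMerge {

    // Copies every line of baseFile into a new temporary file, then appends
    // each command-line parameter (assumed to be "name=value") on its own line.
    static Path merge(Path baseFile, List<String> extraParams) throws IOException {
        List<String> lines =
                new ArrayList<>(Files.readAllLines(baseFile, StandardCharsets.UTF_8));
        lines.addAll(extraParams);

        // The temporary file is what would be handed to the workflow start call.
        Path temp = Files.createTempFile("merged-params", ".par");
        Files.write(temp, lines, StandardCharsets.UTF_8);
        return temp;
    }
}
```

Appending the individual parameters after the file contents keeps the original file untouched and makes the order of the merged values easy to inspect.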
Utility that performs bulk user import to Active Directory from selected data sources. It can perform data mapping and generate required fields from existing information (for example, generating userPrincipalName from a user's name, surname and patronymic). This is still a beta release, so some things may not work reliably.
Java tools for decoding and manipulating BER encoded ASN.1 Files
A tool for easy manipulation of BER encoded files. An "awk" for ASN.1 BER (for Unix people) or maybe a "notepad" for ASN.1 BER (for Windows people). JBerd (Java BER decoder) is a lightweight BER decoder and associated tools for interpreting and processing BER encoded ASN.1 files. The following facilities are provided: • JBerd Profiler. A tool for profiling the contents of BER encoded files. • JBerd Flattener. A tool for converting BER encoded files to flat files for processing by other facilities. • JBerd Decoder objects. A set of Java facilities for writing applications that require BER decoding. Go to the "files" section (link at the top of this page) to download a PDF of detailed documentation. Andrew Forsyth
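For readers unfamiliar with BER, the sketch below shows what decoding the header of one tag-length-value element involves. It is not JBerd's API: the class `BerHeader` and its fields are hypothetical, and the sketch deliberately handles only single-byte tags and definite lengths of up to three length octets.

```java
// Hypothetical illustration of BER header decoding; not taken from JBerd.
public class BerHeader {
    final int tagClass;        // 0 = universal, 1 = application, 2 = context-specific, 3 = private
    final boolean constructed; // true if the value is itself a series of BER elements
    final int tagNumber;       // low-tag-number form only (0..30)
    final int length;          // number of content octets
    final int headerSize;      // octets used by the identifier and length fields

    private BerHeader(int tagClass, boolean constructed, int tagNumber,
                      int length, int headerSize) {
        this.tagClass = tagClass;
        this.constructed = constructed;
        this.tagNumber = tagNumber;
        this.length = length;
        this.headerSize = headerSize;
    }

    static BerHeader parse(byte[] data, int offset) {
        int id = data[offset] & 0xFF;
        int tagNumber = id & 0x1F;
        if (tagNumber == 0x1F) {
            throw new IllegalArgumentException("high-tag-number form not handled in this sketch");
        }
        int pos = offset + 1;
        int first = data[pos++] & 0xFF;
        int length;
        if (first < 0x80) {
            length = first;                       // short form: the octet is the length
        } else if (first == 0x80) {
            throw new IllegalArgumentException("indefinite length not handled in this sketch");
        } else {
            int count = first & 0x7F;             // long form: number of length octets that follow
            if (count > 3) {
                throw new IllegalArgumentException("length too large for this sketch");
            }
            length = 0;
            for (int i = 0; i < count; i++) {
                length = (length << 8) | (data[pos++] & 0xFF);
            }
        }
        return new BerHeader(id >> 6, (id & 0x20) != 0, tagNumber, length, pos - offset);
    }
}
```

A flattener or profiler would call such a routine repeatedly, skipping `headerSize + length` octets for primitive elements and descending into the content of constructed ones.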
XIForge is a team of IT volunteers exploring new free and open source technology frameworks and platforms. We focus on Pentaho and OpenBravo ERP. Our currently hosted projects include the Pentaho Data Integration Parse JSON String plugin. The team founder is Reid Lai.
webStraktor is a programmable World Wide Web data extraction client. Its purpose is to scrape HTML-based content via the HTTP protocol and extract relevant information. webStraktor features a scripting language to facilitate the collection, extraction and storage of information available on the web, including images. The scripting language uses elements of regular expression and XPath syntax. It has a small instruction set and its syntax is easy to master. The standard webStraktor output format is XML-based, in either the ASCII, UTF-8 or ISO-8859-1 (Latin1) code page. webStraktor relies on the Apache HttpClient for retrieving content via the HTTP protocol. It adheres to the Robots Exclusion Protocol and can be configured to operate anonymously by connecting through the predominant types of web proxy servers. webStraktor extends the functionality of web crawlers, spiders or bots by integrating scraping and crawling capabilities.
A collection of Java programs that use the JavaMail API to help automate the process of sending email.
This is a Pentaho Data Integration plugin for CiviCRM.
This is a Pentaho Data Integration plugin for CiviCRM. It allows you to take advantage of the power of Pentaho Data Integration tools and use it with your CiviCRM instance.
Automate Informatica control file creation
Createinfactl is a Java utility that enables administrators to fully automate Informatica deployments from the command line by creating the deployment group control XML file to be used with the pmrep command “deploydeploymentgroup”. Default settings for the control file can be overridden at the command line, and the utility works with both static and dynamic deployment groups in the repository. Please review the “Using the Deployment Control File” section in the Informatica Command Reference guide for further help on the deployment control file. This utility supports Informatica PowerCenter versions 8.6.1 onwards and Java 1.6. It also contains JDBC drivers for DB2, SQL Server, Oracle, and Teradata databases.
Java utility that reads the metadata from table(s)
Dbmetadata is a Java utility that reads the metadata from table(s) in a specified database and creates the Informatica XML to import into the repository. I created this utility when we were migrating to a new platform and needed a quick way to create flatfile and relational sources and targets that matched the DDL of the table. I also needed to use shortcuts. If you use the import table list, it will create one XML file with all of the tables and shortcuts (if a shortcut folder is specified) for the requested output type and database/file type.
Jipes provides open source Java APIs deeply integrated into the Oracle RDBMS, including an Ant task for building and exporting database objects. A Java Data Cartridge replacing database links is also in progress.
The aDORe Federation is a standards-based federated repository framework and reference implementation which aims to address many of the scalability issues experienced by large scale digital object repositories.
Better SQL in Java! Offers seamless Java class mapping and an SQL-like domain-specific language implemented for a number of commercial and open-source DBMSs.
Document Management System
Document Management System created using JEE6
Query, integrate and manipulate data using natural languages.
iLastic is an open-source framework to query, integrate and manipulate any type of data in English. Extract, transform and merge information from the web, databases, files or any other data repository using a language you already know... English
Prototype of an ETL tool for extracting, transforming and loading data into a dimensional data model for a data warehouse or another target
Peer to peer data integration tool