Free open source ETL software for data integration anywhere.
Expand your open source stack with a free open source ETL tool for data integration and data transformation anywhere. Work with the latest cloud applications and platforms or traditional databases and applications using Open Studio for Data Integration to design and deploy quickly with graphical tools, native code generation, and 100s of pre-built components and connectors. Open Studio for Data Integration is fully open source, so you can see the code and work with it. Embed existing Java code libraries, create your own components or leverage community components and code to extend your project. Millions of downloads and a full range of robust, open source integration software tools have made Talend the open source leader in cloud and big data integration.
A high performance, open source, data replication engine for MySQL
Tungsten Replicator is a high performance, free and open source replication engine that supports a variety of extractor and applier modules. Data can be extracted from MySQL, Oracle and Amazon RDS, and applied to numerous transactional stores and datawarehouse stores (MySQL, Oracle, and Amazon RDS; NoSQL stores such as MongoDB; Vertica, Hadoop, and Amazon RDS). Tungsten Replicator helps technically focused users solve a host of problems and offers features that surpass those of most other open source replicators. During replication, Tungsten Replicator allows data to be exchanged between different databases and database versions, information can be filtered and modified, and deployment can span on-premise and cloud-based databases. It supports parallel replication and advanced topologies such as fan-in and multi-master. It can also be used efficiently in cross-site deployments.
World's first open source data quality & data preparation project
This project is dedicated to open source data quality and data preparation solutions. Data Quality includes profiling, filtering, governance, similarity check, data enrichment alteration, real time alerting, basket analysis, bubble chart, warehouse validation, single customer view, etc., defined by Strategy. The project is developing a high performance integrated data management platform that will seamlessly do Data Integration, Data Profiling, Data Quality, Data Preparation, Dummy Data Creation, Meta Data Discovery, Anomaly Discovery, Data Cleansing, Reporting and Analytics. It also has Hadoop (big data) support to move files to/from a Hadoop grid and to create, load and profile Hive tables. This project is also known as "Aggregate Profiler". A RESTful API for this project (beta version) is being built at https://sourceforge.net/projects/restful-api-for-osdq/ and Apache Spark based data quality is being built at https://sourceforge.net/projects/apache-spark-osdq/
PDI Portable is a portable version of Pentaho Data Integration.
PDI Portable is Pentaho Data Integration packaged as a portable app, so you can run the full Pentaho Data Integration from your iPod, USB flash drive, portable hard drive, etc. It has all the same features as Pentaho Data Integration, and it leaves no personal information behind on the machine you run it on, so you can take it with you wherever you go.
Free open source ESB tool to connect applications and data resources.
Quick start your SOA project with a free open source ESB tool to connect applications and data resources. Based on extensible open source technology, Open Studio for ESB enables you to service-enable applications and legacy systems to build a powerful service-oriented architecture (SOA). An Eclipse-based tooling environment with pre-built connectors and components, and built-in enterprise integration patterns simplifies integrating business applications, SaaS, web services, and APIs. Open Studio for ESB is fully open source, so you can see the code and extend it. Thousands of developers use Talend Open Studio to integrate easily with any application, database, API, or web services. Embed existing Java code libraries or leverage community components and code to extend your project. With millions of downloads and a full range of robust, open source integration software tools, Talend is an open source leader in cloud and big data integration.
A free data prep tool for discovery, cleansing, and blending data.
Talend Data Preparation Free Desktop makes it easy to get clean, useful data from almost anywhere into almost any business or cloud application. Download a free Windows or Mac version to get clean, useful data from Excel and CSV files into Salesforce, Marketo, Tableau … almost any business or cloud application. Talend Data Prep makes it easy for data scientists or data users to discover and cleanse data, which is the first step in a data governance process. Data Prep Free Desktop uses point-and-click visual tools and smart guides to help fix, shape, and blend data to make it actionable. Any data cleansing operation can be reused and revised, making it easy to generate regular, consistent reports on data. Millions of downloads and a full range of robust, open source integration software tools have made Talend the open source leader in cloud and big data integration.
End-to-end big data in a massively scalable supercomputing platform.
HPCC Systems® (www.hpccsystems.com) from LexisNexis® Risk Solutions is a proven, open source solution for Big Data insights that can be implemented by businesses of all sizes. With HPCC Systems, developers can design applications with Big Data at their core, enabling businesses to better analyze and understand data at scale, improving business time to results and decisions. HPCC Systems offers a consistent data-centric programming language, two processing platforms and a single, complete end-to-end architecture for efficient processing. Read our blog (http://hpccsystems.com/blog), or connect with us on Twitter (@hpccsystems), Facebook (https://www.facebook.com/hpccsystems) and LinkedIn (http://www.linkedin.com/company/hpcc-systems). HPCC Systems is available on AWS and can be configured through the Instant Cloud Solution. The download here is a VM.
OGSA-DAI is a product that allows data resources, such as file collections, relational or XML databases, to be accessed, integrated and federated across the Internet.
Applications for data management
"Information is data in action", and, consequently, having good quality data is essential. The AESTEL package contains two highly configurable applications for data management: a data loader (DataLoader) and a reporting application (AEREA). The data loader application applies user-defined instructions to validate, process and load data. The reporting application provides a query builder and spreadsheet template designer. Both applications work with any relational data model (Postgres and Oracle have been tested). The two applications were initially developed for small molecule drug discovery research. However, they can be extended for use in other data domains.
Jipes provides open source Java APIs deeply integrated into the Oracle RDBMS, including an Ant task for building and exporting database objects. A Java Data Cartridge replacing database links is also in progress.
osDQ project dedicated to creating Apache Spark based data quality
This is an offshoot of the open source data quality (osDQ) project (https://sourceforge.net/projects/dataquality/). This subproject will create Apache Spark based data quality and data preparation features for big data. It uses the Java API of Apache Spark.
Baraah Financial Holdings, Incorporated provides loans to individuals to start a business. We require a Request for Funding to be filed; such a document is expected to include a detailed, stage-oriented business plan.
Simplified CSV turbo loader to Oracle
#FreeUkraine #StopRussia #SaveUkraine #StopPutin #CrimeaIsUkraine #UnitedForUkraine #RussiaInvadedUkraine Tired of writing control files? No problem! CSV*Loader will generate control file for SQL*Loader. Too slow? No problem! CSV*Loader turbo mode may load it 10x faster to your Oracle database than your good old Perl::DBI script.
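For readers unfamiliar with SQL*Loader control files, the sketch below shows roughly what generating one from a CSV header involves. It is not CSV*Loader's actual implementation: the function name `make_control_file`, the table name `EMPLOYEES`, the file name `employees.csv`, and the column-name normalization are all illustrative assumptions. Only the control-file keywords themselves are standard SQL*Loader syntax.

```python
import csv
import io


def make_control_file(csv_text: str, table: str, infile: str) -> str:
    """Build a SQL*Loader control file from the header row of a CSV.

    Hypothetical helper for illustration only; not part of CSV*Loader.
    """
    # Read only the first row, which is assumed to hold the column names.
    header = next(csv.reader(io.StringIO(csv_text)))
    # Assumption: upper-case the names and replace spaces with underscores
    # so they can serve as plain Oracle identifiers.
    columns = [name.strip().upper().replace(" ", "_") for name in header]
    column_list = ",\n".join(f"  {col}" for col in columns)
    return (
        "OPTIONS (SKIP=1)\n"  # skip the header row when loading
        "LOAD DATA\n"
        f"INFILE '{infile}'\n"
        "APPEND\n"
        f"INTO TABLE {table}\n"
        "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'\n"
        "TRAILING NULLCOLS\n"
        "(\n"
        f"{column_list}\n"
        ")\n"
    )


# Made-up sample data, used only to exercise the helper.
sample = "id,first name,salary\n1,Ann,100\n"
control = make_control_file(sample, "EMPLOYEES", "employees.csv")
print(control)
```

The resulting text could be saved to a `.ctl` file and passed to the `sqlldr` command line tool. All columns are left as default character fields here; a real generator would likely also need to handle dates, numeric formats, and delimiters other than commas.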
ChronoDB is a data manager for generic time series.
The project has been renamed CrNiCKL (chronicle) and moved to http://crnickl.sourceforge.net. ChronoDB is a data manager written in Java. It supports time series of any type. With its simple and powerful schema subsystem it takes charge of very large heterogeneous data sets. The software consists of an API and a generic implementation layer running on top of an SQL or a NoSQL system. Applications can mix objects from multiple ChronoDB databases.
CrNiCKL (chronicle) is a Java database for time series
CrNiCKL (pronounced "chronicle") is a data manager written in Java that handles large sets of heterogeneous time series. A simple schema system allows you to configure value types and time domains. CrNiCKL runs on top of SQL or NoSQL databases. Drivers for JDBC and MongoDB are available.
The DMAIG Management System's goal is to organize content and increase the efficiency of an organization looking for Xaraya modules that allow a small non-technical group to create, maintain, and utilize information on a large user set.
DataMigrator for 14 major databases
Touch and go Windows command line data migration tool for 14 databases: 1. Sybase ASE 2. Informix Innovator C 3. Sybase SQL Anywhere 4. DB2 UDB 5. SQLServer 6. MariaDB 7. Sybase IQ 8. PostgreSQL 9. MySQL 10. Informix IDS 11. TimesTen 12. Oracle 13. SQLite 14. Exadata
Migrate/Copy your data between Oracle database and 13 major DBs.
Command line data Copy/Migration tool for Oracle. Supports Oracle 7.3, Oracle 8i, Oracle 9i, Oracle 10G, Oracle 11G and 13 major databases. 1. Exadata 2. Sybase ASE 3. Informix Innovator C 4. Sybase SQL Anywhere 5. DB2 UDB 6. CSV 7. SQLServer 8. MariaDB 9. Sybase IQ 10. PostgreSQL 11. MySQL 12. Informix IDS 13. TimesTen
Simplified turbo spooler for Oracle.
#SaveUkraine #StopRussia #FreeUkraine #StopPutin #CrimeaIsUkraine #UnitedForUkraine #RussiaInvadedUkraine Exports/Spools scalar data on disk for a given Oracle table. Turbo mode spools 5x faster.
This beta open source program is for manufacturing companies. It runs on an Apache web server with PHP and a MySQL database.
Long name: Operational Data Business Express. ODBExpress is a reporting suite for business intelligence; it includes reporting, analysis (OLAP), etc.
Modest collection of OMBPlus scripts for Oracle Warehouse Builder. Utilities for creating repositories and targets, staging source tables, generating surrogate keys, etc.
A RESTful/JSON Web Service for text and metadata extraction
An open source RESTful Web Service for text and metadata extraction and analysis. oss-text-extractor supports various binary formats: Word processor (doc, docx, odt, rtf), Spreadsheet (xls, xlsx, ods), Presentation (ppt, pptx, odp), Publishing (pdf, pub), Web (rss, html/xhtml), Media (audio, images), Others (vsd, text).
Merge/concatenate PDF files into one PDF file
Merge your PDF files for upload to a reporting engine or for other needs. Command line tool for win32. Written in Python and compiled with PyInstaller.