Data Integration Tools for Linux

Browse free open source Data Integration tools and projects for Linux below.

  • 1
    Pentaho from Hitachi Vantara

    End to end data integration and analytics platform

    Pentaho Community Edition can now be downloaded from https://www.hitachivantara.com/en-us/products/pentaho-platform/data-integration-analytics/pentaho-community-edition.html
    Join the Community at https://community.hitachivantara.com/communities/community-pentaho-home?CommunityKey=e0eaa1d8-5ecc-4721-a6a7-75d4e890ee0
    Pentaho couples data integration with business analytics in a modern platform to easily access, visualize and explore data that impacts business results. Use it as a full suite or as individual components that are accessible on-premise, in the cloud, or on-the-go (mobile). Pentaho Kettle enables IT and developers to access and integrate data from any source and deliver it to your applications all from within an intuitive and easy to use graphical tool.
    The Pentaho Enterprise Edition Trialware can be obtained from https://www.hitachivantara.com/en-us/products/lumada-dataops/data-integration-analytics/download-pentaho.html
    Downloads: 1,194 This Week
  • 2
    Pentaho Data Integration

    Pentaho Data Integration (ETL), a.k.a. Kettle

    Pentaho Data Integration uses the Maven framework. The project distribution archive is produced under the assemblies module. The other modules cover the core implementation, database dialog, user interface, PDI engine, PDI engine extensions, PDI core plugins, and integration tests. Maven 3+ and Java JDK 1.8 are prerequisites. Using the Pentaho checkstyle format (run mvn checkstyle:check and review the report) and writing working unit tests help ensure that pull requests for bugs and improvements are processed quickly. In addition to the unit tests, there are integration tests that test cross-module operation.
    Downloads: 73 This Week
  • 3
    Open Source Data Quality and Profiling

    World's first open source data quality & data preparation project

    This project is dedicated to open source data quality and data preparation solutions. Data Quality includes profiling, filtering, governance, similarity check, data enrichment alteration, real time alerting, basket analysis, bubble chart Warehouse validation, single customer view etc. defined by Strategy. The project is developing a high-performance integrated data management platform that will seamlessly handle Data Integration, Data Profiling, Data Quality, Data Preparation, Dummy Data Creation, Meta Data Discovery, Anomaly Discovery, Data Cleansing, Reporting and Analytics. It also has Hadoop (big data) support to move files to/from a Hadoop Grid and to create, load and profile Hive tables. This project is also known as "Aggregate Profiler". A RESTful API for this project is being built (beta version) at https://sourceforge.net/projects/restful-api-for-osdq/ and Apache Spark based data quality is being built at https://sourceforge.net/projects/apache-spark-osdq/
    Downloads: 42 This Week
  • 4
    Airbyte

    Data integration platform for ELT pipelines from APIs, databases

    We believe that only an open-source solution to data movement can cover the long tail of data sources while empowering data engineers to customize existing connectors. Our ultimate vision is to help you move data from any source to any destination. Airbyte already provides the largest catalog of 300+ connectors for APIs, databases, data warehouses, and data lakes. Moving critical data with Airbyte is as easy and reliable as flipping on a switch. Our teams process more than 300 billion rows each month for ambitious businesses of all sizes. Enable your data engineering teams to focus on projects that are more valuable to your business. Building and maintaining custom connectors have become 5x easier with Airbyte. With an average response rate of 10 minutes or less and a Customer Satisfaction score of 96/100, our team is ready to support your data integration journey all over the world.
    Downloads: 6 This Week
  • 5
    Apache Hudi

    Upserts, Deletes And Incremental Processing on Big Data

    Apache Hudi (pronounced Hoodie) stands for Hadoop Upserts Deletes and Incrementals. Hudi manages the storage of large analytical datasets on DFS (cloud stores, HDFS, or any Hadoop FileSystem compatible storage). Apache Hudi is a transactional data lake platform that brings database and data warehouse capabilities to the data lake. Hudi reimagines slow old-school batch data processing with a powerful new incremental processing framework for low latency minute-level analytics. Hudi provides efficient upserts by mapping a given hoodie key (record key + partition path) consistently to a file id via an indexing mechanism. This mapping between record key and file group/file id never changes once the first version of a record has been written to a file. In short, the mapped file group contains all versions of a group of records.
    Downloads: 2 This Week
  • 6
    CloverDX

    Design, automate, operate and publish data pipelines at scale

    Please visit www.cloverdx.com for the latest product versions. Data integration platform; can be used to transform/map/manipulate data in batch and near-realtime modes. Supports various input/output formats (CSV, FIXLEN, Excel, XML, JSON, Parquet, Avro, EDI/X12, HL7, COBOL, LOTUS, etc.). Connects to RDBMS/JMS/Kafka/SOAP/REST/LDAP/S3/HTTP/FTP/ZIP/TAR. CloverDX offers 100+ specialized components, which can be further extended by creating "macros" (subgraphs) and libraries that are shareable with 3rd parties. Simple data manipulation jobs can be created visually. More complex business logic can be implemented using Clover's domain-specific language, CTL, in Java, or in languages like Python or JavaScript. Through its DataServices functionality, it lets you quickly turn data pipelines into REST API endpoints. The platform lets you easily scale your data job across multiple cores or nodes/machines. Supports Docker/Kubernetes deployments and offers AWS/Azure images in their respective marketplaces.
    Downloads: 18 This Week
  • 7
    Dagster

    An orchestration platform for the development, production

    Dagster is an orchestration platform for the development, production, and observation of data assets. Dagster as a productivity platform: With Dagster, you can focus on running tasks, or you can identify the key assets you need to create using a declarative approach. Embrace CI/CD best practices from the get-go: build reusable components, spot data quality issues, and flag bugs early. Dagster as a robust orchestration engine: Put your pipelines into production with a robust multi-tenant, multi-tool engine that scales technically and organizationally. Dagster as a unified control plane: The ‘single plane of glass’ data teams love to use. Rein in the chaos and maintain control over your data as the complexity scales. Centralize your metadata in one tool with built-in observability, diagnostics, cataloging, and lineage. Spot any issues and identify performance improvement opportunities.
    Downloads: 1 This Week
  • 8
    Jaspersoft ETL
    Jaspersoft ETL is a data integration platform providing high performance data extract-transform-load (ETL) capabilities. Jaspersoft ETL is appropriate for all analytic and operational data integration needs. Activity on this project is located at jas
    Downloads: 10 This Week
  • 9
    Daffodil Replicator is a powerful Open Source Java tool for data integration, data migration and data protection in real time. It allows bi-directional data replication and synchronization between homogeneous / heterogeneous databases including Oracle, M
    Downloads: 3 This Week
  • 10
    PaloKettlePlugin is for Pentaho Data Integration, aka Kettle. It's a Cell Input and Output Step for Palo MOLAP. The first code was developed by mybiq/3A-Strategy, the PDI-3 version has been developed by Stratebi. Now by 3A-Strategy and Litebi for PDI
    Downloads: 5 This Week
  • 11
    Templates for integrating the data structures of Compiere, Openbravo or ADempiere for all kinds of Pentaho Data Integration processes. Later on we plan to migrate these to Talend too.
    Downloads: 4 This Week
  • 12
    EasyDataQuality for Pentaho Kettle

    EasyDataQuality for Pentaho Data Integration in Kettle

    EasyDQ plugins for contact cleansing in Pentaho Data Integration in Kettle.
    Downloads: 2 This Week
  • 13

    COMA Community Edition

    Schema Matching Solution for Data Integration

    COMA CE is the community edition of the well-established COMA project developed at the University of Leipzig. It comprises the parsers, matcher library, matching framework and a sample GUI for tests and evaluations. COMA was initiated at the database chair of the University of Leipzig in 2002 and has received much positive feedback ever since. It excels due to numerous matching strategies, which can be combined into large matching workflows, and which enable reliable match results between different kinds of schemas.
    Downloads: 3 This Week
  • 14

    pdi-jira

    JIRA plugin for Pentaho Data Integration

    Using this PDI plugin you can connect to any JIRA service, even over an SSL connection, and perform JSON data extraction on the results. JQL is used to obtain data from the remote JIRA service.
    Downloads: 3 This Week
  • 15

    ARSystem plugins for Pentaho Kettle

    AR-System step and db plugins for Pentaho Data Integration Kettle V5

    Allows you to write via the API to an AR-System Server (BMC Remedy Action Request System). Includes two output steps, one input step and one database plugin. The step plugins need the database plugin.
    Downloads: 2 This Week
  • 16
    KETL(tm) is a production-ready ETL platform. The engine is built upon an open, multi-threaded, XML-based architecture. KETL is designed to assist in the development and deployment of data integration efforts which require ETL and scheduling
    Downloads: 2 This Week
  • 17
    bio2rdf
    The Bio2RDF project aims to transform silos of life science data into a globally distributed network of linked data for biological knowledge discovery. Bio2RDF creates and provides machine-understandable descriptions of biological entities using the RDF/RDFS/OWL Semantic Web languages. Using both syntactic and semantic data integration techniques, Bio2RDF seamlessly integrates diverse biological data and enables powerful new SPARQL-based services across its globally distributed knowledge bases.
    Downloads: 2 This Week
  • 18
    Metl ETL Data Integration

    Simple message-based, web-based ETL integration

    Metl is a simple, web-based ETL tool that allows for data integrations including database, files, messaging, and web services. Supports RDBMS, SOAP, HTTP, FTP, SFTP, XML, FIXLEN, CSV, JSON, ZIP, and more. Metl implements scheduled integration tasks without the need for custom coding or heavy infrastructure. It can be deployed in the cloud or in an internal data center, and it was built to allow developers to extend it with custom components.
    Downloads: 1 This Week
  • 19
    The Stem Cell Artificial Neural network project entails the analysis and integration of genomics data for extracting the stemness signature of several tissues by training a multiclass single-layer linear artificial neural network.
    Downloads: 0 This Week
  • 20
    Arch Data Integration Framework
    Downloads: 0 This Week
  • 21

    AWS data tools

    AWS tools for data integration and more

    Here I list some data tools I created for Amazon S3 and Redshift on AWS.
    Downloads: 0 This Week
  • 22
    Alova.js

    Workflow-Streamlined next-generation request tools

    An extremely streamlined API integration workflow. Quickly find APIs in the editor, and enjoy full type hints even in JS projects with the API code automatically generated by Alova's extension. Handle requests in various complex scenarios with one line of code. Automatically manage paging data and data preloading, reduce unnecessary data refreshes, improve fluency by 300%, and reduce coding difficulty by 50%. Send requests immediately by watching state changes, which is useful for tab switching and condition querying. A global interceptor supports silent token refresh and provides unified management of token-based login, logout, token assignment, and token refresh.
    Downloads: 0 This Week
  • 23
    Apache DevLake

    Apache DevLake is an open-source dev data platform

    Apache DevLake is an open-source dev data platform that ingests, analyzes, and visualizes the fragmented data from DevOps tools to extract insights for engineering excellence, developer experience, and community growth. Apache DevLake is designed for developer teams looking to make better sense of their development process and to bring a more data-driven approach to their own practices. You can ask Apache DevLake many questions regarding your development process. Just connect and query. Your Dev Data lives in many silos and tools. DevLake brings them all together to give you a complete view of your Software Development Life Cycle (SDLC). From DORA to scrum retros, DevLake implements metrics effortlessly with prebuilt dashboards supporting common frameworks and goals. DevLake fits teams of all shapes and sizes, and can be readily extended to support new data sources, metrics, and dashboards, with a flexible framework for data collection and transformation.
    Downloads: 0 This Week
  • 24

    BD integration

    Heterogeneous BD integration

    The increasing need for a generalized view of information resources held in various systems has led to the development of data integration mechanisms, which focus on organizing efficient access to external, heterogeneous data sources through a single interface. The project includes a mass integration platform that makes it possible to create a global infrastructure of tens or hundreds of heterogeneous databases based on a service-oriented approach.
    Downloads: 0 This Week
  • 25
    The BioDataServer is a database integration system. It implements a mediator-wrapper architecture and offers a SQL interface. The data integration is based on a user-defined integrated schema and adapters that wrap any kind of data source.
    Downloads: 0 This Week