Open Source Linux Data Integration Tools - Page 2

Data Integration Tools for Linux

  • 1
    Dagster

    An orchestration platform for the development, production

    Dagster is an orchestration platform for the development, production, and observation of data assets. Dagster as a productivity platform: you can focus on running tasks, or you can identify the key assets you need to create using a declarative approach. Embrace CI/CD best practices from the get-go: build reusable components, spot data quality issues, and flag bugs early. Dagster as a robust orchestration engine: put your pipelines into production with a robust multi-tenant, multi-tool engine that scales technically and organizationally. Dagster as a unified control plane: the ‘single pane of glass’ data teams love to use. Rein in the chaos and maintain control over your data as the complexity scales. Centralize your metadata in one tool with built-in observability, diagnostics, cataloging, and lineage. Spot any issues and identify performance improvement opportunities.
    Downloads: 0 This Week
  • 2
    DataSync Suite
    DataSync Suite is an open source platform for integrating tools like Zimbra, SugarCRM, and Drupal. The tool focuses on single sign-on, application data integration, and fast, flexible deployment.
    Downloads: 0 This Week
  • 3
    EasyDataQuality for Pentaho Kettle

    EasyDataQuality for Pentaho Data Integration in Kettle

    EasyDQ plugins for Contact cleansing in Pentaho Data Integration in Kettle.
    Downloads: 0 This Week
  • 4
    Fluxion
    The Fluxion framework is a prototype data integration system using Semantic Web technologies.
    Downloads: 0 This Week
  • 5
    Open source Application and Data Integration Platform that allows developers and end-users to integrate and transform information using a web-based drag-and-drop interface that doesn't require coding or programming skills.
    Downloads: 0 This Week
  • 6
    Grinn

    graph database and R package for omic data integration

    http://kwanjeeraw.github.io/grinn/
    Downloads: 0 This Week
  • 7
    The Hanalyzer is a tool designed to help biologists explain results observed in genome-scale experiments and to generate new hypotheses. It combines information extraction, semantic data integration, reasoning, and visualization.
    Downloads: 0 This Week
  • 8
    Harmony Data Integration

    Fast, sensitive and accurate integration of single-cell data

    Harmony is a general-purpose R package with an efficient algorithm for integrating multiple data sets. It is especially useful for large single-cell datasets such as single-cell RNA-seq. Harmony has been tested on R versions >= 4. Please consult the DESCRIPTION file for more details on required R packages. Harmony has been tested on Linux, OS X, and Windows platforms.
    Downloads: 0 This Week
  • 9
    Hetionet

    Hetionet: an integrative network of disease

    Hetionet is a hetnet — network with multiple node and edge (relationship) types — which encodes biology. The hetnet was designed for Project Rephetio, which aims to systematically identify why drugs work and predict new therapies for drugs. The JSON and Neo4j formats contain node and edge properties, which are absent in the TSV and matrix formats, including licensing information. Therefore the recommended formats are JSON and Neo4j. Our hetio package in Python reads the JSON format, but it is otherwise a simple yet new format. The Neo4j graph database has an established and thriving ecosystem. However, if you would like to access Hetionet without Neo4j, then we suggest the JSON format. The matrix format refers to HetMat archives, which store edge adjacency matrices on disk. Additional usage information is available at the corresponding download locations.
    Downloads: 0 This Week
  • 10
    INDUS is a project for knowledge acquisition and data integration from heterogeneous distributed data, particularly from bioinformatics databases.
    Downloads: 0 This Week
  • 11
    This disease-centric project contributes data integration and analysis tools from the Institute for Systems Biology (ISB). We offer this project to the research community to further our efforts in disease prediction and prevention.
    Downloads: 0 This Week
  • 12
    An extension package to Pentaho Data Integration, providing plug-ins. Steps/job entries can be downloaded independently and each comes with source code in the .zip file. All are licensed as LGPL or GPL.
    Downloads: 0 This Week
  • 13
    The JasperSoft Business Intelligence Suite provides integrated reporting, analysis, and data integration to make faster, better decisions. Features: integrated or stand-alone; analytic and operational data integration; embeddable with ERP or CRM.
    Downloads: 0 This Week
  • 14
    Jitsu

    Jitsu is an open-source Segment alternative

    Jitsu is a fully scriptable data ingestion engine for modern data teams. Set up a real-time data pipeline in minutes, not days. Installing Jitsu is a matter of selecting your framework and adding a few lines of code to your app. Jitsu is built to be framework agnostic, so regardless of your stack, we have a solution that'll work for your team. Connect a data warehouse (Snowflake, ClickHouse, BigQuery, S3, Redshift, or Postgres) and query your data instantly. Jitsu can either stream data in real time or send it in micro-batches (up to once a minute). Apply any transformation with Jitsu: just write JavaScript code right in the UI to do anything with incoming data. The code editor supports code completion, debugging, and much more. It feels like a full-featured IDE!
    Downloads: 0 This Week
  • 15
    KETL(tm) is a production-ready ETL platform. The engine is built upon an open, multi-threaded, XML-based architecture. KETL is designed to assist in the development and deployment of data integration efforts which require ETL and scheduling.
    Downloads: 0 This Week
  • 16

    LD-FusionTool

    Data Fusion and Conflict Resolution tool for Linked Data

    LD-FusionTool covers the Data Fusion step in the integration process for RDF, where data are merged to produce consistent and clean representations of objects, and conflicts which emerged during data integration need to be resolved.
    Downloads: 0 This Week
  • 17
    Developing a "bridge" to facilitate the transfer of data between various databases (with dissimilar schemas). JDBC and XML would be used.
    Downloads: 0 This Week
  • 18
    Mara Pipelines

    A lightweight opinionated ETL framework, halfway between plain scripts

    This package contains a lightweight data transformation framework with a focus on transparency and complexity reduction. Data integration pipelines as code: pipelines, tasks and commands are created using declarative Python code. PostgreSQL as a data processing engine. Extensive web UI: the web browser as the main tool for inspecting, running and debugging pipelines. GNU make semantics: nodes depend on the completion of upstream nodes. No data dependencies or data flows. No in-app data processing: command line tools as the main tool for interacting with databases and data. Single-machine pipeline execution based on Python's multiprocessing. No need for distributed task queues. Easy debugging and output logging. Cost-based priority queues: nodes with higher cost (based on recorded run times) are run first.
    Downloads: 0 This Week
  • 19
    N-Browse is a client-server package for interactive visualization of network data with heterogeneous types of links, intended for ease of use and designed using a generic database schema for data integration and visualization.
    Downloads: 0 This Week
  • 20
    ODI \ OWB ETL \ ELT Datawarehousing Data Integration
    Downloads: 0 This Week
  • 21
    Framework for text mining, data integration and data analysis. Keywords: ontology and graph alignment, relation mining, warehouse, semantic database integration, bioinformatics, systems biology, microarray, Java.
    Downloads: 0 This Week
  • 22
    OPENSUITE is an integration platform that enables process data integration between independently developed business applications. The OPENSUITE integration platform takes advantage of SOA integration best practices to supply the middleware layer functionality.
    Downloads: 0 This Week
  • 23

    Ondex Web

    Web-based visualisation of networks using Java

    Ondex Web is a new web-based implementation of the network visualization and exploration tools from the Ondex data integration platform. New features such as context-sensitive menus and annotation tools provide users with intuitive ways to explore and manipulate the appearance of heterogeneous biological networks. Ondex Web is open source, written in Java, and can be easily embedded into web sites as an applet. Ondex Web supports loading data from a variety of network formats, such as XGMML, NWB, Pajek, and OXL.
    Downloads: 0 This Week
  • 24
    Open Information Integration
    Open Information Integration Tool Suite (OpenII) is used by analysts and programmers to accelerate data integration and harmonization across organizations. OpenII has a neutral schema repository for browsing and comparing all sorts of data models. OpenII is built as a Rich Client Platform Application on top of Eclipse 3.x. Developers need to download Eclipse, install the RCP support, the Fatjar plugin and the Delta Pack in one of the 3.x flavors. Release notes (release date: Jan 2014; build version: 1.0.2666): (1) added support for AVRO and HCatalog imports; (2) better support for OWL; (3) new OWL and Containing Relationship viewers; (4) added a case-insensitive option in the exact matcher for Harmony.
    Downloads: 0 This Week
  • 25
    PANDORA

    Revolutionizing Biomedical Research with Advanced Machine Learning

    PANDORA is a machine learning (ML) tool that can be used to integrate various data types, including clinical, transcriptome, and microbiome data, and to find connections in large datasets. PANDORA can be easily installed using Docker; a pre-built version of the software can be pulled from DockerHub. To run a test instance of PANDORA, users first need to prepare their local environment by downloading, installing, and configuring Docker. genular is the community behind SIMON, an open-source machine learning and knowledge discovery software, built by a vibrant community of people just like you! Join us and make SIMON even cooler! Exploratory analysis of machine learning results with the help of many different visualization techniques will give you instant insights into models and data.
    Downloads: 0 This Week