Compare the Top Data Engineering Tools that integrate with DataHub as of October 2025

This is a list of Data Engineering tools that integrate with DataHub. Use the filters on the left to narrow the results to products with the integrations you need, and view the products that work with DataHub in the table below.

What are Data Engineering Tools for DataHub?

Data engineering tools are designed to facilitate the process of preparing and managing large datasets for analysis. These tools support tasks like data extraction, transformation, and loading (ETL), allowing engineers to build efficient data pipelines that move and process data from various sources into storage systems. They help ensure data integrity and quality by providing features for validation, cleansing, and monitoring. Data engineering tools also often include capabilities for automation, scalability, and integration with big data platforms. By streamlining complex workflows, they enable organizations to handle large-scale data operations more efficiently and support advanced analytics and machine learning initiatives. Compare and read user reviews of the best Data Engineering tools for DataHub currently available using the table below. This list is updated regularly.
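To make the extract-transform-load pattern concrete, here is a minimal, illustrative sketch in Python using only the standard library; the file name, columns, and table are hypothetical, and a real pipeline would add validation, monitoring, and incremental loads.

    import csv
    import sqlite3

    # Extract: read raw records from a hypothetical CSV export.
    with open("raw_orders.csv", newline="") as f:
        rows = list(csv.DictReader(f))

    # Transform: normalize fields and drop incomplete records.
    cleaned = [
        (r["order_id"], r["customer"].strip().lower(), float(r["amount"]))
        for r in rows
        if r.get("order_id") and r.get("amount")
    ]

    # Load: append the cleaned records to a local SQLite table.
    con = sqlite3.connect("warehouse.db")
    con.execute(
        "CREATE TABLE IF NOT EXISTS orders (order_id TEXT, customer TEXT, amount REAL)"
    )
    con.executemany("INSERT INTO orders VALUES (?, ?, ?)", cleaned)
    con.commit()
    con.close()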

  • 1
    Teradata VantageCloud
    Teradata VantageCloud is a cloud-native platform built for modern data engineering at scale. It enables teams to ingest, transform, and orchestrate structured and semi-structured data across multi-cloud and hybrid environments. With support for SQL, Python, and R, VantageCloud integrates with popular data pipelines and tools, allowing for efficient ETL/ELT workflows, real-time processing, and advanced analytics. Its open architecture ensures interoperability with industry standards, while built-in governance and workload management help maintain performance and compliance. Ideal for data engineers building resilient, scalable data infrastructure.
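    As an illustration, a minimal sketch of querying VantageCloud from Python with the teradatasql driver; the host, credentials, and table are placeholders, not details of the product itself.

        import teradatasql  # Teradata SQL Driver for Python

        # Connection values below are placeholders.
        con = teradatasql.connect(
            host="example.env.clearscape.teradata.com",
            user="demo_user",
            password="********",
        )
        cur = con.cursor()
        cur.execute("SELECT event_type, COUNT(*) FROM demo.events GROUP BY event_type")
        for row in cur.fetchall():
            print(row)
        con.close()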
  • 2
    Google Cloud BigQuery
    BigQuery is an essential tool for data engineers, allowing them to streamline the process of data ingestion, transformation, and analysis. With its scalable infrastructure and robust suite of data engineering features, users can efficiently build data pipelines and automate workflows. BigQuery integrates easily with other Google Cloud tools, making it a versatile solution for data engineering tasks. New customers can take advantage of $300 in free credits to explore BigQuery’s features, enabling them to build and refine their data workflows for maximum efficiency and effectiveness. This allows engineers to focus more on innovation and less on managing the underlying infrastructure.
    Starting Price: Free ($300 in free credits)
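    As an illustration, here is a minimal sketch of running a query with the google-cloud-bigquery Python client; the query targets a public sample dataset, and authentication is assumed to be configured already via Application Default Credentials.

        from google.cloud import bigquery

        # Assumes Application Default Credentials are configured
        # (e.g. via `gcloud auth application-default login`).
        client = bigquery.Client()

        query = """
            SELECT name, SUM(number) AS total
            FROM `bigquery-public-data.usa_names.usa_1910_2013`
            GROUP BY name
            ORDER BY total DESC
            LIMIT 5
        """

        # client.query() starts the job; .result() waits for completion.
        for row in client.query(query).result():
            print(row["name"], row["total"])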
  • 3
    dbt (dbt Labs)
    dbt helps data teams transform raw data into trusted, analysis-ready datasets faster. With dbt, data analysts and data engineers can collaborate on version-controlled SQL models, enforce testing and documentation standards, lean on detailed metadata to troubleshoot and optimize pipelines, and deploy transformations reliably at scale. Built on modern software engineering best practices, dbt brings transparency and governance to every step of the data transformation workflow. Thousands of companies, from startups to Fortune 500 enterprises, rely on dbt to improve data quality and trust as well as drive efficiencies and reduce costs as they deliver AI-ready data across their organization. Whether you’re scaling data operations or just getting started, dbt empowers your team to move from raw data to actionable analytics with confidence.
    Starting Price: $100 per user/month
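    dbt transformations themselves are written as SQL models; as a minimal illustration, dbt Core 1.5+ also exposes a programmatic Python entry point, dbtRunner, for invoking them. The model name below is a placeholder, and a configured dbt project is assumed.

        from dbt.cli.main import dbtRunner, dbtRunnerResult

        # Assumes this script runs inside an initialized dbt project.
        dbt = dbtRunner()

        # Equivalent to `dbt run --select stg_orders` on the command line;
        # "stg_orders" is a placeholder model name.
        res: dbtRunnerResult = dbt.invoke(["run", "--select", "stg_orders"])

        # Each result reports the status of one executed model.
        for r in res.result:
            print(f"{r.node.name}: {r.status}")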
  • 4
    Looker (Google)
    Looker, Google Cloud’s business intelligence platform, enables you to chat with your data. Organizations turn to Looker for self-service and governed BI, to build custom applications with trusted metrics, or to bring Looker modeling to their existing environment. The result is improved data engineering efficiency and true business transformation. Looker is reinventing business intelligence for the modern company. Looker works the way the web does: it is browser-based, and its unique modeling language lets any employee leverage the work of your best data analysts. Operating 100% in-database, Looker capitalizes on the newest, fastest analytic databases to deliver real results in real time.
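    As one illustration, a minimal sketch of pulling governed results out of Looker with the official looker_sdk Python package; the Look ID is a placeholder, and API credentials are assumed to be supplied via a looker.ini file or LOOKERSDK_* environment variables.

        import looker_sdk

        # Assumes looker.ini or LOOKERSDK_* environment variables
        # provide the API host and client credentials.
        sdk = looker_sdk.init40()

        # "42" is a placeholder Look ID; result_format may also be
        # "csv", "txt", "png", and others.
        data = sdk.run_look(look_id="42", result_format="json")
        print(data)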
  • 5
    Databricks Data Intelligence Platform
    The Databricks Data Intelligence Platform allows your entire organization to use data and AI. It’s built on a lakehouse to provide an open, unified foundation for all data and governance, and is powered by a Data Intelligence Engine that understands the unique semantics of your data. The winners in every industry will be data and AI companies. From ETL to data warehousing to generative AI, Databricks helps you simplify and accelerate your data and AI goals. By combining generative AI with the unification benefits of a lakehouse, the platform can automatically optimize performance and manage infrastructure in ways unique to your business. The Data Intelligence Engine understands your organization’s language, so searching for and discovering new data is as easy as asking a coworker a question.
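    For illustration, a minimal sketch of querying the platform from Python with the databricks-sql-connector package; the hostname, HTTP path, access token, and sample table are all placeholders.

        from databricks import sql

        # All connection values below are placeholders for a real workspace.
        with sql.connect(
            server_hostname="dbc-example.cloud.databricks.com",
            http_path="/sql/1.0/warehouses/abc123",
            access_token="dapi-...",
        ) as conn:
            with conn.cursor() as cur:
                cur.execute("SELECT * FROM samples.nyctaxi.trips LIMIT 5")
                for row in cur.fetchall():
                    print(row)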
  • 6
    Delta Lake
    Delta Lake is an open-source storage layer that brings ACID transactions to Apache Spark™ and big data workloads. Data lakes typically have multiple data pipelines reading and writing data concurrently, and without transactions, data engineers have to go through a tedious process to ensure data integrity. Delta Lake brings ACID transactions to your data lakes and provides serializability, the strongest isolation level. Learn more at Diving into Delta Lake: Unpacking the Transaction Log. In big data, even the metadata itself can be "big data". Delta Lake treats metadata just like data, leveraging Spark's distributed processing power to handle all its metadata; as a result, it can handle petabyte-scale tables with billions of partitions and files with ease. Delta Lake also provides snapshots of data, enabling developers to access and revert to earlier versions for audits, rollbacks, or to reproduce experiments.
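    A minimal sketch of the ACID writes and time-travel reads described above, using PySpark with the delta-spark package; the local path and toy data are placeholders.

        from pyspark.sql import SparkSession

        # Assumes the delta-spark package is available, e.g.
        # spark-submit --packages io.delta:delta-spark_2.12:3.1.0 ...
        spark = (
            SparkSession.builder.appName("delta-demo")
            .config("spark.sql.extensions",
                    "io.delta.sql.DeltaSparkSessionExtension")
            .config("spark.sql.catalog.spark_catalog",
                    "org.apache.spark.sql.delta.catalog.DeltaSparkSessionCatalog")
            .getOrCreate()
        )

        # Each overwrite is an ACID transaction recorded in the Delta log.
        spark.range(0, 5).write.format("delta").mode("overwrite").save("/tmp/delta/demo")
        spark.range(0, 10).write.format("delta").mode("overwrite").save("/tmp/delta/demo")

        # Time travel: read the first snapshot back for an audit or rollback.
        v0 = spark.read.format("delta").option("versionAsOf", 0).load("/tmp/delta/demo")
        print(v0.count())  # 5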
  • 7
    Feast (Tecton)
    Make your offline data available for real-time predictions without having to build custom pipelines. Ensure data consistency between offline training and online inference, eliminating train-serve skew. Standardize data engineering workflows under one consistent framework. Teams use Feast as the foundation of their internal ML platforms. Feast doesn’t require the deployment and management of dedicated infrastructure; instead, it reuses existing infrastructure and spins up new resources when needed. Feast is a good fit if you are not looking for a managed solution and are willing to manage and maintain your own implementation; you have engineers able to support it; you want to run pipelines that transform raw data into features in a separate system and integrate Feast with it; or you have unique requirements and want to build on top of an open source solution.
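    A minimal sketch of serving features for online inference with the Feast Python SDK; the repository path, feature names, and entity key come from a hypothetical driver-ranking example, not from Feast itself.

        from feast import FeatureStore

        # Assumes a feature repository exists here and that
        # `feast apply` and materialization have already run.
        store = FeatureStore(repo_path=".")

        # Feature references and the entity row are placeholders.
        features = store.get_online_features(
            features=[
                "driver_hourly_stats:conv_rate",
                "driver_hourly_stats:avg_daily_trips",
            ],
            entity_rows=[{"driver_id": 1001}],
        ).to_dict()

        print(features)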