Best Data Engineering Tools for Azure Databricks

Compare the Top Data Engineering Tools that integrate with Azure Databricks as of June 2025

This a list of Data Engineering tools that integrate with Azure Databricks. Use the filters on the left to add additional filters for products that have integrations with Azure Databricks. View the products that work with Azure Databricks in the table below.

What are Data Engineering Tools for Azure Databricks?

Data engineering tools are designed to facilitate the process of preparing and managing large datasets for analysis. These tools support tasks like data extraction, transformation, and loading (ETL), allowing engineers to build efficient data pipelines that move and process data from various sources into storage systems. They help ensure data integrity and quality by providing features for validation, cleansing, and monitoring. Data engineering tools also often include capabilities for automation, scalability, and integration with big data platforms. By streamlining complex workflows, they enable organizations to handle large-scale data operations more efficiently and support advanced analytics and machine learning initiatives. Compare and read user reviews of the best Data Engineering tools for Azure Databricks currently available using the table below. This list is updated regularly.

  • 1
    AnalyticsCreator

    AnalyticsCreator

    AnalyticsCreator

    Streamline your data engineering workflows with AnalyticsCreator by automating the design and deployment of robust data pipelines for databases, warehouses, lakes, and cloud services. The faster pipeline deployment ensures seamless connectivity across your ecosystem, improving innovation with modern engineering practices. Integrate a wide range of data sources and targets effortlessly, ensuring seamless ecosystem connectivity. Improve development cycles with automated documentation, lineage tracking, and schema evolution. Support modern engineering practices such as CI/CD and agile methodologies to accelerate collaboration and innovation across teams.
    View Tool
    Visit Website
  • 2
    Sifflet

    Sifflet

    Sifflet

    Automatically cover thousands of tables with ML-based anomaly detection and 50+ custom metrics. Comprehensive data and metadata monitoring. Exhaustive mapping of all dependencies between assets, from ingestion to BI. Enhanced productivity and collaboration between data engineers and data consumers. Sifflet seamlessly integrates into your data sources and preferred tools and can run on AWS, Google Cloud Platform, and Microsoft Azure. Keep an eye on the health of your data and alert the team when quality criteria aren’t met. Set up in a few clicks the fundamental coverage of all your tables. Configure the frequency of runs, their criticality, and even customized notifications at the same time. Leverage ML-based rules to detect any anomaly in your data. No need for an initial configuration. A unique model for each rule learns from historical data and from user feedback. Complement the automated rules with a library of 50+ templates that can be applied to any asset.
  • 3
    Microsoft Fabric
    Reshape how everyone accesses, manages, and acts on data and insights by connecting every data source and analytics service together—on a single, AI-powered platform. All your data. All your teams. All in one place. Establish an open and lake-centric hub that helps data engineers connect and curate data from different sources—eliminating sprawl and creating custom views for everyone. Accelerate analysis by developing AI models on a single foundation without data movement—reducing the time data scientists need to deliver value. Innovate faster by helping every person in your organization act on insights from within Microsoft 365 apps, such as Microsoft Excel and Microsoft Teams. Responsibly connect people and data using an open and scalable solution that gives data stewards additional control with built-in security, governance, and compliance.
    Starting Price: $156.334/month/2CU
  • 4
    Prophecy

    Prophecy

    Prophecy

    Prophecy enables many more users - including visual ETL developers and Data Analysts. All you need to do is point-and-click and write a few SQL expressions to create your pipelines. As you use the Low-Code designer to build your workflows - you are developing high quality, readable code for Spark and Airflow that is committed to your Git. Prophecy gives you a gem builder - for you to quickly develop and rollout your own Frameworks. Examples are Data Quality, Encryption, new Sources and Targets that extend the built-in ones. Prophecy provides best practices and infrastructure as managed services – making your life and operations simple! With Prophecy, your workflows are high performance and use scale-out performance & scalability of the cloud.
    Starting Price: $299 per month
  • 5
    DQOps

    DQOps

    DQOps

    DQOps is an open-source data quality platform designed for data quality and data engineering teams that makes data quality visible to business sponsors. The platform provides an efficient user interface to quickly add data sources, configure data quality checks, and manage issues. DQOps comes with over 150 built-in data quality checks, but you can also design custom checks to detect any business-relevant data quality issues. The platform supports incremental data quality monitoring to support analyzing data quality of very big tables. Track data quality KPI scores using our built-in or custom dashboards to show progress in improving data quality to business sponsors. DQOps is DevOps-friendly, allowing you to define data quality definitions in YAML files stored in Git, run data quality checks directly from your data pipelines, or automate any action with a Python Client. DQOps works locally or as a SaaS platform.
    Starting Price: $499 per month
  • 6
    Chalk

    Chalk

    Chalk

    Powerful data engineering workflows, without the infrastructure headaches. Complex streaming, scheduling, and data backfill pipelines, are all defined in simple, composable Python. Make ETL a thing of the past, fetch all of your data in real-time, no matter how complex. Incorporate deep learning and LLMs into decisions alongside structured business data. Make better predictions with fresher data, don’t pay vendors to pre-fetch data you don’t use, and query data just in time for online predictions. Experiment in Jupyter, then deploy to production. Prevent train-serve skew and create new data workflows in milliseconds. Instantly monitor all of your data workflows in real-time; track usage, and data quality effortlessly. Know everything you computed and data replay anything. Integrate with the tools you already use and deploy to your own infrastructure. Decide and enforce withdrawal limits with custom hold times.
    Starting Price: Free
  • 7
    Databricks Data Intelligence Platform
    The Databricks Data Intelligence Platform allows your entire organization to use data and AI. It’s built on a lakehouse to provide an open, unified foundation for all data and governance, and is powered by a Data Intelligence Engine that understands the uniqueness of your data. The winners in every industry will be data and AI companies. From ETL to data warehousing to generative AI, Databricks helps you simplify and accelerate your data and AI goals. Databricks combines generative AI with the unification benefits of a lakehouse to power a Data Intelligence Engine that understands the unique semantics of your data. This allows the Databricks Platform to automatically optimize performance and manage infrastructure in ways unique to your business. The Data Intelligence Engine understands your organization’s language, so search and discovery of new data is as easy as asking a question like you would to a coworker.
  • Previous
  • You're on page 1
  • Next