Best Data Management Software for Azure Databricks

Compare the Top Data Management Software that integrates with Azure Databricks as of June 2025

This is a list of Data Management software that integrates with Azure Databricks. Use the filters on the left to narrow the results further. View the products that work with Azure Databricks in the table below.

What is Data Management Software for Azure Databricks?

Data management software helps organizations organize, store, and analyze information. It provides a secure platform for data sharing and analysis, with features such as reporting, automation, visualizations, and collaboration. Data management software can be customized to fit the needs of any organization, offering numerous user options for accessing and modifying data. These systems help organizations keep track of their data more efficiently while reducing the risk of data loss or breaches. Compare and read user reviews of the best Data Management software for Azure Databricks currently available using the table below. This list is updated regularly.

  • 1
    Quaeris

    Quaeris

    Quaeris, Inc.

    Align analytics to your everyday business workflows. Your business relies on people, data, and documents, but the process of using them is broken. QuaerisAI enables seamless downstream workflows across your people, documents, and data assets. Use natural language search on data and documents, and collaborate privately or within communities, all in one platform. QuaerisAI saves each user at least 30 minutes to an hour per day, delivering productivity gains without the expense of buying and consolidating a collection of separate AI tools. Quaeris can be rolled out to teams of tens or thousands of users within a matter of days, with minimal IT involvement, which is why IT and data teams love us!
    Starting Price: $100 per month
  • 2
    AnalyticsCreator

    AnalyticsCreator

    AnalyticsCreator

    Accelerate data warehouse (DWH) development by automating the design and generation of complex data models, including dimensional, data mart, and data vault architectures. This automation ensures faster time-to-value through streamlined workflows, resulting in improved data accuracy and consistency. AnalyticsCreator lets you seamlessly integrate your data with platforms like MS Fabric, Power BI, Snowflake, Tableau, Azure Synapse, and more. With built-in transformations and historization capabilities, you can manage historical data with support for Slowly Changing Dimension (SCD) types, enhancing governance and operational efficiency. Streamline teamwork with robust version control features and automated documentation, ensuring enhanced collaboration and reduced development cycles. Enable faster prototyping, schema evolution, and metadata management for a more agile approach to data management.
  • 3
    QuerySurge
    QuerySurge leverages AI to automate data validation and ETL testing of big data, data warehouses, business intelligence reports, and enterprise apps/ERPs, with full DevOps functionality for continuous testing.
    Use cases:
    - Data warehouse & ETL testing
    - Hadoop & NoSQL testing
    - DevOps for data / continuous testing
    - Data migration testing
    - BI report testing
    - Enterprise app/ERP testing
    QuerySurge features:
    - Projects: multi-project support
    - AI: automatically create data validation tests based on data mappings
    - Smart Query Wizards: create tests visually, without writing SQL
    - Data quality at speed: automate the launch, execution, and comparison of tests, and see results quickly
    - Test across 200+ platforms: data warehouses, Hadoop & NoSQL lakes, databases, flat files, XML, JSON, BI reports
    - DevOps for data & continuous testing: RESTful API with 60+ calls and integration with all mainstream solutions
    - Data analytics & data intelligence: analytics dashboard & reports
  • 4
    Kyvos

    Kyvos

    Kyvos Insights

    Kyvos is a semantic data lakehouse that accelerates every BI and AI initiative. The platform delivers lightning-fast analytics at virtually infinite scale, maximum savings, and the lowest carbon footprint. It offers high-performance storage for structured or unstructured data and trusted data for AI applications. The infrastructure-agnostic platform fits into any modern data or AI stack, whether on-premises or in the cloud. Leading enterprises use Kyvos as a universal source for fast, price-performant analytics, enabling rich dialogs with data and context-aware AI apps.
  • 5
    Sifflet

    Sifflet

    Sifflet

    Automatically cover thousands of tables with ML-based anomaly detection and 50+ custom metrics. Comprehensive data and metadata monitoring. Exhaustive mapping of all dependencies between assets, from ingestion to BI. Enhanced productivity and collaboration between data engineers and data consumers. Sifflet integrates seamlessly with your data sources and preferred tools, and can run on AWS, Google Cloud Platform, and Microsoft Azure. Keep an eye on the health of your data and alert the team when quality criteria aren't met. Set up fundamental coverage of all your tables in a few clicks, configuring the frequency of runs, their criticality, and customized notifications at the same time. Leverage ML-based rules to detect anomalies in your data with no initial configuration; a unique model for each rule learns from historical data and from user feedback. Complement the automated rules with a library of 50+ templates that can be applied to any asset.
  • 6
    StarfishETL

    StarfishETL

    StarfishETL

    StarfishETL is an Integration Platform as a Service (iPaaS), and although “integration” is in the name, it’s capable of much more. An iPaaS lives in the cloud and can integrate different systems by using their APIs. This makes it adaptable beyond integration for migration, data governance, and data cleansing. Unlike traditional integration apps, StarfishETL provides low-code mapping and powerful scripting tools to manage, personalize, and manipulate data at scale.
    Features:
    - Drag-and-drop mapping
    - AI-powered connections
    - Purpose-built integrations
    - Extensibility through scripting
    - Secure on-premises connections
    - Scalable data capacity
    Starting Price: $400/month
  • 7
    Hackolade

    Hackolade

    Hackolade

    Hackolade Studio is a powerful data modeling platform that supports a wide range of technologies including relational SQL and NoSQL databases, cloud data warehouses, APIs, streaming platforms, and data exchange formats. Designed for modern data architecture, it enables users to visually design, document, and evolve schemas across systems like Oracle, PostgreSQL, Databricks, Snowflake, MongoDB, Cassandra, DynamoDB, Neo4j, Kafka (with Confluent Schema Registry), OpenAPI, GraphQL, and more. Hackolade Studio offers forward and reverse engineering, schema versioning, model validation, and integration with metadata catalogs such as Unity Catalog and Collibra. It empowers data architects, engineers, and governance teams to collaborate on consistent, governed, and scalable data models. Whether building data products, managing API contracts, or ensuring regulatory compliance, Hackolade Studio streamlines the process in one unified interface.
    Starting Price: €175 per month
  • 8
    Dagster

    Dagster

    Dagster Labs

    Dagster is a next-generation orchestration platform for the development, production, and observation of data assets. Unlike other data orchestration solutions, Dagster provides an end-to-end development lifecycle. Dagster gives you control over your disparate data tools and empowers you to build, test, deploy, run, and iterate on your data pipelines. It makes you and your data teams more productive, makes your operations more robust, and puts you in complete control of your data processes as you scale. Dagster brings a declarative approach to the engineering of data pipelines: your team defines the data assets required, quickly assessing their status and resolving any discrepancies. An asset-based model is clearer than a task-based one and becomes a unifying abstraction across the whole workflow (see the sketch below this entry).
    Starting Price: $0
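    To make the asset-based model concrete, here is a minimal sketch using Dagster's public Python API; the asset names and toy data are illustrative, not taken from Dagster's documentation.

    ```python
    from dagster import Definitions, asset, materialize

    @asset
    def raw_orders():
        # Stand-in for reading from a database or API.
        return [{"id": 1, "amount": 42.0}, {"id": 2, "amount": 17.5}]

    @asset
    def order_total(raw_orders):
        # Dagster infers the dependency on raw_orders from the parameter name.
        return sum(row["amount"] for row in raw_orders)

    defs = Definitions(assets=[raw_orders, order_total])

    if __name__ == "__main__":
        # Materialize the whole asset graph in dependency order.
        result = materialize([raw_orders, order_total])
        print(result.success)
    ```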
  • 9
    Zing Data

    Zing Data

    Zing Data

    A flexible visual query builder lets you get answers in seconds. Analyze data from your phone or browser to work from anywhere. Natural language querying, powered by LLMs, lets you ask questions in plain English; no desktop, SQL, or data scientist needed. Shared questions let you learn from teammates and search any question asked across your organization. @mentions, push notifications, and shared chat bring the right people into the conversation and empower you to make data actionable. Easily copy and modify shared questions, export data, and change how charts are displayed, so you don't just view somebody else's analysis but make it your own. You can even turn on external sharing to provide access to partners outside your domain or to public datasets. Get the underlying data tables in two taps, or run full custom SQL with smart typeaheads that make quick work of joins, aggregations, and calculated fields.
    Starting Price: $0
  • 10
    Dasera

    Dasera

    Dasera

    Dasera is a Data Security Posture Management (DSPM) platform providing automated security and governance controls for structured and unstructured data across cloud and on-prem environments. Uniquely, Dasera monitors data in use while offering continuous visibility and automated remediation, preventing data breaches across the entire data lifecycle. Dasera provides continuous visibility, risk detection, and mitigation to align with business goals while ensuring seamless integration, unmatched security, and regulatory compliance. Through its deep understanding of the four data variables - data infrastructure, data attributes, data users, and data usage - Dasera promotes a secure data-driven growth strategy that minimizes risk and maximizes value, giving businesses a competitive edge in today's rapidly evolving digital landscape.
    Starting Price: $20,000 for 3 data stores
  • 11
    Microsoft Fabric
    Reshape how everyone accesses, manages, and acts on data and insights by connecting every data source and analytics service together—on a single, AI-powered platform. All your data. All your teams. All in one place. Establish an open and lake-centric hub that helps data engineers connect and curate data from different sources—eliminating sprawl and creating custom views for everyone. Accelerate analysis by developing AI models on a single foundation without data movement—reducing the time data scientists need to deliver value. Innovate faster by helping every person in your organization act on insights from within Microsoft 365 apps, such as Microsoft Excel and Microsoft Teams. Responsibly connect people and data using an open and scalable solution that gives data stewards additional control with built-in security, governance, and compliance.
    Starting Price: $156.334/month for 2 capacity units (CU)
  • 12
    Prophecy

    Prophecy

    Prophecy

    Prophecy enables many more users to build data pipelines, including visual ETL developers and data analysts. All you need to do is point and click and write a few SQL expressions to create your pipelines. As you use the low-code designer to build your workflows, you are developing high-quality, readable code for Spark and Airflow that is committed to your Git repository. Prophecy also gives you a gem builder, letting you quickly develop and roll out your own frameworks; examples are data quality, encryption, and new sources and targets that extend the built-in ones. Prophecy provides best practices and infrastructure as managed services, making your life and operations simple. With Prophecy, your workflows are high performance and use the scale-out performance and scalability of the cloud (see the sketch below this entry for the kind of code a visual pipeline compiles to).
    Starting Price: $299 per month
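    As an illustration of "readable code for Spark," here is a minimal, hypothetical PySpark pipeline of the sort a low-code designer might generate and commit to Git; the table and column names are invented for this sketch, not Prophecy's actual output.

    ```python
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("orders_pipeline").getOrCreate()

    orders = spark.read.table("raw.orders")            # source step
    cleaned = orders.filter(F.col("amount") > 0)       # filter step
    totals = (
        cleaned.groupBy("customer_id")                 # aggregate step
        .agg(F.sum("amount").alias("lifetime_value"))
    )
    # target step: overwrite the curated table
    totals.write.mode("overwrite").saveAsTable("curated.customer_totals")
    ```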
  • 13
    DQOps

    DQOps

    DQOps

    DQOps is an open-source data quality platform designed for data quality and data engineering teams that makes data quality visible to business sponsors. The platform provides an efficient user interface to quickly add data sources, configure data quality checks, and manage issues. DQOps comes with over 150 built-in data quality checks, but you can also design custom checks to detect any business-relevant data quality issue. The platform supports incremental data quality monitoring, making it practical to analyze the data quality of very large tables. Track data quality KPI scores using built-in or custom dashboards to show business sponsors progress in improving data quality. DQOps is DevOps-friendly, allowing you to define data quality definitions in YAML files stored in Git, run data quality checks directly from your data pipelines, or automate any action with a Python client. DQOps works locally or as a SaaS platform.
    Starting Price: $499 per month
  • 14
    Openbridge

    Openbridge

    Openbridge

    Uncover insights to supercharge sales growth using code-free, fully automated data pipelines to data lakes or cloud warehouses. A flexible, standards-based platform to unify sales and marketing data for automating insights and smarter growth. Say goodbye to messy, expensive manual data downloads. Always know what you'll pay, and only pay for what you use. Fuel your tools with quick access to analytics-ready data. As certified developers, we only work with secure, official APIs. Get started quickly with pre-built, pre-transformed, ready-to-go data pipelines from popular sources. Unlock data from Amazon Vendor Central, Amazon Seller Central, Instagram Stories, Facebook, Amazon Advertising, Google Ads, and many others. Code-free data ingestion and transformation processes allow teams to realize value from their data quickly and cost-effectively. Data is always securely stored directly in a trusted, customer-owned data destination like Databricks, Amazon Redshift, etc.
    Starting Price: $149 per month
  • 15
    HStreamDB
    A streaming database is purpose-built to ingest, store, process, and analyze massive data streams. It is a modern data infrastructure that unifies messaging, stream processing, and storage to help you get value out of your data in real time. Ingest massive amounts of data continuously generated from various sources, such as IoT device sensors. Store millions of data streams reliably in a specially designed distributed streaming data storage cluster. Consume data streams in real time, as fast as from Kafka, by subscribing to topics in HStreamDB. With permanent data stream storage, you can play back and consume data streams anytime. Process data streams based on event time with the same familiar SQL syntax you use to query data in a relational database: you can use SQL to filter, transform, aggregate, and even join multiple data streams.
    Starting Price: Free
  • 16
    Kedro

    Kedro

    Kedro

    Kedro is the foundation for clean data science code. It borrows concepts from software engineering and applies them to machine-learning projects. A Kedro project provides scaffolding for complex data and machine-learning pipelines, so you spend less time on tedious "plumbing" and focus instead on solving new problems. Kedro standardizes how data science code is created and ensures teams collaborate to solve problems easily. Make a seamless transition from development to production by turning exploratory code into reproducible, maintainable, and modular experiments. A series of lightweight data connectors is used to save and load data across many different file formats and file systems (a minimal pipeline sketch follows this entry).
    Starting Price: Free
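    For a feel of the scaffolding, here is a minimal sketch of a Kedro pipeline; the function and dataset names are illustrative, and the named datasets would be declared in the project's Data Catalog.

    ```python
    from kedro.pipeline import node, pipeline

    def clean(raw_df):
        # Stand-in preprocessing step: drop rows with missing values.
        return raw_df.dropna()

    def summarize(clean_df):
        return clean_df.describe()

    # Each node maps named catalog datasets to a pure Python function,
    # and Kedro wires the nodes into a DAG from those names.
    data_pipeline = pipeline([
        node(clean, inputs="raw_data", outputs="clean_data"),
        node(summarize, inputs="clean_data", outputs="summary"),
    ])
    ```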
  • 17
    Tabular

    Tabular

    Tabular

    Tabular is an open table store from the creators of Apache Iceberg. Connect multiple computing engines and frameworks, and decrease query time and storage costs by up to 50%. Centralize enforcement of role-based access control (RBAC) policies. Connect any query engine or framework, including Athena, BigQuery, Redshift, Snowflake, Databricks, Trino, Spark, and Python. Smart compaction, clustering, and other automated data services reduce storage costs and query times by up to 50%. Unify data access at the database or table level. RBAC controls are simple to manage, consistently enforced, and easy to audit. Centralize your security down to the table. Tabular is easy to use, and it features high-powered ingestion, performance, and RBAC under the hood. Tabular gives you the flexibility to work with multiple “best of breed” compute engines based on their strengths, and to assign privileges at the data warehouse, database, table, or column level (see the Python sketch after this entry).
    Starting Price: $100 per month
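    Because Tabular tables are Apache Iceberg tables, any Iceberg-aware client can read them. A minimal, hypothetical PyIceberg sketch follows; the catalog URI, credential, and table name are placeholders, not real endpoints.

    ```python
    from pyiceberg.catalog import load_catalog

    catalog = load_catalog(
        "demo",
        **{
            "type": "rest",
            "uri": "https://api.tabular.example/ws",   # hypothetical endpoint
            "credential": "<client-id>:<client-secret>",
        },
    )

    table = catalog.load_table("analytics.events")
    # Push a filter and column selection down to the table scan,
    # then materialize the result locally.
    df = table.scan(
        row_filter="event_date >= '2024-01-01'",
        selected_fields=("event_id", "event_date"),
    ).to_pandas()
    print(df.head())
    ```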
  • 18
    STRM

    STRM

    STRM

    Creating and managing data policies is a slow, painful process. With PACE by STRM, you can make sure data is used securely by applying data policies through code, wherever the data lives. Say farewell to long waits and costly meetings, and meet your new open source data security engine. Data policies aren't just about controlling access; they are about extracting value from data with the right guardrails. PACE lets you collaborate on the why and when, while automating the how through code. With PACE you can programmatically define and apply data policies across platforms, integrated into your data platform and (optionally) catalog, leveraging the native capabilities of the stack you already have. PACE enables automated policy application across key data platforms and catalogs to ease your governance processes. Simplify policy creation and implementation, centralize control, and decentralize execution. Fulfill auditing obligations by simply showing how controls are implemented.
    Starting Price: Free
  • 19
    Artie

    Artie

    Artie

    Stream only the data that has changed to the destination, eliminating data latency and reducing computational overhead. Change data capture (CDC) is a highly efficient method of syncing data: log-based replication is a non-intrusive way to replicate data in real time without impacting source database performance. Set up the end-to-end solution in minutes, with zero pipeline maintenance, and let your data teams work on higher-value projects. Setting up Artie takes just a few simple steps. Artie handles backfilling historical data and continuously streams new changes to the final table as they occur. Artie ensures data consistency and high reliability: in the event of an outage, it leverages offsets in Kafka to pick up where it left off, which maintains high data integrity while avoiding the burden of full re-syncs (a conceptual sketch of offset-based resumption follows this entry).
    Starting Price: $231 per month
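    To illustrate the resume-from-offset idea in general terms (this is a conceptual sketch, not Artie's implementation), a Kafka consumer that commits offsets only after applying each change can restart after an outage without a full re-sync. The topic name and broker address below are hypothetical, using the kafka-python client.

    ```python
    import json
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        "cdc.public.orders",                  # hypothetical CDC change topic
        bootstrap_servers="localhost:9092",
        group_id="cdc-sync",                  # offsets are tracked per group
        enable_auto_commit=False,             # commit only after a change is applied
        value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    )

    for message in consumer:
        change = message.value
        # apply_to_destination(change)  # e.g., MERGE the row into the final table
        consumer.commit()  # the committed offset is the resume point after a crash
    ```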
  • 20
    Protegrity

    Protegrity

    Protegrity

    Our platform allows businesses to use data, including its application in advanced analytics, machine learning, and AI, to do great things without worrying about putting customers, employees, or intellectual property at risk. The Protegrity Data Protection Platform doesn't just secure data; it simultaneously classifies and discovers data while protecting it. You can't protect what you don't know you have. Our platform first classifies data, allowing users to categorize data types, from data that can safely remain in the public domain to data that must be protected. With those classifications established, the platform then leverages machine learning algorithms to discover data of each type. Classification and discovery find the data that needs to be protected. Whether encrypting, tokenizing, or applying privacy methods, the platform secures the data behind the many operational systems that drive the day-to-day functions of business, as well as the analytical systems behind decision-making.
  • 21
    Bluemetrix

    Bluemetrix

    Bluemetrix

    Migrating data to the cloud is difficult. Let us simplify the process for you with Bluemetrix Data Manager (BDM). BDM automates the ingestion of complex data sources and ensures that as your data sources change, your pipelines are automatically reconfigured to match. BDM applies automation and processes data at scale in a secure, modern environment, with smart GUI and API interfaces available. With data governance fully automated, pipeline creation is streamlined, and all actions are automatically recorded and stored in your catalog as the pipeline executes. Easy-to-create templates, combined with smart scheduling options, enable self-service capability for data consumers, both business and technical users. A free, enterprise-grade data ingestion tool automates the ingestion of data smoothly and quickly from on-premises sources into the cloud, while also automating the creation and running of pipelines.
  • 22
    Embeddable

    Embeddable

    Embeddable

    Build remarkable analytics experiences in 10% of the time. Frustrated with your embedded analytics tool, or with maintaining custom-built charts and dashboards in your app? Embeddable is a next-generation embedded analytics tool where you own the front-end code and we handle everything else, enabling you to build fully bespoke, fast-loading charts and dashboards in your app without the engineering costs. Delight your customers, reduce engineering overheads, and deliver your dream experience, fast. Compatible with all major databases. Cloud and self-hosted. Multi-tenancy. Open source component library, and more.
    Starting Price: On request
  • 23
    Chalk

    Chalk

    Chalk

    Powerful data engineering workflows, without the infrastructure headaches. Complex streaming, scheduling, and data backfill pipelines are all defined in simple, composable Python. Make ETL a thing of the past: fetch all of your data in real time, no matter how complex. Incorporate deep learning and LLMs into decisions alongside structured business data. Make better predictions with fresher data, don't pay vendors to pre-fetch data you don't use, and query data just in time for online predictions. Experiment in Jupyter, then deploy to production. Prevent train-serve skew and create new data workflows in milliseconds. Instantly monitor all of your data workflows in real time; track usage and data quality effortlessly. Know everything you computed, and replay any data. Integrate with the tools you already use and deploy to your own infrastructure. Decide and enforce withdrawal limits with custom hold times.
    Starting Price: Free
  • 24
    Datagaps ETL Validator
    Datagaps ETL Validator is a comprehensive data validation and ETL testing automation tool. It automates the testing of data migration and data warehouse projects with easy-to-use, low-code/no-code, component-based test creation and a drag-and-drop user interface. The ETL process involves extracting data from various sources, transforming it to fit operational needs, and loading it into a target database or data warehouse. ETL testing involves verifying the accuracy, integrity, and completeness of data as it moves through the ETL process to ensure it meets business rules and requirements. Automating ETL testing with tools that automate data comparison, validation, and transformation tests significantly speeds up the testing cycle and reduces manual labor. ETL Validator automates ETL testing by providing intuitive interfaces for creating test cases without extensive coding (a generic sketch of source-vs-target validation follows this entry).
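    The kind of source-vs-target check such tools automate can be sketched in plain Python; the connection strings and table names below are hypothetical, with pandas and SQLAlchemy standing in for the tool's comparison engine.

    ```python
    import pandas as pd
    from sqlalchemy import create_engine

    source = create_engine("postgresql://user:pass@source-db/sales")   # placeholder
    target = create_engine("postgresql://user:pass@warehouse/sales")   # placeholder

    src = pd.read_sql("SELECT id, amount FROM orders", source)
    tgt = pd.read_sql("SELECT id, amount FROM dim_orders", target)

    # Row-count and value comparisons catch missing or altered rows in the load.
    assert len(src) == len(tgt), "row counts differ between source and target"
    merged = src.merge(tgt, on="id", suffixes=("_src", "_tgt"))
    mismatches = merged[merged["amount_src"] != merged["amount_tgt"]]
    print(f"{len(mismatches)} mismatched rows")
    ```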
  • 25
    Harbr

    Harbr

    Harbr

    Create data products from any source in seconds, without moving the data. Make them available to anyone, while maintaining complete control. Deliver powerful experiences to unlock value. Enhance your data mesh by seamlessly sharing, discovering, and governing data across domains. Foster collaboration and accelerate innovation with unified access to high-quality data products. Provide governed access to AI models for any user. Control how data interacts with AI to safeguard intellectual property. Automate AI workflows to rapidly integrate and iterate new capabilities. Access and build data products from Snowflake without moving any data. Experience the ease of getting more from your data. Make it easy for anyone to analyze data and remove the need for centralized provisioning of infrastructure and tools. Data products are magically integrated with tools, to ensure governance and accelerate outcomes.
  • 26
    Databricks Data Intelligence Platform
    The Databricks Data Intelligence Platform allows your entire organization to use data and AI. It’s built on a lakehouse to provide an open, unified foundation for all data and governance, and is powered by a Data Intelligence Engine that understands the uniqueness of your data. The winners in every industry will be data and AI companies. From ETL to data warehousing to generative AI, Databricks helps you simplify and accelerate your data and AI goals. Databricks combines generative AI with the unification benefits of a lakehouse to power a Data Intelligence Engine that understands the unique semantics of your data. This allows the Databricks Platform to automatically optimize performance and manage infrastructure in ways unique to your business. The Data Intelligence Engine understands your organization’s language, so search and discovery of new data is as easy as asking a question like you would to a coworker.
  • 27
    Nucleon Database Master

    Nucleon Database Master

    Nucleon Software

    Nucleon Database Master is a modern, powerful, intuitive, and easy-to-use database query, administration, and management tool with a consistent, modern user interface. Database Master simplifies managing, monitoring, querying, editing, visualizing, and designing relational and NoSQL database management systems. It allows you to execute extended SQL, JQL, and C# (LINQ) query scripts, and provides access to all database objects, such as tables, views, procedures, packages, columns, indexes, relationships (constraints), collections, and triggers.
    Starting Price: $99 one-time payment
  • 28
    Mage Sensitive Data Discovery
    Uncover hidden sensitive data locations within your enterprise through Mage's patented Sensitive Data Discovery module. Find data hidden in all types of data stores and in the most obscure locations, whether structured, unstructured, big data, or in the cloud. Leverage the power of artificial intelligence and natural language processing to uncover data in the most complex of locations. Ensure efficient identification of sensitive data with minimal false positives through a patented approach to data discovery. Configure additional data classifications over and above the 70+ out-of-the-box classifications covering all popular PII and PHI data. Schedule sample, full, or even incremental scans through a simplified discovery process.
  • 29
    Azure Data Lake
    Azure Data Lake includes all the capabilities required to make it easy for developers, data scientists, and analysts to store data of any size, shape, and speed, and do all types of processing and analytics across platforms and languages. It removes the complexities of ingesting and storing all of your data while making it faster to get up and running with batch, streaming, and interactive analytics. Azure Data Lake works with existing IT investments for identity, management, and security for simplified data management and governance. It also integrates seamlessly with operational stores and data warehouses so you can extend current data applications. We’ve drawn on the experience of working with enterprise customers and running some of the largest scale processing and analytics in the world for Microsoft businesses like Office 365, Xbox Live, Azure, Windows, Bing, and Skype. Azure Data Lake solves many of the productivity and scalability challenges that prevent you from maximizing the value of your data.
  • 30
    Azure Data Lake Storage
    Eliminate data silos with a single storage platform. Optimize costs with tiered storage and policy management. Authenticate data using Azure Active Directory (Azure AD) and role-based access control (RBAC). And help protect data with security features like encryption at rest and advanced threat protection. Highly secure with flexible mechanisms for protection across data access, encryption, and network-level control. Single storage platform for ingestion, processing, and visualization that supports the most common analytics frameworks. Cost optimization via independent scaling of storage and compute, lifecycle policy management, and object-level tiering. Meet any capacity requirements and manage data with ease, with the Azure global infrastructure. Run large-scale analytics queries at consistently high performance.