Best Data Management Software for Apache Airflow - Page 2

Compare the Top Data Management Software that integrates with Apache Airflow as of October 2025

This is a list of Data Management software that integrates with Apache Airflow. View the products that work with Apache Airflow below.

  • 1
    Mode
    Mode Analytics
    Understand how users are interacting with your product and identify opportunity areas to inform your product decisions. Mode empowers a single analyst to do the work of a full data team through speed, flexibility, and collaboration. Build dashboards for annual revenue, then use chart visualizations to identify anomalies quickly. Create polished, investor-ready reports or share analysis with teams for collaboration. Connect your entire tech stack to Mode and identify upstream issues to improve performance. Speed up workflows across teams with APIs and webhooks. Leverage marketing and product data to fix weak spots in your funnel, improve landing-page performance, and understand churn before it happens.
  • 2
    IBM Databand
    Monitor your data health and pipeline performance. Gain unified visibility for pipelines running on cloud-native tools like Apache Airflow, Apache Spark, Snowflake, BigQuery, and Kubernetes. An observability platform purpose-built for data engineers. Data engineering is only getting more challenging as demands from business stakeholders grow, and Databand can help you catch up. More pipelines mean more complexity: data engineers are working with more complex infrastructure than ever and pushing higher speeds of release. It’s harder to understand why a process has failed, why it’s running late, and how changes affect the quality of data outputs. Data consumers are frustrated with inconsistent results, model performance, and delays in data delivery. Not knowing exactly what data is being delivered, or precisely where failures are coming from, leads to a persistent lack of trust. Pipeline logs, errors, and data quality metrics are captured and stored in independent, isolated systems.
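    To give a flavor of the instrumentation side, here is a minimal sketch using Databand's open source Python SDK (the dbnd package), assuming dbnd is installed and tracking is configured; the function, values, and metric names are illustrative, not from Databand's docs:

    ```python
    from dbnd import log_metric, task


    # Decorating a function lets Databand track its runs alongside pipeline logs,
    # so metrics and errors land in one system instead of isolated silos.
    @task
    def clean_orders(raw_rows: int) -> int:
        cleaned = raw_rows - 42  # placeholder cleaning logic
        log_metric("rows_dropped", raw_rows - cleaned)
        log_metric("rows_out", cleaned)
        return cleaned


    if __name__ == "__main__":
        clean_orders(raw_rows=10_000)
    ```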
  • 3
    lakeFS
    Treeverse
    lakeFS enables you to manage your data lake the way you manage your code. Run parallel pipelines for experimentation and CI/CD for your data, simplifying the lives of the engineers, data scientists, and analysts who are transforming the world with data. lakeFS is an open source platform that delivers resilience and manageability to object-storage-based data lakes. With lakeFS you can build repeatable, atomic, and versioned data lake operations, from complex ETL jobs to data science and analytics. lakeFS supports AWS S3, Azure Blob Storage, and Google Cloud Storage (GCS) as its underlying storage service. It is API-compatible with S3 and works seamlessly with modern data frameworks such as Spark, Hive, AWS Athena, and Presto. lakeFS provides a Git-like branching and committing model that scales to exabytes of data by utilizing S3, GCS, or Azure Blob for storage.
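    Because lakeFS speaks the S3 API, existing object-store clients can read from a branch by treating the repository as the bucket and the branch as the leading key component. A minimal sketch using boto3; the endpoint URL, credentials, repository, and paths are placeholders for your own deployment:

    ```python
    import boto3

    # Point the S3 client at the lakeFS gateway instead of AWS itself.
    s3 = boto3.client(
        "s3",
        endpoint_url="https://lakefs.example.com",
        aws_access_key_id="<lakefs-access-key>",
        aws_secret_access_key="<lakefs-secret-key>",
    )

    # Repository maps to the bucket, branch to the key prefix: reading from
    # an experiment branch leaves the main branch untouched.
    obj = s3.get_object(Bucket="my-repo", Key="experiment/data/events.parquet")
    print(obj["ContentLength"])
    ```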
  • 4
    Datafold
    Prevent data outages by identifying and fixing data quality issues before they get into production. Go from 0 to 100% test coverage of your data pipelines in a day. Know the impact of each code change with automatic regression testing across billions of rows. Automate change management, improve data literacy, achieve compliance, and reduce incident response time. Don’t let data incidents take you by surprise; be the first to know with automated anomaly detection. Datafold’s easily adjustable ML model adapts to seasonality and trend patterns in your data to construct dynamic thresholds. Save hours spent trying to understand data. Use the Data Catalog to find relevant datasets and fields, and explore distributions easily with an intuitive UI. Get interactive full-text search, data profiling, and consolidation of metadata in one place.
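    Datafold also maintains the open source data-diff package, which gives a flavor of the row-level comparison behind this kind of regression testing. A minimal sketch, assuming data-diff is installed; the connection strings, table name, and key column are placeholders, and this is not the full cloud product:

    ```python
    from data_diff import connect_to_table, diff_tables

    # Compare the same table between production and staging databases,
    # keyed on the primary key column "id" (all identifiers are placeholders).
    prod = connect_to_table("postgresql://user:pass@prod-host/analytics", "orders", "id")
    stage = connect_to_table("postgresql://user:pass@stage-host/analytics", "orders", "id")

    # diff_tables yields ('-', row) for rows only in the first table
    # and ('+', row) for rows only in the second.
    for sign, row in diff_tables(prod, stage):
        print(sign, row)
    ```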
  • 5
    Great Expectations
    Great Expectations is a shared, open standard for data quality. It helps data teams eliminate pipeline debt through data testing, documentation, and profiling. We recommend deploying within a virtual environment. If you’re not familiar with pip, virtual environments, notebooks, or git, you may want to check out the supporting resources. There are many amazing companies using Great Expectations these days. Check out some of our case studies with companies that we've worked closely with to understand how they are using Great Expectations in their data stack. Great Expectations Cloud is a fully managed SaaS offering, and we're taking on new private alpha members. Alpha members get first access to new features and input into the roadmap.
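    After a `pip install great_expectations` inside a virtual environment, a data test can be as small as the sketch below. This uses the classic pandas-wrapping API found in pre-1.0 releases (newer releases organize this around a data context and validators); the frame and column names are illustrative:

    ```python
    import great_expectations as gx
    import pandas as pd

    df = pd.DataFrame({
        "user_id": [1, 2, 3],
        "email": ["a@example.com", None, "c@example.com"],
    })

    # Wrap the frame so expectation methods become available on it.
    batch = gx.from_pandas(df)

    # An "expectation" is a declarative, testable assertion about the data.
    result = batch.expect_column_values_to_not_be_null("email")
    print(result.success)  # False: one email is null
    ```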
  • 6
    Meltano
    Meltano provides the ultimate flexibility in deployment options. Own your data stack, end to end. An ever-growing library of 300+ connectors that have been running in production for years. Run workflows in isolated environments, execute end-to-end tests, and version control everything. Open source gives you the power to build your ideal data stack. Define your entire project as code and collaborate confidently with your team. The Meltano CLI enables you to rapidly create your project, making it easy to start replicating data. Meltano is designed to be the best way to run dbt to manage your transformations. Your entire data stack is defined in your project, making it simple to deploy to production. Validate your changes in development before moving to CI, and in staging before moving to production.
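    Since Meltano pipelines are plain CLI invocations, scheduling one from Apache Airflow can be as simple as a BashOperator. A minimal sketch, assuming Airflow 2.4+ and a Meltano project at /opt/meltano with a tap-github extractor and target-postgres loader already added (project path and connector names are illustrative):

    ```python
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    with DAG(
        dag_id="meltano_el",
        start_date=datetime(2025, 1, 1),
        schedule="@daily",
        catchup=False,
    ) as dag:
        # "meltano run" executes the extract-load pipeline defined in the
        # project's meltano.yml; the project itself is version-controlled code.
        run_pipeline = BashOperator(
            task_id="run_tap_to_target",
            bash_command="cd /opt/meltano && meltano run tap-github target-postgres",
        )
    ```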
  • 7
    Metaphor
    Metaphor Data
    Automatically index warehouses, lakes, dashboards, and other pieces of your data stack. Combined with utilization, lineage, and other social popularity signals, Metaphor lets you show the most trusted data to your users. Provide an open, 360-degree view of your data, and of conversations about data, to everyone in the organization. Meet your customers where they are: share artifacts from the catalog, including documentation, natively via Slack. Tag insightful Slack conversations and associate them with data. Collaborate across silos through the organic discovery of important terms and usage patterns. Easily discover data across the entire stack, and write technical details and business-friendly wiki pages that are easily consumed by non-technical users. Support your users directly in Slack and use the catalog as a data enablement tool to quickly onboard users for a more personalized experience.
  • 8
    rudol
    Unify your data catalog, reduce communication overhead, and enable quality control for every member of your company, all without deploying or installing anything. rudol is a data quality platform that helps companies understand all their data sources, no matter where they come from; reduces excessive communication in reporting processes or urgencies; and enables data quality diagnosis and issue prevention across the whole company through easy steps. With rudol, each organization can add data sources from a growing list of providers and BI tools with a standardized structure, including MySQL, PostgreSQL, Airflow, Redshift, Snowflake, Kafka, S3*, BigQuery*, MongoDB*, Tableau*, PowerBI*, and Looker* (* in development). So, regardless of where the data is coming from, people can understand where and how it is stored, read and collaborate on its documentation, or easily contact data owners using our integrations.
    Starting Price: $0
  • 9
    Acryl Data
    No more data catalog ghost towns. Acryl Cloud drives fast time-to-value via shift-left practices for data producers and an intuitive UI for data consumers. Continuously detect data quality incidents in real time, automate anomaly detection to prevent breakages, and drive fast resolution when they do occur. Acryl Cloud supports both push-based and pull-based metadata ingestion for easy maintenance, ensuring information is trustworthy, up-to-date, and definitive. Data should be operational: go beyond simple visibility and use automated metadata tests to continuously expose data insights and surface new areas for improvement. Reduce confusion and accelerate resolution with clear asset ownership, automatic detection, streamlined alerts, and time-based lineage for tracing root causes.
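    Acryl builds on the open source DataHub project, where push-based ingestion means a pipeline emits metadata directly as it runs. A minimal sketch using the acryl-datahub Python emitter; the server URL, platform, dataset name, and description are placeholders:

    ```python
    from datahub.emitter.mce_builder import make_dataset_urn
    from datahub.emitter.mcp import MetadataChangeProposalWrapper
    from datahub.emitter.rest_emitter import DatahubRestEmitter
    from datahub.metadata.schema_classes import DatasetPropertiesClass

    # Push a dataset description to the metadata service as part of a pipeline
    # run, so the catalog stays current without waiting for a scheduled crawl.
    emitter = DatahubRestEmitter(gms_server="http://localhost:8080")

    mcp = MetadataChangeProposalWrapper(
        entityUrn=make_dataset_urn(platform="snowflake", name="analytics.orders"),
        aspect=DatasetPropertiesClass(description="Orders fact table, updated daily."),
    )
    emitter.emit(mcp)
    ```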
  • 10
    Pantomath
    Organizations continuously strive to be more data-driven, building dashboards, analytics, and data pipelines across the modern data stack. Unfortunately, most organizations struggle with data reliability issues, leading to poor business decisions and an organizational lack of trust in data that directly impacts their bottom line. Resolving complex data issues is a manual and time-consuming process involving multiple teams, all relying on tribal knowledge to manually reverse engineer complex data pipelines across different platforms to identify root causes and understand their impact. Pantomath is a data pipeline observability and traceability platform for automating data operations. It continuously monitors datasets and jobs across the enterprise data ecosystem, providing context for complex data pipelines by creating automated cross-platform technical pipeline lineage.
  • 11
    SDF
    SDF is a developer platform for data that enhances SQL comprehension across organizations, enabling data teams to unlock the full potential of their data. It provides a transformation layer to streamline query writing and management, an analytical database engine for local execution, and an accelerator for improved transformation processes. SDF also offers proactive quality and governance features, including reports, contracts, and impact analysis, to ensure data integrity and compliance. By representing business logic as code, SDF facilitates the classification and management of data types, enhancing the clarity and maintainability of data models. It integrates seamlessly with existing data workflows, supporting various SQL dialects and cloud environments, and is designed to scale with the growing needs of data teams. SDF's open-core architecture, built on Apache DataFusion, allows for customization and extension, fostering a collaborative ecosystem for data development.
  • 12
    Soda
    Soda drives your data operations by identifying data issues, alerting the right people, and helping teams diagnose and resolve root causes. With automated and self-serve data monitoring capabilities, no data—or people—are ever left in the dark. Get ahead of data issues quickly by delivering full observability through easy instrumentation across your data workloads. Empower data teams to discover data issues that automation will miss. Self-service capabilities deliver the broad coverage that data monitoring needs. Alert the right people at the right time to help teams across the business diagnose, prioritize, and fix data issues. With Soda, your data never leaves your private cloud. Soda monitors data at the source and only stores metadata in your cloud.
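    The open source Soda Core library shows the basic instrumentation pattern. A minimal sketch, assuming soda-core is installed and that the referenced YAML files exist in your project (the data source name and file names are placeholders):

    ```python
    from soda.scan import Scan

    # A scan runs SodaCL checks against a data source and reports outcomes;
    # data is checked at the source, and only check metadata leaves it.
    scan = Scan()
    scan.set_data_source_name("warehouse")
    scan.add_configuration_yaml_file("configuration.yml")
    scan.add_sodacl_yaml_file("checks.yml")

    exit_code = scan.execute()
    print(scan.get_scan_results())  # dict of check results for alerting/diagnosis
    print(exit_code)  # non-zero when checks fail
    ```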