Alternatives to AWS Glue

Compare AWS Glue alternatives for your business or organization using the curated list below. SourceForge ranks the best alternatives to AWS Glue in 2024. Compare features, ratings, user reviews, pricing, and more from AWS Glue competitors and alternatives in order to make an informed decision for your business.

  • 1
    Google Cloud BigQuery
    BigQuery is a serverless, multicloud data warehouse that simplifies the process of working with all types of data so you can focus on getting valuable business insights quickly. At the core of Google’s data cloud, BigQuery allows you to simplify data integration, cost effectively and securely scale analytics, share rich data experiences with built-in business intelligence, and train and deploy ML models with a simple SQL interface, helping to make your organization’s operations more data-driven.
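BigQuery's "simple SQL interface" for ML refers to BigQuery ML statements such as CREATE MODEL and ML.PREDICT. The sketch below only builds those SQL strings; the project, dataset, table, and column names are hypothetical placeholders, and actually running them would go through a client such as google-cloud-bigquery's `Client.query`.

```python
# Sketch: the style of SQL BigQuery ML accepts for training and prediction.
# All project/dataset/table names below are illustrative placeholders.

def make_training_sql(model: str, source_table: str, label: str) -> str:
    """Build a CREATE MODEL statement (BigQuery ML linear regression)."""
    return (
        f"CREATE OR REPLACE MODEL `{model}` "
        f"OPTIONS(model_type='linear_reg', input_label_cols=['{label}']) AS "
        f"SELECT * FROM `{source_table}`"
    )

def make_prediction_sql(model: str, source_table: str) -> str:
    """Build an ML.PREDICT query against a trained model."""
    return (
        f"SELECT * FROM ML.PREDICT(MODEL `{model}`, "
        f"(SELECT * FROM `{source_table}`))"
    )

train_sql = make_training_sql("my_project.sales.fare_model",
                              "my_project.sales.training_rows", "fare")
predict_sql = make_prediction_sql("my_project.sales.fare_model",
                                  "my_project.sales.new_rows")
```

With a BigQuery client in hand, each string would be submitted as an ordinary query job; the point is that both training and inference are plain SQL.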
  • 2
    Pentaho
    Hitachi Vantara
    Accelerate data-driven transformation powered by intelligent data operations across your edge-to-multi-cloud data fabric. Pentaho lets you automate the daily tasks of collecting, integrating, governing, and analyzing data on an intelligent platform that provides an open and composable foundation for all enterprise data. Schedule your free demo to learn more about Pentaho Integration and Analytics, Data Catalog, and Storage Optimizer.
  • 3
    Minitab Connect
    The best insights are based on the most complete, most accurate, and most timely data. Minitab Connect empowers data users from across the enterprise with self-serve tools to transform diverse data into a governed network of data pipelines, feed analytics initiatives, and foster organization-wide collaboration. Flexible, intuitive integration tools let users connect, blend, and explore data from a variety of internal and external sources, such as databases, data warehouses, data lakes, cloud and on-premises apps, IoT devices, SaaS applications, unstructured data, spreadsheets, and email. Automated workflows accelerate every step of the data integration process, while powerful data preparation and visualization tools help yield transformative insights.
  • 4
    IRI Voracity
    IRI, The CoSort Company
    Voracity is the only high-performance, all-in-one data management platform accelerating AND consolidating the key activities of data discovery, integration, migration, governance, and analytics. Voracity helps you control your data in every stage of the lifecycle, and extract maximum value from it. Only in Voracity can you:
    1) CLASSIFY, profile and diagram enterprise data sources
    2) Speed or LEAVE legacy sort and ETL tools
    3) MIGRATE data to modernize and WRANGLE data to analyze
    4) FIND PII everywhere and consistently MASK it for referential integrity
    5) Score re-ID risk and ANONYMIZE quasi-identifiers
    6) Create and manage DB subsets or intelligently synthesize TEST data
    7) Package, protect and provision BIG data
    8) Validate, scrub, enrich and unify data to improve its QUALITY
    9) Manage metadata and MASTER data
    Use Voracity to comply with data privacy laws, de-muck and govern the data lake, improve the reliability of your analytics, and create safe, smart test data.
  • 5
    TIMi
    With TIMi, companies can capitalize on their corporate data to develop new ideas and make critical business decisions faster and easier than ever before. The heart of TIMi's integrated platform comprises its real-time auto-ML engine, 3D VR segmentation and visualization, and unlimited self-service business intelligence. TIMi is several orders of magnitude faster than any other solution at the two most important analytical tasks: the handling of datasets (data cleaning, feature engineering, creation of KPIs) and predictive modeling. TIMi is an "ethical solution": no "lock-in" situation, just excellence. We guarantee you can work with complete peace of mind and without unexpected extra costs. Thanks to an original and unique software infrastructure, TIMi is optimized to offer you the greatest flexibility for the exploration phase and the highest reliability during the production phase. TIMi is the ultimate "playground" that allows your analysts to test the craziest ideas!
  • 6
    Composable DataOps Platform
    Composable Analytics
    Composable is an enterprise-grade DataOps platform built for business users who want to architect data intelligence solutions and deliver operational, data-driven products leveraging disparate data sources, live feeds, and event data, regardless of the format or structure of the data. With a modern, intuitive visual dataflow designer, built-in services to facilitate data engineering, and a composable architecture that enables abstraction and integration of any software or analytical approach, Composable is the leading integrated development environment to discover, manage, transform, and analyze enterprise data.
  • 7
    HERE
    HERE Technologies
    HERE is the #1 location platform for developers, ranked above Google, Mapbox and TomTom for mapping quality. Make the switch to enhance your offering and take advantage of greater monetization opportunities. Bring rich location data, intelligent products and powerful tools together to drive your business forward. HERE lets you add location-aware capabilities to your apps and online services with free access to over 20 market-leading APIs, including mapping, geocoding, routing, traffic, weather and more. Plus, when you sign up for HERE Freemium you’ll also gain access to the HERE XYZ map builder, which comes with 5GB of free storage for all your geodata. No matter your skill level you can get started right away with industry-leading mapping and location technology. Configure our location services with your data and business insights, and build differentiated solutions. Integrate with ease into your application or solution with standardized APIs and SDKs.
    Starting Price: $0.08 per GB
  • 8
    IBM DataStage
    Accelerate AI innovation with cloud-native data integration on IBM Cloud Pak for Data. AI-powered data integration, anywhere. Your AI and analytics are only as good as the data that fuels them. With a modern container-based architecture, IBM® DataStage® for IBM Cloud Pak® for Data delivers that high-quality data. It combines industry-leading data integration with DataOps, governance, and analytics on a single data and AI platform. Automation accelerates administrative tasks to help reduce TCO. AI-based design accelerators and out-of-the-box integration with DataOps and data science services speed AI innovation. Parallelism and multicloud integration let you deliver trusted data at scale across hybrid or multicloud environments. Manage the data and analytics lifecycle on the IBM Cloud Pak for Data platform, with services including data science, event messaging, data virtualization, and data warehousing, all backed by a parallel engine and automated load balancing.
  • 9
    Informatica Intelligent Data Management Cloud
    Our AI-powered Intelligent Data Platform is the industry's most comprehensive and modular platform. It helps you unleash the value of data across your enterprise—and empowers you to solve your most complex problems. Our platform defines a new standard for enterprise-class data management. We deliver best-in-class products and an integrated platform that unifies them, so you can power your business with intelligent data. Connect to any data from any source—and scale with confidence. You’re backed by a global platform that processes over 15 trillion cloud transactions every month. Future-proof your business with an end-to-end platform that delivers trusted data at scale across data management use cases. Our AI-powered architecture supports integration patterns and allows you to grow and evolve at your own speed. Our solution is modular, microservices-based and API-driven.
  • 10
    Matillion
    Cloud-Native ETL Tool. Load and Transform Data To Your Cloud Data Warehouse In Minutes. We reversed the traditional ETL process to create a solution that performs data integration within the cloud itself. Our solution utilizes the near-infinite storage capacity of the cloud—meaning your projects get near-infinite scalability. By working in the cloud, we reduce the complexity involved in moving large amounts of data. Process a billion rows of data in fifteen minutes—and go from launch to live in just five. Modern businesses seeking a competitive advantage must harness their data to gain better business insights. Matillion enables your data journey by extracting, migrating and transforming your data in the cloud allowing you to gain new insights and make better business decisions.
  • 11
    Semarchy xDI
    Experience Semarchy’s flexible unified data platform to empower better business decisions enterprise-wide. Integrate all your data with xDI, the high-performance, agile, and extensible data integration for all styles and use cases. Its single technology federates all forms of data integration, and mapping converts business rules into deployable code. xDI has extensible and open architecture supporting on-premise, cloud, hybrid, and multi-cloud environments.
  • 12
    Stitch
    Talend
    Stitch is a cloud-based platform for ETL – extract, transform, and load. More than a thousand companies use Stitch to move billions of records every day from SaaS applications and databases into data warehouses and data lakes.
  • 13
    StreamSets
    StreamSets DataOps Platform. The data integration platform to build, run, monitor, and manage smart data pipelines that deliver continuous data for DataOps and power modern analytics and hybrid integration. Only StreamSets provides a single design experience for all design patterns for 10x greater developer productivity; smart data pipelines that are resilient to change for 80% fewer breakages; and a single pane of glass for managing and monitoring all pipelines across hybrid and cloud architectures to eliminate blind spots and control gaps. With StreamSets, you can deliver the continuous data that drives the connected enterprise.
    Starting Price: $1000 per month
  • 14
    Alation
    Alation is the first company to bring a data catalog to market. It radically improves how people find, understand, trust, use, and reuse data. Alation pioneered active, non-invasive data governance, which supports both data democratization and compliance at scale, so people have the data they need alongside guidance on how to use it correctly. By combining human insight with AI and machine learning, Alation tackles the toughest challenges in data today. More than 350 enterprises use Alation to make confident, data-driven decisions. American Family Insurance, Exelon, Munich Re, and Pfizer are all proud customers.
  • 15
    Amazon Athena
    Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run. Athena is easy to use. Simply point to your data in Amazon S3, define the schema, and start querying using standard SQL. Most results are delivered within seconds. With Athena, there’s no need for complex ETL jobs to prepare your data for analysis. This makes it easy for anyone with SQL skills to quickly analyze large-scale datasets. Athena is out-of-the-box integrated with AWS Glue Data Catalog, allowing you to create a unified metadata repository across various services, crawl data sources to discover schemas and populate your Catalog with new and modified table and partition definitions, and maintain schema versioning.
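Athena queries are typically submitted through the AWS SDK. The sketch below only assembles the parameters that boto3's Athena `start_query_execution` call expects; the database, table, and S3 bucket names are hypothetical, and the network call itself is left commented out since it requires AWS credentials.

```python
# Sketch of the parameters boto3's Athena client takes for
# start_query_execution. Names and paths below are illustrative placeholders.

def athena_query_params(sql: str, database: str, output_s3: str) -> dict:
    """Bundle a standard-SQL query with its execution context."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_s3},
    }

params = athena_query_params(
    "SELECT region, COUNT(*) AS n FROM access_logs GROUP BY region",
    database="weblogs",
    output_s3="s3://my-athena-results/",
)
# import boto3
# athena = boto3.client("athena")
# execution = athena.start_query_execution(**params)  # needs AWS credentials
```

Because Athena is serverless, this is the whole job submission: no cluster to start, and results land in the configured S3 output location.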
  • 16
    Airbyte
    Get all your ELT data pipelines running in minutes, even your custom ones, and let your team focus on insights and innovation. Unify your data integration pipelines in one open-source ELT platform. Airbyte addresses all your data team's connector needs, however custom they are and whatever your scale, from high-volume databases to the long tail of API sources. Leverage Airbyte's long tail of high-quality connectors that adapt to schema and API changes. Edit pre-built open-source connectors, or build new ones with the connector development kit in a few hours, to unify all native and custom ELT. Finally, transparent and predictable cost-based pricing that scales with your data needs: no more worrying about volume, and no more custom systems for your in-house scripts or database replication.
    Starting Price: $2.50 per credit
  • 17
    AWS Data Pipeline
    AWS Data Pipeline is a web service that helps you reliably process and move data between different AWS compute and storage services, as well as on-premises data sources, at specified intervals. With AWS Data Pipeline, you can regularly access your data where it’s stored, transform and process it at scale, and efficiently transfer the results to AWS services such as Amazon S3, Amazon RDS, Amazon DynamoDB, and Amazon EMR. AWS Data Pipeline helps you easily create complex data processing workloads that are fault tolerant, repeatable, and highly available. You don’t have to worry about ensuring resource availability, managing inter-task dependencies, retrying transient failures or timeouts in individual tasks, or creating a failure notification system. AWS Data Pipeline also allows you to move and process data that was previously locked up in on-premises data silos.
    Starting Price: $1 per month
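A Data Pipeline is declared as a JSON document of pipeline objects. The sketch below is a hypothetical minimal definition in that "objects" style, a schedule driving a shell activity; the ids, command, and paths are placeholders, and a real pipeline would also need runner resources and IAM roles.

```python
# Hypothetical sketch of an AWS Data Pipeline definition: a daily schedule
# driving a ShellCommandActivity that copies exports to S3. All ids, paths,
# and the command string are illustrative placeholders.

import json

pipeline = {
    "objects": [
        {"id": "DailySchedule", "type": "Schedule",
         "period": "1 day", "startAt": "FIRST_ACTIVATION_DATE_TIME"},
        {"id": "CopyToS3", "type": "ShellCommandActivity",
         "schedule": {"ref": "DailySchedule"},
         "command": "aws s3 cp /var/exports/ s3://my-bucket/exports/ --recursive"},
    ]
}
definition = json.dumps(pipeline, indent=2)  # what you would upload to the service
```

The service then handles the scheduling, retries on transient failures, and dependency ordering described above, so the definition stays declarative.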
  • 18
    Apache Atlas
    Apache Software Foundation
    Atlas is a scalable and extensible set of core foundational governance services, enabling enterprises to effectively and efficiently meet their compliance requirements within Hadoop while integrating with the whole enterprise data ecosystem. Apache Atlas provides open metadata management and governance capabilities for organizations to build a catalog of their data assets, classify and govern these assets, and provide collaboration capabilities around them for data scientists, analysts, and the data governance team. It ships with pre-defined types for various Hadoop and non-Hadoop metadata, plus the ability to define new types for the metadata to be managed. Types can have primitive attributes, complex attributes, and object references, and can inherit from other types. Instances of types, called entities, capture metadata object details and their relationships. REST APIs for working with types and instances allow easier integration.
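The REST APIs mentioned above live under `/api/atlas/v2`. As a sketch, the snippet below builds a URL for Atlas's basic-search endpoint; the host, port, and search terms are hypothetical placeholders, and the HTTP request itself (which would need authentication) is left commented out.

```python
# Sketch: constructing a call to Apache Atlas's v2 basic-search endpoint.
# The host/port and the example type and query term are placeholders.

from urllib.parse import urlencode

ATLAS_BASE = "http://atlas.example.com:21000/api/atlas/v2"

def basic_search_url(type_name: str, query: str = "") -> str:
    """Build the URL to search the catalog for entities of a given type."""
    params = {"typeName": type_name}
    if query:
        params["query"] = query
    return f"{ATLAS_BASE}/search/basic?{urlencode(params)}"

url = basic_search_url("hive_table", "sales")
# import urllib.request
# with urllib.request.urlopen(url) as resp:   # real calls need auth
#     entities = resp.read()
```

The same API family exposes endpoints for type definitions and entities, which is how external tools integrate with the catalog.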
  • 19
    Apache Beam
    Apache Software Foundation
    The easiest way to do batch and streaming data processing. Write once, run anywhere data processing for mission-critical production workloads. Beam reads your data from a diverse set of supported sources, no matter if it’s on-prem or in the cloud. Beam executes your business logic for both batch and streaming use cases. Beam writes the results of your data processing logic to the most popular data sinks in the industry. A simplified, single programming model for both batch and streaming use cases for every member of your data and application teams. Apache Beam is extensible, with projects such as TensorFlow Extended and Apache Hop built on top of Apache Beam. Execute pipelines on multiple execution environments (runners), providing flexibility and avoiding lock-in. Open, community-based development and support to help evolve your application and meet the needs of your specific use cases.
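The "write once, run anywhere" claim rests on Beam's single programming model: the same transform logic runs over bounded (batch) and unbounded (streaming) sources. The sketch below illustrates that idea in plain Python rather than Beam's actual PTransform API: one function, applied unchanged to a finite list and to a generator standing in for a stream.

```python
# Conceptual sketch (plain Python, NOT Beam's API) of a unified batch and
# streaming model: the business logic is written once, independent of whether
# the source is bounded or unbounded.

from typing import Iterable, Iterator

def clean_and_count(words: Iterable[str]) -> Iterator[tuple[str, int]]:
    """The transform, written once: normalize words and emit (word, 1) pairs."""
    for w in words:
        w = w.strip().lower()
        if w:  # drop empty elements
            yield (w, 1)

batch_source = ["Hello", " world ", ""]   # bounded input (a batch)

def stream_source():                       # simulated unbounded input
    yield from ["Hello", "again"]

batch_result = list(clean_and_count(batch_source))
stream_result = list(clean_and_count(stream_source()))
```

In real Beam code the function body would become `beam.Map`/`beam.Filter` steps in a pipeline, and the choice of runner (Dataflow, Flink, Spark, DirectRunner) is made at execution time, not in the logic.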
  • 20
    Azure Data Catalog
    In the new world of data, you can spend more time looking for data than you do analyzing it. Azure Data Catalog is an enterprise-wide metadata catalog that makes data asset discovery straightforward. It’s a fully managed service that lets you—from analyst to data scientist to data developer—register, enrich, discover, understand, and consume data sources. Work with data in the tool of your choice. Data Catalog lets you find the data you need and use it in the tools you choose. Your data stays where you want it, and Data Catalog helps you discover and work with it where you want, with an intuitive user experience. Increase broad adoption and continuous value creation across your data ecosystem. Data Catalog helps you get tips, tricks, and unwritten rules into an experience where everyone can get value. With Data Catalog, everyone can contribute. Democratize data asset discovery.
    Starting Price: $1 per user per month
  • 21
    Azure Data Factory
    Integrate data silos with Azure Data Factory, a service built for all data integration needs and skill levels. Easily construct ETL and ELT processes code-free within the intuitive visual environment, or write your own code. Visually integrate data sources using more than 90 natively built and maintenance-free connectors at no added cost. Focus on your data—the serverless integration service does the rest. Data Factory provides a data integration and transformation layer that works across your digital transformation initiatives. Data Factory can help independent software vendors (ISVs) enrich their SaaS apps with integrated hybrid data to deliver data-driven user experiences. Pre-built connectors and integration at scale enable you to focus on your users while Data Factory takes care of the rest.
  • 22
    Google Cloud Dataflow
    Unified stream and batch data processing that's serverless, fast, and cost-effective. Fully managed data processing service. Automated provisioning and management of processing resources. Horizontal autoscaling of worker resources to maximize resource utilization. OSS community-driven innovation with Apache Beam SDK. Reliable and consistent exactly-once processing. Streaming data analytics with speed. Dataflow enables fast, simplified streaming data pipeline development with lower data latency. Allow teams to focus on programming instead of managing server clusters as Dataflow’s serverless approach removes operational overhead from data engineering workloads. Dataflow automates provisioning and management of processing resources to minimize latency and maximize utilization.
  • 23
    Google Cloud Data Catalog
    A fully managed and highly scalable data discovery and metadata management service. New customers get $300 in free credits to spend on Google Cloud during the Free Trial. All customers get up to 1 MiB of business or ingested metadata storage and 1 million API calls, free of charge. Pinpoint your data with a simple but powerful faceted-search interface. Sync technical metadata automatically and create schematized tags for business metadata. Tag sensitive data automatically, through Cloud Data Loss Prevention (DLP) integration. Get access immediately then scale without infrastructure to set up or manage. Empower any user on the team to find or tag data with a powerful UI, built with the same search technology as Gmail, or via API access. Data Catalog is fully managed, so you can start and scale effortlessly. Enforce data security policies and maintain compliance through Cloud IAM and Cloud DLP integrations.
    Starting Price: $100 per GiB per month
  • 24
    Xplenty
    Xplenty Data Integration
    Xplenty, a scalable data integration and delivery software, allows SMBs and large enterprises to prepare and transfer data for analytics to the cloud. Xplenty features include data transformations, drag-and-drop interface, and integration with over 100 data stores and SaaS applications. Xplenty can be added by developers to their data solution stack with ease. Xplenty also allows users to schedule jobs and monitor job progress and status.
  • 25
    Hevo
    Hevo Data
    Hevo Data is a no-code, bi-directional data pipeline platform specially built for modern ETL, ELT, and Reverse ETL Needs. It helps data teams streamline and automate org-wide data flows that result in a saving of ~10 hours of engineering time/week and 10x faster reporting, analytics, and decision making. The platform supports 100+ ready-to-use integrations across Databases, SaaS Applications, Cloud Storage, SDKs, and Streaming Services. Over 500 data-driven companies spread across 35+ countries trust Hevo for their data integration needs. Try Hevo today and get your fully managed data pipelines up and running in just a few minutes.
    Starting Price: $249/month
  • 26
    Data Virtuality
    Connect and centralize data. Transform your existing data landscape into a flexible data powerhouse. Data Virtuality is a data integration platform for instant data access, easy data centralization and data governance. Our Logical Data Warehouse solution combines data virtualization and materialization for the highest possible performance. Build your single source of data truth with a virtual layer on top of your existing data environment for high data quality, data governance, and fast time-to-market. Hosted in the cloud or on-premises. Data Virtuality has 3 modules: Pipes, Pipes Professional, and Logical Data Warehouse. Cut down your development time by up to 80%. Access any data in minutes and automate data workflows using SQL. Use Rapid BI Prototyping for significantly faster time-to-market. Ensure data quality for accurate, complete, and consistent data. Use metadata repositories to improve master data management.
  • 27
    Denodo
    Denodo Technologies
    The core technology to enable modern data integration and data management solutions. Quickly connect disparate structured and unstructured sources. Catalog your entire data ecosystem. Data stays in the sources and it is accessed on demand, with no need to create another copy. Build data models that suit the needs of the consumer, even across multiple sources. Hide the complexity of your back-end technologies from the end users. The virtual model can be secured and consumed using standard SQL and other formats like REST, SOAP and OData. Easy access to all types of data. Full data integration and data modeling capabilities. Active Data Catalog and self-service capabilities for data & metadata discovery and data preparation. Full data security and data governance capabilities. Fast intelligent execution of data queries. Real-time data delivery in any format. Ability to create data marketplaces. Decoupling of business applications from data systems to facilitate data-driven strategies.
  • 28
    Talend Data Catalog
    Talend Data Catalog gives your organization a single, secure point of control for your data. With robust tools for search and discovery, and connectors to extract metadata from virtually any data source, Data Catalog makes it easy to protect your data, govern your analytics, manage data pipelines, and accelerate your ETL processes. Data Catalog automatically crawls, profiles, organizes, links, and enriches all your metadata. Up to 80% of the information associated with the data is documented automatically and kept up-to-date through smart relationships and machine learning, continually delivering the most current data to the user. Make data governance a team sport with a secure single point of control where you can collaborate to improve data accessibility, accuracy, and business relevance. Support data privacy and regulatory compliance with intelligent data lineage tracing and compliance tracking.
  • 29
    TIBCO Cloud Metadata
    One challenge in metadata management is the lack of connection between silos of metadata used in IT, operations, analytics, and compliance. TIBCO Cloud™ Metadata software is a single solution that spans all your metadata: data dictionaries, business glossaries, and data catalogs. Built-in artificial intelligence (AI) and machine learning (ML) algorithms facilitate metadata classification and data lineages (horizontal, vertical, regulatory). Deliver the data context, coherency, and control you need to achieve the highest efficiency, best performance, and smartest decision-making across all your teams and departments. Effective execution, analysis, and governance require accurate and consistent metadata about your operations, analytics, and compliance efforts. Instead of multiple silos, use a single solution. Discover, harvest, and manage metadata from all the applications, databases, data lakes, enterprise data warehouses, APIs, social media, and streaming sources you use.
  • 30
    IBM InfoSphere Information Server
    Set up cloud environments quickly for ad hoc development, testing and productivity for your IT and business users. Reduce the risks and costs of maintaining your data lake by implementing comprehensive data governance, including end-to-end data lineage, for business users. Improve cost savings by delivering clean, consistent and timely information for your data lakes, data warehouses or big data projects, while consolidating applications and retiring outdated databases. Take advantage of automatic schema propagation to speed up job generation, type-ahead search, and backwards capability, while designing once and executing anywhere. Create data integration flows and enforce data governance and quality rules with a cognitive design that recognizes and suggests usage patterns. Improve visibility and information governance by enabling complete, authoritative views of information with proof of lineage and quality.
    Starting Price: $16,500 per month
  • 31
    Rocket Data Virtualization
    Traditional methods of integrating mainframe data, such as ETL, data warehouses, and hand-built connectors, are simply not fast, accurate, or efficient enough for business today. More data than ever before is being created and stored on the mainframe, leaving these old methods further behind. Only data virtualization can close the ever-widening gap by automating the process of making mainframe data broadly accessible to developers and applications. You can curate (discover and map) your data once, then virtualize it for use anywhere, again and again. Finally, your data scales to your business ambitions. Data virtualization on z/OS eliminates the complexity of working with mainframe resources. Using data virtualization, you can knit data from multiple, disconnected sources into a single logical data source, making it much easier to connect mainframe data with your distributed applications. Combine mainframe data with location, social media, and other distributed data.
  • 32
    Lyftrondata
    Whether you want to build a governed delta lake or data warehouse, or simply want to migrate from your traditional database to a modern cloud data warehouse, do it all with Lyftrondata. Simply create and manage all of your data workloads on one platform by automatically building your pipeline and warehouse. Analyze it instantly with ANSI SQL and BI/ML tools, and share it without writing any custom code. Boost the productivity of your data professionals and shorten your time to value. Define, categorize, and find all data sets in one place. Share these data sets with other experts with zero coding and drive data-driven insights. This data sharing ability is perfect for companies that want to store their data once, share it with other experts, and use it multiple times, now and in the future. Define a dataset, apply SQL transformations, or simply migrate your SQL data processing logic to any cloud data warehouse.
  • 33
    K2View
    At K2View, we believe that every enterprise should be able to leverage its data to become as disruptive and agile as the best companies in its industry. We make this possible through our patented Data Product Platform, which creates and manages a complete and compliant dataset for every business entity – on demand, and in real time. The dataset is always in sync with its underlying sources, adapts to changes in the source structures, and is instantly accessible to any authorized data consumer. Data Product Platform fuels many operational use cases, including customer 360, data masking and tokenization, test data management, data migration, legacy application modernization, data pipelining and more – to deliver business outcomes in less than half the time, and at half the cost, of any other alternative. The platform inherently supports modern data architectures – data mesh, data fabric, and data hub – and deploys in cloud, on-premise, or hybrid environments.
  • 34
    IBM Cloud Pak for Data
    The biggest challenge to scaling AI-powered decision-making is unused data. IBM Cloud Pak® for Data is a unified platform that delivers a data fabric to connect and access siloed data on-premises or across multiple clouds without moving it. Simplify access to data by automatically discovering and curating it to deliver actionable knowledge assets to your users, while automating policy enforcement to safeguard use. Further accelerate insights with an integrated modern cloud data warehouse. Universally safeguard data usage with privacy and usage policy enforcement across all data. Use a modern, high-performance cloud data warehouse to achieve faster insights. Empower data scientists, developers and analysts with an integrated experience to build, deploy and manage trustworthy AI models on any cloud. Supercharge analytics with Netezza, a high-performance data warehouse.
    Starting Price: $699 per month
  • 35
    Informatica PowerCenter
    Embrace agility with the market-leading scalable, high-performance enterprise data integration platform. Support the entire data integration lifecycle, from jumpstarting the first project to ensuring successful mission-critical enterprise deployments. PowerCenter, the metadata-driven data integration platform, jumpstarts and accelerates data integration projects in order to deliver data to the business more quickly than manual hand coding. Developers and analysts collaborate, rapidly prototype, iterate, analyze, validate, and deploy projects in days instead of months. PowerCenter serves as the foundation for your data integration investments. Use machine learning to efficiently monitor and manage your PowerCenter deployments across domains and locations.
  • 36
    Validio
    See how your data assets are used, and get important insights about them such as popularity, utilization, quality, and schema coverage. Find and filter the data you need based on metadata tags and descriptions. Drive data governance and ownership across your organization. Stream-lake-warehouse lineage facilitates data ownership and collaboration, and an automatically generated field-level lineage map helps you understand the entire data ecosystem. Anomaly detection learns from your data and seasonality patterns, with automatic backfill from historical data. Machine learning-based thresholds are trained per data segment on actual data instead of metadata only.
  • 37
    erwin Data Catalog
    erwin Data Catalog by Quest is metadata management software that helps organizations learn what data they have and where it’s located, including data at rest and in motion. It tells you the data and metadata available for a certain topic so those particular sources and assets can be found quickly for analysis and decision-making. erwin Data Catalog automates the processes involved in harvesting, integrating, activating and governing enterprise data according to business requirements. This automation results in greater accuracy and faster time to value for data governance and digital transformation efforts, including data warehouse, data lake, data vault and other Big Data deployments, cloud migrations, etc. Metadata management is key to sustainable data governance and any other organizational effort for which data is key to the outcome. erwin Data Catalog automates enterprise metadata management, data mapping, data cataloging, code generation, data profiling and data lineage.
  • 38
    SAP Data Intelligence
Turn data chaos into data value with data intelligence. Connect, discover, enrich, and orchestrate disjointed data assets into actionable business insights at enterprise scale. SAP Data Intelligence is a comprehensive data management solution. As the data orchestration layer of SAP’s Business Technology Platform, it transforms distributed data sprawls into vital data insights, delivering innovation at scale. Provide your users with intelligent, relevant, and contextual insights with integration across the IT landscape. Integrate and orchestrate massive data volumes and streams at scale. Streamline, operationalize, and govern innovation driven by machine learning. Optimize governance and minimize compliance risk with comprehensive metadata management rules.
    Starting Price: $1.22 per month
  • 39
    Oracle Cloud Infrastructure Data Catalog
Oracle Cloud Infrastructure (OCI) Data Catalog is a metadata management service that helps data professionals discover data and support data governance. Designed specifically to work well with the Oracle ecosystem, it provides an inventory of assets, a business glossary, and a common metastore for data lakes. OCI Data Catalog is fully managed by Oracle, so you benefit from the security, reliability, performance, and scale of Oracle Cloud Infrastructure. Using REST APIs and SDKs, developers can integrate OCI Data Catalog’s capabilities into their custom applications. Using a trusted system for managing user identities and access privileges, administrators can control access to data catalog objects and capabilities to meet security requirements. Discover data assets across Oracle data stores on-premises and in the cloud to start gaining real value from data.
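As a rough sketch of what REST access looks like, the snippet below composes a list-catalogs request URL. The region, compartment OCID, and API version segment are placeholder assumptions (check the OCI Data Catalog API reference), and real calls also require OCI request signing, which the official SDKs handle for you; nothing is sent here.

```python
from urllib.parse import urlencode

# Hypothetical values -- substitute your own region and compartment OCID.
region = "us-ashburn-1"
api_version = "20190325"  # assumed Data Catalog API version; verify in the docs
base = f"https://datacatalog.{region}.oci.oraclecloud.com/{api_version}"

params = {"compartmentId": "ocid1.compartment.oc1..example", "limit": 10}
list_catalogs_url = f"{base}/catalogs?{urlencode(params)}"
```

In practice you would use the official SDK client rather than raw URLs, since it handles authentication, signing, and pagination.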
  • 40
Enterprise Enabler

Stone Bond Technologies

Enterprise Enabler unifies information across silos and scattered data for visibility across multiple sources in a single environment. Whether your data lives in the cloud, in siloed databases, on instruments, in Big Data stores, or in various spreadsheets and documents, Enterprise Enabler can integrate it all so you can make informed business decisions in real time. It does this by creating logical views of data from the original source locations, which means you can reuse, configure, test, deploy, and monitor all your data in a single integrated environment. Analyze your business data in one place as it occurs to maximize the use of assets, minimize costs, and improve and refine your business processes. Our time to value is 50-90% faster than typical implementations; we get your sources connected and running so you can start making business decisions based on real-time data.
  • 41
    Oracle Big Data Preparation
Oracle Big Data Preparation Cloud Service is a managed Platform as a Service (PaaS) cloud-based offering that enables you to rapidly ingest, repair, enrich, and publish large data sets with end-to-end visibility in an interactive environment. You can integrate your data with other Oracle Cloud Services, such as Oracle Business Intelligence Cloud Service, for downstream analysis. Profile metrics and visualizations are important features of Oracle Big Data Preparation Cloud Service. When a data set is ingested, you have visual access to the profile results and a summary of each column that was profiled, as well as the results of duplicate entity analysis run on your entire data set. Visualize governance tasks on the service Home page with easily understood runtime metrics, data health reports, and alerts. Keep track of your transforms and ensure that files are processed correctly. See the entire data pipeline, from ingestion to enrichment and publishing.
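The per-column profile metrics described above boil down to simple aggregate counts. The sketch below (plain Python over a made-up column, not Oracle's implementation) computes the kind of summary a profiling service reports: total rows, nulls, distinct values, and duplicates.

```python
def profile_column(values):
    """Summary metrics of the kind a profiling service reports per column."""
    non_null = [v for v in values if v is not None]
    return {
        "total": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "duplicates": len(non_null) - len(set(non_null)),
    }

# Illustrative column with missing and repeated values.
ages = [34, 29, None, 34, 51, None, 29]
age_profile = profile_column(ages)
```

At service scale the same counts are computed over the full data set (often with approximate distinct-count algorithms), but the metrics themselves are the ones shown here.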
  • 42
Orbit Analytics

Empower your business by leveraging a true self-service reporting and analytics platform. Powerful and scalable, Orbit’s operational reporting and business intelligence software enables users to create their own analytics and reports. Orbit Reporting + Analytics offers pre-built integration with enterprise resource planning (ERP) and key cloud business applications, including PeopleSoft, Oracle E-Business Suite, Salesforce, Taleo, and more. With Orbit, you can quickly and efficiently find answers from any data source, identify opportunities, and make smart, data-driven decisions. Orbit comes with more than 200 integrators and connectors that allow you to combine data from multiple data sources, so you can harness the power of collective knowledge to make informed decisions. Orbit Adapters connect with your key business systems and are designed to seamlessly inherit authentication, data security, and business roles, applying them to reporting.
  • 43
    Databricks Data Intelligence Platform
The Databricks Data Intelligence Platform allows your entire organization to use data and AI. It’s built on a lakehouse to provide an open, unified foundation for all data and governance, and is powered by a Data Intelligence Engine that understands the uniqueness of your data. The winners in every industry will be data and AI companies. From ETL to data warehousing to generative AI, Databricks helps you simplify and accelerate your data and AI goals. Databricks combines generative AI with the unification benefits of a lakehouse; because the Data Intelligence Engine understands the unique semantics of your data, the platform can automatically optimize performance and manage infrastructure in ways unique to your business. The engine also understands your organization’s language, so search and discovery of new data is as easy as asking a question like you would to a coworker.
  • 44
    1touch.io Inventa
Partial visibility into your data puts your organization at risk. 1touch.io combines a unique network analytics approach and powerful ML and AI techniques with unprecedented data lineage accuracy to continuously discover and catalog all your sensitive and protected data into a PII inventory and a Master Data Catalog. We automatically discover and analyze all usage of data and its lineage without relying on the organization’s knowledge of the existence or location of the data. A multilayer machine learning analytic engine gives us the ability to “read and understand” the data and link all the pieces into a full picture, represented as both a PII inventory and a Master Catalog. Finding your known and unknown sensitive data within your network allows for immediate risk reduction. Organizing your data flow to understand precise data lineage and business processes enables you to achieve core compliance requirements.
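To make the discovery idea concrete, here is a deliberately simple pattern-based sketch of sensitive-data scanning. This is ordinary regex matching in plain Python, not 1touch.io's ML-driven approach, and the patterns cover only two common PII shapes.

```python
import re

# Two illustrative PII patterns; real scanners use many more, plus context
# and ML-based validation to cut false positives.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def scan_for_pii(text):
    """Return {label: [matches]} for every pattern that fires on the text."""
    hits = {}
    for label, pattern in PII_PATTERNS.items():
        found = pattern.findall(text)
        if found:
            hits[label] = found
    return hits

record = "Contact ada@example.com, SSN 123-45-6789, order #4711."
findings = scan_for_pii(record)
```

Pattern matching alone cannot find PII it has no pattern for, which is exactly the gap that network-analytics and ML-based discovery aims to close.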
  • 45
    Zaloni Arena
    End-to-end DataOps built on an agile platform that improves and safeguards your data assets. Arena is the premier augmented data management platform. Our active data catalog enables self-service data enrichment and consumption to quickly control complex data environments. Customizable workflows that increase the accuracy and reliability of every data set. Use machine-learning to identify and align master data assets for better data decisioning. Complete lineage with detailed visualizations alongside masking and tokenization for superior security. We make data management easy. Arena catalogs your data, wherever it is and our extensible connections enable analytics to happen across your preferred tools. Conquer data sprawl challenges: Our software drives business and analytics success while providing the controls and extensibility needed across today’s decentralized, multi-cloud data complexity.
  • 46
    erwin Data Intelligence
    erwin Data Intelligence (erwin DI) combines data catalog and data literacy capabilities for greater awareness of and access to available data assets, guidance on their use, and guardrails to ensure data policies and best practices are followed. Automatically harvest, transform and feed metadata from a wide array of data sources, operational processes, business applications and data models into a central catalog. Then make it accessible and understandable via role-based, contextual views so stakeholders can make strategic decisions based on accurate insights. erwin DI supports enterprise data governance, digital transformation and any effort that relies on data for favorable outcomes. Schedule ongoing scans of metadata from the widest array of data sources. Easily map data elements from source to target, including data in motion, and harmonize data integration across platforms. Enable data consumers to define and discover data relevant to their roles.
    Starting Price: $299 per month
  • 47
Gathr

The only all-in-one data pipeline platform. Built from the ground up for a cloud-first world, Gathr is the only platform to handle all your data integration and engineering needs: ingestion, ETL, ELT, CDC, streaming analytics, data preparation, machine learning, advanced analytics, and more. With Gathr, anyone can build and deploy pipelines in minutes, irrespective of skill level. Create ingestion pipelines in minutes, not weeks. Ingest data from any source, deliver to any destination. Build applications quickly with a wizard-based approach. Replicate data in real-time using a templatized CDC app. Native integration for all sources and targets. Best-in-class capabilities with everything you need to succeed today and tomorrow. Choose between free and pay-per-use plans, or customize to your requirements.
  • 48
Dremio

    Dremio delivers lightning-fast queries and a self-service semantic layer directly on your data lake storage. No moving data to proprietary data warehouses, no cubes, no aggregation tables or extracts. Just flexibility and control for data architects, and self-service for data consumers. Dremio technologies like Data Reflections, Columnar Cloud Cache (C3) and Predictive Pipelining work alongside Apache Arrow to make queries on your data lake storage very, very fast. An abstraction layer enables IT to apply security and business meaning, while enabling analysts and data scientists to explore data and derive new virtual datasets. Dremio’s semantic layer is an integrated, searchable catalog that indexes all of your metadata, so business users can easily make sense of your data. Virtual datasets and spaces make up the semantic layer, and are all indexed and searchable.
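Virtual datasets in Dremio are defined with SQL. The sketch below composes such a definition as a string; the space, dataset name, and source path are illustrative placeholders, and `CREATE VDS` syntax should be verified against the Dremio SQL reference for your version (nothing is executed here).

```python
# Hypothetical names: an "analytics" space exposing a curated view over a
# raw lake source. A data architect would run this SQL inside Dremio.
space, name = "analytics", "active_customers"
source = '"lake"."raw"."customers"'

create_vds = (
    f"CREATE VDS {space}.{name} AS "
    f"SELECT id, email FROM {source} WHERE status = 'active'"
)
```

The point of the pattern is that consumers query `analytics.active_customers` without knowing (or being granted access to) the underlying lake path, which is how the semantic layer separates business meaning from physical storage.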
  • 49
Hyper-Q

Datometry

Adaptive Data Virtualization™ technology enables enterprises to run their existing applications on modern cloud data warehouses, without rewriting or reconfiguring them. Datometry Hyper-Q™ lets enterprises adopt new cloud databases rapidly, control ongoing operating expenses, and build out analytic capabilities for faster digital transformation. Datometry Hyper-Q virtualization software allows any existing application to run on any cloud database, making applications and databases interoperable. Enterprises can now adopt the cloud database of their choice without having to rip, rewrite, and replace applications. Hyper-Q enables runtime application compatibility through transformation and emulation of legacy data warehouse functions. It deploys transparently on Azure, AWS, and GCP, and applications can use existing JDBC, ODBC, and native connectors without changes. It connects to major cloud data warehouses, including Azure Synapse Analytics, Amazon Redshift, and Google BigQuery.
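The "no rewrite" claim boils down to this: the application's SQL and connector usage stay the same, and only the connection endpoint is repointed at the virtualization gateway. The sketch below illustrates that with two ODBC-style connection strings; the driver name, hostnames, and DSN fields are hypothetical, and nothing connects.

```python
# Hypothetical connection strings: the application previously targeted a
# legacy warehouse and now targets a virtualization gateway instead.
legacy = "DRIVER={Teradata};DBCNAME=td-prod.corp.example;UID=app;PWD=***"
virtualized = "DRIVER={Teradata};DBCNAME=hyperq-gw.corp.example;UID=app;PWD=***"

# The SQL the application issues is unchanged; the gateway translates it
# for the target cloud warehouse at runtime.
query = "SELECT region, SUM(revenue) FROM sales GROUP BY region"

# Diff the two configurations field by field to show only the host moved.
changed_fields = [
    (f, g)
    for f, g in zip(legacy.split(";"), virtualized.split(";"))
    if f != g
]
```

Only the `DBCNAME` field differs between the two strings, which is the whole migration from the application's point of view.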
  • 50
    Oracle Big Data SQL Cloud Service
Oracle Big Data SQL Cloud Service enables organizations to immediately analyze data across Apache Hadoop, NoSQL, and Oracle Database, leveraging their existing SQL skills, security policies, and applications with extreme performance. From simplifying data science efforts to unlocking data lakes, Big Data SQL makes the benefits of Big Data available to the largest group of end users possible. Big Data SQL gives users a single location to catalog and secure data in Hadoop, NoSQL systems, and Oracle Database. It offers seamless metadata integration and queries that join data from Oracle Database with data from Hadoop and NoSQL databases. Utilities and conversion routines support automatic mappings from metadata stored in HCatalog (or the Hive Metastore) to Oracle tables. Enhanced access parameters give administrators the flexibility to control column mapping and data access behavior. Multiple cluster support enables one Oracle Database to query multiple Hadoop clusters and/or NoSQL systems.
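The Hive-to-Oracle mapping described above is typically expressed as an external table over a Hive source, which regular Oracle SQL can then join against local tables. The sketch below composes such DDL as a string; the table and column names are illustrative, and the `ORACLE_HIVE` access driver and `com.oracle.bigdata.tablename` parameter should be checked against the Big Data SQL documentation for your release (nothing is executed here).

```python
# Hypothetical Hive source table exposed to Oracle SQL via an external table.
hive_table = "weblogs.clicks"

ddl = f"""CREATE TABLE clicks_ext (
  user_id NUMBER,
  url     VARCHAR2(4000)
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_HIVE
  ACCESS PARAMETERS (com.oracle.bigdata.tablename={hive_table})
)"""

# Once defined, Hadoop-resident data joins with Oracle tables in plain SQL.
join_sql = (
    "SELECT c.name, COUNT(*) FROM customers c "
    "JOIN clicks_ext e ON e.user_id = c.id GROUP BY c.name"
)
```

The key property is that security policies and SQL tooling apply to `clicks_ext` exactly as they would to any other Oracle table, even though the rows live in Hadoop.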