Apache Hive Integrations

Oracle Machine Learning

Oracle

Machine learning uncovers hidden patterns and insights in enterprise data, generating new value for the business. Oracle Machine Learning accelerates the creation and deployment of machine learning models for data scientists through reduced data movement, AutoML technology, and simplified deployment. Increase data scientist and developer productivity and reduce their learning curve with familiar open source-based Apache Zeppelin notebook technology. Notebooks support SQL, PL/SQL, Python, and markdown interpreters for Oracle Autonomous Database so users can work with their language of choice when developing models. A no-code user interface supports AutoML on Autonomous Database, improving both data scientist productivity and non-expert user access to powerful in-database algorithms for classification and regression. Data scientists gain integrated model deployment from the Oracle Machine Learning AutoML User Interface.

Lyftrondata

Whether you want to build a governed delta lake, data warehouse, or simply want to migrate from your traditional database to a modern cloud data warehouse, do it all with Lyftrondata. Simply create and manage all of your data workloads on one platform by automatically building your pipeline and warehouse. Analyze it instantly with ANSI SQL, BI/ML tools, and share it without worrying about writing any custom code. Boost the productivity of your data professionals and shorten your time to value. Define, categorize, and find all data sets in one place. Share these data sets with other experts with zero coding and drive data-driven insights. This data sharing ability is perfect for companies that want to store their data once, share it with other experts, and use it multiple times, now and in the future. Define datasets, apply SQL transformations, or simply migrate your SQL data processing logic to any cloud data warehouse.

LT Browser

TestMu AI

Next-gen browser to build, test & debug mobile websites. Test websites on different pre-installed mobile device viewports. See the mobile view of a website on Android and iOS resolutions with LT Browser, a dev-friendly browser for mobile view debugging. Can’t find your favorite device? With LT Browser, you can create your own custom device viewport and save it for future use. Create new mobile, tablet, or desktop devices and test websites across various devices and screen resolutions. You don’t have to switch between two devices to perform a mobile website test. Test on two devices simultaneously with LT Browser, including different tablet and desktop sizes, and inspect websites at different resolutions side by side. LT Browser comes with DevTools to debug multiple device sizes while performing responsiveness tests simultaneously. Test websites on various device resolutions with separate DevTools for each.

Starting Price: $15 per month

IRI Voracity

IRI, The CoSort Company

Voracity is the only high-performance, all-in-one data management platform accelerating AND consolidating the key activities of data discovery, integration, migration, governance, and analytics. Voracity helps you control your data in every stage of the lifecycle, and extract maximum value from it. Only in Voracity can you: 1) CLASSIFY, profile and diagram enterprise data sources 2) Speed or LEAVE legacy sort and ETL tools 3) MIGRATE data to modernize and WRANGLE data to analyze 4) FIND PII everywhere and consistently MASK it for referential integrity 5) Score re-ID risk and ANONYMIZE quasi-identifiers 6) Create and manage DB subsets or intelligently synthesize TEST data 7) Package, protect and provision BIG data 8) Validate, scrub, enrich and unify data to improve its QUALITY 9) Manage metadata and MASTER data. Use Voracity to comply with data privacy laws, de-muck and govern the data lake, improve the reliability of your analytics, and create safe, smart test data

IRI Data Protector Suite

IRI, The CoSort Company

The IRI Data Protector suite contains multiple data masking products which can be licensed standalone or in a discounted bundle to profile, classify, search, mask, and audit PII and other sensitive information in structured, semi-structured, and unstructured data sources. Apply their many masking functions consistently for referential integrity: IRI FieldShield® Structured Data Masking FieldShield classifies, finds, de-identifies, risk-scores, and audits PII in databases, flat files, JSON, etc. IRI DarkShield® Semi & Unstructured Data Masking DarkShield classifies, finds, and deletes PII in text, pdf, Parquet, C/BLOBs, MS documents, logs, NoSQL DBs, images, and faces. IRI CellShield® Excel® Data Masking CellShield finds, reports on, masks, and audits changes to PII in Excel columns and values LAN-wide or in the cloud. IRI Data Masking as a Service IRI DMaaS engineers in the US and abroad do the work of classifying, finding, masking, and risk-scoring PII for you.
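
The idea of masking "consistently for referential integrity" can be illustrated with a minimal sketch. This is not IRI's implementation or API; it assumes a keyed hash (HMAC-SHA256) as the masking function, and the key, the `mask` helper, and the sample records are all hypothetical. The point is only that a deterministic function maps the same input to the same token everywhere, so masked tables can still be joined.

```python
import hashlib
import hmac

# Hypothetical secret; in practice this would come from a key store.
SECRET_KEY = b"demo-key-change-me"


def mask(value: str, key: bytes = SECRET_KEY) -> str:
    """Deterministically replace a sensitive value with an opaque token."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "ID-" + digest[:12]


# Two hypothetical tables that share a sensitive join key.
customers = [{"ssn": "123-45-6789", "name": "Ann"}]
orders = [
    {"ssn": "123-45-6789", "total": 40},
    {"ssn": "987-65-4321", "total": 15},
]

# Apply the same masking function to the same column in both tables.
masked_customers = [{**row, "ssn": mask(row["ssn"])} for row in customers]
masked_orders = [{**row, "ssn": mask(row["ssn"])} for row in orders]

# Rows that referred to the same person before masking still match afterwards.
joined = [
    (c["name"], o["total"])
    for c in masked_customers
    for o in masked_orders
    if c["ssn"] == o["ssn"]
]
```

Because the token depends only on the input and the key, the join on the masked column returns the same pairs as a join on the original column would, while the original values no longer appear in either table.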

Xtendlabs

Installing and configuring today’s complex software technology platforms takes an extraordinary investment in time and resources. Not with Xtendlabs. Xtendlabs Emerging Technology Platform-as-a-Service provides immediate access to emerging Big Data, Data Sciences, and Database technology platforms online, from any device and location, 24/7. Xtendlabs are available on-demand, any time, from any location, including home, office, or the road. Xtendlabs scale to meet your needs on-demand, so you can focus on your business problem and learning rather than struggling to find and set up infrastructure. Just sign in to get immediate access to your virtual lab environment. Xtendlabs requires no virtual machine installation, system setup, or configuration, saving valuable time and resources. Pay as you go monthly. With Xtendlabs there are no upfront investments in software or hardware.

SAS Federation Server

SAS

Create federated source data names to enable users to access multiple data sources via the same connection. Use the web-based administrative console for simplified maintenance of user access, privileges and authorizations. Apply data quality functions such as match-code generation, parsing and other tasks inside the view. Improved performance with in-memory data caches & scheduling. Secured information with data masking & encryption. Lets you keep application queries current and available to users, and reduce loads on operational systems. Enables you to define access permissions for a user or group at the catalog, schema, table, column and row levels. Advanced data masking and encryption capabilities let you determine not only who’s authorized to view your data, but also what they see on an extremely granular level. It all helps ensure sensitive data doesn’t fall into the wrong hands.

WEBDEV

Windev

With responsive web design, WEBDEV allows you to easily develop Internet and Intranet sites and applications (WEB & SaaS) to manage data and processes. WEBDEV also generates PHP. WEBDEV supports all the databases that use ODBC drivers or OLEDB providers. The WINDEV, WEBDEV and WINDEV Mobile environments are compatible and share project elements. It has never been easier to build multi-target applications. The developer can focus on key business requirements, and not on the code, so applications can finally meet your needs. Up to 20 times less code, develop applications in no time! Shorter time to market allows you to gain market share. Software is easier to develop and more reliable. Complete application RAD generator for PC, web, and mobile, template creation (patterns, inheritance & MVP). The ease of use and speed allow you to develop and realize even your most ambitious projects.

Starting Price: $1,703 one-time payment

WINDEV

Windev

Thanks to its full integration, legendary ease of use, and advanced technology, WINDEV allows you to easily develop large-scale projects in Windows, Linux, .NET, Java, and much more! (Full compatibility with web, mobile, Android, iOS, etc.) WINDEV creates applications for Windows, Linux, and Mac. WEBDEV recompiles them for the Internet. WINDEV Mobile recompiles them for tablets or smartphones. Use the same project, interfaces, objects, elements, and source code regardless of the target. Make the most of your developments, and deploy faster on all devices. The ability to simply recompile an application for different targets is a decisive advantage. It guarantees continuity and the ability to respond to changes. Several automatic features are available. Portable code and objects (for Web browsers and Mobile code). WINDEV supports all the databases that use ODBC drivers or OLEDB providers.

Starting Price: $1,768 one-time payment

SQL

SQL is a domain-specific programming language used for accessing, managing, and manipulating relational databases and relational database management systems.
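
As a minimal sketch of what accessing, managing, and manipulating relational data looks like in practice, the following runs a few standard SQL statements through Python's built-in `sqlite3` module. The `products` table and its rows are hypothetical, and SQLite is only a convenient stand-in here; other relational systems accept broadly similar statements with dialect differences.

```python
import sqlite3

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Manage: define a table (hypothetical schema).
cur.execute(
    "CREATE TABLE products ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, price REAL NOT NULL)"
)

# Manipulate: insert, update, and delete rows.
cur.executemany(
    "INSERT INTO products (name, price) VALUES (?, ?)",
    [("widget", 9.5), ("gadget", 20.0), ("gizmo", 5.0)],
)
cur.execute("UPDATE products SET price = price * 2 WHERE name = ?", ("gizmo",))
cur.execute("DELETE FROM products WHERE name = ?", ("widget",))

# Access: query the remaining rows, cheapest first.
rows = cur.execute("SELECT name, price FROM products ORDER BY price").fetchall()
conn.close()
```

The `?` placeholders pass values separately from the statement text, which avoids building SQL strings by hand.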

Starting Price: Free

OpenText Structured Data Manager

OpenText

OpenText Structured Data Manager helps organizations simplify compliance and gain control over structured data throughout its lifecycle. It discovers, classifies, protects, and relocates inactive data from enterprise applications into secure, lower-cost repositories. The platform supports secure archiving, test data management, application retirement, and full lifecycle governance. By reducing the volume of legacy structured data, it lowers infrastructure complexity and total cost of ownership. Its built-in data masking and test data protection ensure development and QA environments remain safe while accelerating modernization. With automated governance workflows, OpenText Structured Data Manager helps organizations reduce risk without slowing operations.

Progress DataDirect

Progress Software

Empowering applications with enterprise data is our passion here at Progress DataDirect. We offer cloud and on-premises data connectivity solutions across relational, NoSQL, Big Data, and SaaS data sources. Performance, reliability, and security are at the heart of everything we design for thousands of enterprises and the leading vendors in analytics, BI, and data management. Minimize your development costs with our portfolio of high-value connectors for a variety of data sources. Enjoy 24/7 world-class support and security for greater peace of mind. Connect with affordable, easy-to-use, and time-saving drivers for faster SQL access to your data. As a leader in data connectivity, keeping up with the evolving trends in this space is our mission. But if we haven’t built the connector you need yet, reach out and we’ll help you develop the right solution. Embed connectivity in an application or service.

Jethro

Data-driven decision-making has unleashed a surge of business data and a rise in user demand to analyze it. This trend drives IT departments to migrate off expensive Enterprise Data Warehouses (EDW) toward cost-effective Big Data platforms like Hadoop or AWS. These new platforms come with a Total Cost of Ownership (TCO) that is about 10 times lower. They are not ideal for interactive BI applications, however, as they fail to match the high performance and user concurrency of legacy EDWs. For this exact reason, we developed Jethro. Customers use Jethro for interactive BI on Big Data. Jethro is a transparent middle tier that requires no changes to existing apps or data. It is self-driving with no maintenance required. Jethro is compatible with BI tools like Tableau, Qlik, and Microstrategy and is data source agnostic. Jethro delivers on the demands of business users allowing for thousands of concurrent users to run complicated queries over billions of records.

Baidu Sugar

Baidu AI Cloud

Sugar charges fees per organization. A user can belong to multiple organizations, and an organization can have multiple users. Multiple spaces can be created under an organization. Generally, it is recommended to divide spaces according to projects or teams. Data is not shared between spaces. Each space has its own independent permission management. When you use Sugar to analyze and visualize data, you need to specify the data source of the original data. A data source is the place where data is stored. Generally, it refers to the connection details (host, port, user name, password, etc.) of the database. A dashboard is a type of visual page that emphasizes striking visual effects and is generally displayed on a large screen for real-time data visualization.

Starting Price: $0.33 per year

Foundational

Identify code and optimization issues in real-time, prevent data incidents pre-deploy, and govern data-impacting code changes end to end—from the operational database to the user-facing dashboard. Automated, column-level data lineage, from the operational database all the way to the reporting layer, ensures every dependency is analyzed. Foundational automates data contract enforcement by analyzing every repository from upstream to downstream, directly from source code. Use Foundational to proactively identify code and data issues, find and prevent issues, and create controls and guardrails. Foundational can be set up in minutes with no code changes required.

IBM watsonx.data

IBM

Put your data to work, wherever it resides, with the open, hybrid data lakehouse for AI and analytics. Connect your data from anywhere, in any format, and access through a single point of entry with a shared metadata layer. Optimize workloads for price and performance by pairing the right workloads with the right query engine. Embed natural-language semantic search without the need for SQL, so you can unlock generative AI insights faster. Manage and prepare trusted data to improve the relevance and precision of your AI applications. Use all your data, everywhere. With the speed of a data warehouse, the flexibility of a data lake, and special features to support AI, watsonx.data can help you scale AI and analytics across your business. Choose the right engines for your workloads. Flexibly manage cost, performance, and capability with access to multiple open engines including Presto, Presto C++, Spark, Milvus, and more.

TapData

CDC-based live data platform for heterogeneous database replication, real-time data integration, or building a real-time data warehouse. By using CDC to sync production line data stored in DB2 and Oracle to a modern database, TapData enabled AI-augmented real-time dispatch (RTD) software to optimize the semiconductor production line process. The real-time data made instant decision-making in the RTD software possible, leading to faster turnaround times and improved yield. As one of the largest telcos, one customer has many regional systems that cater to local customers. By syncing and aggregating data from various sources and locations into a centralized data store, the customer was able to build an order center where orders from many applications can now be aggregated. TapData seamlessly integrates inventory data from 500+ stores, providing real-time insights into stock levels and customer preferences, enhancing supply chain efficiency.

eQube®-DaaS

eQ Technologic

Our platform establishes a data fabric with a connected network of integrated data, applications, and devices that puts the power of analytics in the hands of end users leading to actionable insight. Data from any source can be aggregated using eQube's data virtualization layer and exposed as a web service, REST service, OData service, or API. Efficiently and rapidly integrate many legacy systems and new COTS (Commercial off-the-shelf) systems. Responsibly retire legacy systems in an orderly manner without disrupting the business. Provide on-demand 'visibility' across the business processes with analytics and business intelligence (A/BI) capabilities. eQube®-MI-based application integration infrastructure can be readily extended for secure, scalable, and robust information collaboration across networks, partners, suppliers, and customers that are geographically dispersed.

Adobe Real-Time CDP

Adobe

Adobe Real-Time Customer Data Platform enables marketers to collect, normalize, and govern B2B and B2C data and unify it into real-time profiles that can be activated across any channel. Engage consumers and business accounts with consistent experiences using unified, real-time person and account-based profiles made up of consumer data, professional data, or both. All of it is protected by the industry’s most functional data management and privacy tools.

Airtool

Airtool is a next-generation low-code platform purpose-built to support the demands of large-scale, mission-critical enterprise systems. Developed by Deister Software, Airtool was born from the need to create a powerful ERP solution capable of handling massive datasets without sacrificing performance. Unlike traditional low-code tools focused on simple apps, Airtool offers unmatched scalability, making it ideal for building robust, data-intensive applications. Engineered with a unified development framework, Airtool enforces best practices and standardizes workflows, reducing complexity and making onboarding seamless. Its ERP-focused design ensures precision and control in managing complex business operations, while integrated data management allows real-time interaction with millions of records. Airtool empowers teams to reduce reliance on specialized developers with reusable components and intuitive tools, accelerating development while maintaining maintainability and consistency.

Starting Price: $50/month

Cloudera Data Warehouse

Cloudera

Cloudera Data Warehouse is a cloud-native, self-service analytics solution that lets IT rapidly deliver query capabilities to BI analysts, enabling users to go from zero to query in minutes. It supports all data types, structured, semi-structured, unstructured, real-time, and batch, and scales cost-effectively from gigabytes to petabytes. It is fully integrated with streaming, data engineering, and AI services, and enforces a unified security, governance, and metadata framework across private, public, or hybrid cloud deployments. Each virtual warehouse (data warehouse or mart) is isolated and automatically configured and optimized, ensuring that workloads do not interfere with each other. Cloudera leverages open source engines such as Hive, Impala, Kudu, and Druid, along with tools like Hue and more, to handle diverse analytics, from dashboards and operational analytics to research and discovery over vast event or time-series data.

CelerData Cloud

CelerData

CelerData is a high-performance SQL engine built to power analytics directly on data lakehouses, eliminating the need for traditional data‐warehouse ingestion pipelines. It delivers sub-second query performance at scale, supports on-the‐fly JOINs without costly denormalization, and simplifies architecture by allowing users to run demanding workloads on open format tables. Built on the open source engine StarRocks, the platform outperforms legacy query engines like Trino, ClickHouse, and Apache Druid in latency, concurrency, and cost-efficiency. With a cloud-managed service that runs in your own VPC, you retain infrastructure control and data ownership while CelerData handles maintenance and optimization. The platform is positioned to power real-time OLAP, business intelligence, and customer-facing analytics use cases and is trusted by enterprise customers (including names such as Pinterest, Coinbase, and Fanatics) who have achieved significant latency reductions and cost savings.

Data Virtuality

Connect and centralize data. Transform your existing data landscape into a flexible data powerhouse. Data Virtuality is a data integration platform for instant data access, easy data centralization and data governance. Our Logical Data Warehouse solution combines data virtualization and materialization for the highest possible performance. Build your single source of data truth with a virtual layer on top of your existing data environment for high data quality, data governance, and fast time-to-market. Hosted in the cloud or on-premises. Data Virtuality has 3 modules: Pipes, Pipes Professional, and Logical Data Warehouse. Cut down your development time by up to 80%. Access any data in minutes and automate data workflows using SQL. Use Rapid BI Prototyping for significantly faster time-to-market. Ensure data quality for accurate, complete, and consistent data. Use metadata repositories to improve master data management.

Mode

Mode Analytics

Understand how users are interacting with your product and identify opportunity areas to inform your product decisions. Mode empowers one Stitch analyst to do the work of a full data team through speed, flexibility, and collaboration. Build dashboards for annual revenue, then use chart visualizations to identify anomalies quickly. Create polished, investor-ready reports or share analysis with teams for collaboration. Connect your entire tech stack to Mode and identify upstream issues to improve performance. Speed up workflows across teams with APIs and webhooks. Leverage marketing and product data to fix weak spots in your funnel, improve landing-page performance, and understand churn before it happens.

Astro by Astronomer

Astronomer

For data teams looking to increase the availability of trusted data, Astronomer provides Astro, a modern data orchestration platform, powered by Apache Airflow, that enables the entire data team to build, run, and observe data pipelines-as-code. Astronomer is the commercial developer of Airflow, the de facto standard for expressing data flows as code, used by hundreds of thousands of teams across the world.

Nucleon Database Master

Nucleon Software

Nucleon Database Master is a modern, powerful, intuitive, and easy-to-use database query, administration, and management tool with a consistent user interface. Database Master simplifies managing, monitoring, querying, editing, visualizing, and designing relational and NoSQL DBMSs. Database Master allows you to execute extended SQL, JQL, and C# (LINQ) query scripts, and gives access to all database objects such as tables, views, procedures, packages, columns, indexes, relationships (constraints), collections, and triggers.

Starting Price: $99 one-time payment

ActionIQ

The ActionIQ Customer Data Platform enables you to align your people, technology, and processes to deliver exceptional customer experiences across every touchpoint. How to separate the CDP posers from the true players. Download ActionIQ’s guide to save yourself months of frustrating research and get to the truth on the confusing CDP landscape. In today’s experience economy, consumers expect brands to know them and consistently deliver authentic, helpful experiences. The ActionIQ CDP enables large organizations to solve chronic customer data fragmentation and gain the customer intelligence needed to orchestrate experiences across all their brand touchpoints. Build an omnichannel “smart hub” stack that unifies data and empowers teams with the real-time insights they need. Gain deep customer understanding that enables you to deliver trust-building, profitable customer experiences at scale.

Apache Spark

Apache Software Foundation

Apache Spark™ is a unified analytics engine for large-scale data processing. Apache Spark achieves high performance for both batch and streaming data, using a state-of-the-art DAG scheduler, a query optimizer, and a physical execution engine. Spark offers over 80 high-level operators that make it easy to build parallel apps. And you can use it interactively from the Scala, Python, R, and SQL shells. Spark powers a stack of libraries including SQL and DataFrames, MLlib for machine learning, GraphX, and Spark Streaming. You can combine these libraries seamlessly in the same application. Spark runs on Hadoop, Apache Mesos, Kubernetes, standalone, or in the cloud. It can access diverse data sources. You can run Spark using its standalone cluster mode, on EC2, on Hadoop YARN, on Mesos, or on Kubernetes. Access data in HDFS, Alluxio, Apache Cassandra, Apache HBase, Apache Hive, and hundreds of other data sources.

Amazon EMR

Amazon

Amazon EMR is the industry-leading cloud big data platform for processing vast amounts of data using open-source tools such as Apache Spark, Apache Hive, Apache HBase, Apache Flink, Apache Hudi, and Presto. With EMR you can run Petabyte-scale analysis at less than half of the cost of traditional on-premises solutions and over 3x faster than standard Apache Spark. For short-running jobs, you can spin up and spin down clusters and pay per second for the instances used. For long-running workloads, you can create highly available clusters that automatically scale to meet demand. If you have existing on-premises deployments of open-source tools such as Apache Spark and Apache Hive, you can also run EMR clusters on AWS Outposts. Analyze data using open-source ML frameworks such as Apache Spark MLlib, TensorFlow, and Apache MXNet. Connect to Amazon SageMaker Studio for large-scale model training, analysis, and reporting.

Nightfall

Discover, classify, and protect your sensitive data. Nightfall™ uses machine learning to identify business-critical data, like customer PII, across your SaaS, APIs, and data infrastructure, so you can manage & protect it. Integrate in minutes with cloud services via APIs to monitor data without agents. Machine learning classifies your sensitive data & PII with high accuracy, so nothing gets missed. Set up automated workflows for quarantines, deletions, alerts, and more, saving you time and keeping your business safe. Nightfall integrates directly with all your SaaS, APIs, and data infrastructure. Start building with Nightfall’s APIs for sensitive data classification & protection for free. Via REST API, programmatically get structured results from Nightfall’s deep learning-based detectors for things like credit card numbers, API keys, and more. Integrate with just a few lines of code. Seamlessly add data classification to your applications & workflows using Nightfall's REST API.

TIMi

TIMi

With TIMi, companies can capitalize on their corporate data to develop new ideas and make critical business decisions faster and easier than ever before. The heart of TIMi’s Integrated Platform. TIMi’s ultimate real-time AUTO-ML engine. 3D VR segmentation and visualization. Unlimited self-service business intelligence. TIMi is several orders of magnitude faster than any other solution at the 2 most important analytical tasks: the handling of datasets (data cleaning, feature engineering, creation of KPIs) and predictive modeling. TIMi is an “ethical solution”: no “lock-in” situation, just excellence. We guarantee that you can work with peace of mind and without unexpected extra costs. Thanks to an original & unique software infrastructure, TIMi is optimized to offer you the greatest flexibility for the exploration phase and the highest reliability during the production phase. TIMi is the ultimate “playground” that allows your analysts to test the craziest ideas!

Truedat

Bluetab Solutions

Truedat is an open source data governance business solution tool developed by Bluetab Solutions in order to help our clients become data-driven companies. We help to define business processes, roles & responsibilities. We also help put processes into practice. Integration and customization of truedat's open source components to support the data governance processes. We guarantee the support and maintenance of the process & software of our solution modules installed by us. Based on our experience, we have developed a solution that covers the need for Data Governance, making it possible to manage and control highly complex and changing data architectures. The rapidly increasing migration of enterprise IT platforms to cloud, multi-cloud, and hybrid architectures increases the sources, complexity, and types of data and therefore raises the need for truedat. Our solution comes from more than 8 years of experience in Data Governance consulting and development projects.

Privacera

At the intersection of data governance, privacy, and security, Privacera’s unified data access governance platform maximizes the value of data by providing secure data access control and governance across hybrid- and multi-cloud environments. The hybrid platform centralizes access and natively enforces policies across multiple cloud services—AWS, Azure, Google Cloud, Databricks, Snowflake, Starburst and more—to democratize trusted data enterprise-wide without compromising compliance with regulations such as GDPR, CCPA, LGPD, or HIPAA. Trusted by Fortune 500 customers across finance, insurance, retail, healthcare, media, public and the federal sector, Privacera is the industry’s leading data access governance platform that delivers unmatched scalability, elasticity, and performance. Headquartered in Fremont, California, Privacera was founded in 2016 to manage cloud data privacy and security by the creators of Apache Ranger™ and Apache Atlas™.

Microsoft Power Query

Microsoft

Power Query is the easiest way to connect, extract, transform and load data from a wide range of sources. Power Query is a data transformation and data preparation engine. Power Query comes with a graphical interface for getting data from sources and a Power Query Editor for applying transformations. Because the engine is available in many products and services, the destination where the data will be stored depends on where Power Query was used. Using Power Query, you can perform the extract, transform, and load (ETL) processing of data. It is Microsoft’s data connectivity and data preparation technology that lets you seamlessly access data stored in hundreds of sources and reshape it to fit your needs, all with an easy-to-use, engaging, no-code experience. Power Query supports hundreds of data sources with built-in connectors, generic interfaces (such as REST APIs, ODBC, OLE DB, and OData), and the Power Query SDK to build your own connectors.

Apache Knox

Apache Software Foundation

The Knox API Gateway is designed as a reverse proxy with consideration for pluggability in the areas of policy enforcement, through providers and the backend services for which it proxies requests. Policy enforcement ranges from authentication/federation, authorization, audit, dispatch, hostmapping and content rewrite rules. Policy is enforced through a chain of providers that are defined within the topology deployment descriptor for each Apache Hadoop cluster gated by Knox. The cluster definition is also defined within the topology deployment descriptor and provides the Knox Gateway with the layout of the cluster for purposes of routing and translation between user facing URLs and cluster internals. Each Apache Hadoop cluster that is protected by Knox has its set of REST APIs represented by a single cluster specific application context path. This allows the Knox Gateway to both protect multiple clusters and present the REST API consumer with a single endpoint.
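
The translation between user-facing URLs and cluster internals can be pictured with a toy sketch. This is not Knox code and does not use any Knox API: real deployments define topologies in descriptor files and apply rewrite rules through the provider chain. Here the `TOPOLOGIES` dictionary, the host names, and the `route` function are hypothetical, and the sketch assumes a user-facing path of the form `/gateway/<cluster>/<service>/...`.

```python
from urllib.parse import urlsplit

# Hypothetical stand-in for topology descriptors:
# cluster name -> {service name -> internal base URL}.
TOPOLOGIES = {
    "sandbox": {
        "webhdfs": "http://namenode.internal:50070/webhdfs",
        "hive": "http://hiveserver.internal:10001/cliservice",
    }
}


def route(gateway_url: str, gateway_path: str = "gateway") -> str:
    """Map a user-facing gateway URL to an internal cluster URL."""
    parts = urlsplit(gateway_url)
    segments = parts.path.strip("/").split("/")
    if len(segments) < 3 or segments[0] != gateway_path:
        raise ValueError(f"not a gateway URL: {gateway_url}")

    cluster, service, rest = segments[1], segments[2], segments[3:]
    try:
        base = TOPOLOGIES[cluster][service]
    except KeyError:
        raise LookupError(f"unknown cluster or service: {cluster}/{service}") from None

    # Re-attach the remaining path and the original query string.
    target = base + ("/" + "/".join(rest) if rest else "")
    if parts.query:
        target += "?" + parts.query
    return target


internal = route(
    "https://knox.example.com:8443/gateway/sandbox/webhdfs/v1/tmp?op=LISTSTATUS"
)
```

The consumer only ever sees the single gateway endpoint; which internal host serves the request is decided by the cluster-specific context path.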

Mage Static Data Masking

Mage Data

Mage™ Static Data Masking (SDM) and Test data Management (TDM) capabilities fully integrate with Imperva’s Data Security Fabric (DSF) delivering complete protection for all sensitive or regulated data while simultaneously integrating seamlessly with an organization’s existing IT framework and existing application development, testing and data flows without the requirement for any additional architectural changes.

Mage Dynamic Data Masking

Mage Data

The Mage™ Dynamic Data Masking module of the Mage data security platform has been designed with end-customer needs in mind. It was developed alongside customers to address their specific needs and requirements. As a result, the product has evolved to cover a broad range of enterprise use cases. Most other solutions in the market are either a part of an acquisition or are developed to meet only a specific use case. Mage™ Dynamic Data Masking is designed to deliver adequate protection for sensitive data in production that is exposed to application and database users, while integrating with an organization's existing IT framework without requiring any additional architectural changes.

Okera

Okera, the Universal Data Authorization company, helps modern, data-driven enterprises accelerate innovation, minimize data security risks, and demonstrate regulatory compliance. The Okera Dynamic Access Platform automatically enforces universal fine-grained access control policies. This allows employees, customers, and partners to use data responsibly, while protecting them from inappropriately accessing data that is confidential, personally identifiable, or regulated. Okera’s robust audit capabilities and data usage intelligence deliver the real-time and historical information that data security, compliance, and data delivery teams need to respond quickly to incidents, optimize processes, and analyze the performance of enterprise data initiatives. Okera began development in 2016 and now dynamically authorizes access to hundreds of petabytes of sensitive data for the world’s most demanding F100 companies and regulatory agencies. The company is headquartered in San Francisco.

Acceldata

Acceldata is an Agentic Data Management company helping enterprises manage complex data systems with AI-powered automation. Its unified platform brings together data quality, governance, lineage, and infrastructure monitoring to deliver trusted, actionable insights across the business. Acceldata’s Agentic Data Management platform uses intelligent AI agents to detect, understand, and resolve data issues in real time. Designed for modern data environments, it replaces fragmented tools with a self-learning system that ensures data is accurate, governed, and ready for AI and analytics.

Apache Sentry

Apache Software Foundation

Apache Sentry™ is a system for enforcing fine-grained, role-based authorization to data and metadata stored on a Hadoop cluster. Apache Sentry graduated from the Incubator in March 2016 and is now a Top-Level Apache project. Sentry provides the ability to control and enforce precise levels of privileges on data for authenticated users and applications on a Hadoop cluster. Sentry currently works out of the box with Apache Hive, Hive Metastore/HCatalog, Apache Solr, Impala, and HDFS (limited to Hive table data). Sentry is designed to be a pluggable authorization engine for Hadoop components. It allows you to define authorization rules to validate a user or application’s access requests for Hadoop resources. Sentry is highly modular and can support authorization for a wide variety of data models in Hadoop.
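As a rough illustration of what role-based authorization means in practice, the sketch below checks an access request by walking from user to group to role to privilege. It is not Sentry's API: the policy tables, names, and the `is_authorized` function are hypothetical, and the user-to-group step is an assumption made for the example.

```python
# Hypothetical policy data, for illustration only.
USER_GROUPS = {"alice": {"analysts"}, "bob": {"etl"}}
GROUP_ROLES = {"analysts": {"reader"}, "etl": {"reader", "writer"}}
ROLE_PRIVILEGES = {
    "reader": {("sales_db", "orders", "SELECT")},
    "writer": {("sales_db", "orders", "INSERT")},
}


def is_authorized(user, database, table, action):
    """Return True if any role reachable from the user grants the privilege."""
    for group in USER_GROUPS.get(user, ()):
        for role in GROUP_ROLES.get(group, ()):
            if (database, table, action) in ROLE_PRIVILEGES.get(role, ()):
                return True
    # Default deny: unknown users, groups, or privileges are rejected.
    return False
```

Privileges are attached to roles rather than to individual users, so changing what a team may do means editing one role instead of many user entries.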

lakeFS

Treeverse

lakeFS enables you to manage your data lake the way you manage your code. Run parallel pipelines for experimentation and CI/CD for your data, simplifying the lives of the engineers, data scientists, and analysts who are transforming the world with data. lakeFS is an open source platform that delivers resilience and manageability to object-storage-based data lakes. With lakeFS you can build repeatable, atomic, and versioned data lake operations, from complex ETL jobs to data science and analytics. lakeFS supports AWS S3, Azure Blob Storage, and Google Cloud Storage (GCS) as its underlying storage service. It is API compatible with S3 and works with modern data frameworks such as Spark, Hive, AWS Athena, and Presto. lakeFS provides a Git-like branching and committing model that scales to exabytes of data by utilizing S3, GCS, or Azure Blob for storage.
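The Git-like branching and committing model can be sketched with a toy in-memory repository. This is a conceptual illustration only, not the lakeFS API or its storage design: the `Repo` class and its methods are hypothetical, and object contents are plain strings here rather than objects in S3, GCS, or Azure Blob.

```python
class Repo:
    """Toy model: branches are pointers to immutable commit snapshots."""

    def __init__(self):
        self.commits = {0: {}}        # commit id -> {path: content}
        self.branches = {"main": 0}   # branch name -> commit id
        self.staged = {"main": {}}    # branch name -> uncommitted writes

    def branch(self, name, source):
        # Creating a branch copies a pointer, not the data.
        self.branches[name] = self.branches[source]
        self.staged[name] = {}

    def put(self, branch, path, content):
        self.staged[branch][path] = content

    def commit(self, branch):
        # New snapshot = parent snapshot overlaid with staged writes.
        parent = self.commits[self.branches[branch]]
        snapshot = {**parent, **self.staged[branch]}
        new_id = len(self.commits)
        self.commits[new_id] = snapshot
        self.branches[branch] = new_id
        self.staged[branch] = {}
        return new_id

    def read(self, branch, path):
        # Reads see only committed data on the given branch.
        return self.commits[self.branches[branch]].get(path)
```

An experiment can write to its own branch and commit there while readers of `main` keep seeing the previous state, which is the isolation property that makes parallel pipelines safe.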

Amundsen

Discover and trust data for your analysis and models. Be more productive by breaking silos. Get immediate context into the data and see how others are using it. Search for data within your organization with a simple text search. A PageRank-inspired search algorithm recommends results based on names, descriptions, tags, and querying/viewing activity on the table/dashboard. Build trust in data using automated and curated metadata, descriptions of tables and columns, other frequent users, when the table was last updated, statistics, a preview of the data if permitted, etc. Triage issues easily by linking the ETL job and code that generated the data. Update tables and columns with descriptions to reduce unnecessary back and forth about which table to use and what a column contains. See what data your co-workers frequently use, own, or have bookmarked. Learn what the most common queries for a table look like by seeing dashboards built on a given table.

Apache Kylin

Apache Software Foundation

Apache Kylin™ is an open source, distributed Analytical Data Warehouse for Big Data; it was designed to provide OLAP (Online Analytical Processing) capability in the big data era. By renovating the multi-dimensional cube and precalculation technology on Hadoop and Spark, Kylin is able to achieve near-constant query speed regardless of the ever-growing data volume. Reducing query latency from minutes to sub-second, Kylin brings online analytics back to big data. Kylin can analyze more than 10 billion rows in less than a second. No more waiting on reports for critical decisions. Kylin connects data on Hadoop to BI tools like Tableau, PowerBI/Excel, MSTR, QlikSense, Hue, and SuperSet, making BI on Hadoop faster than ever. As an Analytical Data Warehouse, Kylin offers ANSI SQL on Hadoop/Spark and supports most ANSI SQL query functions. Kylin can support thousands of interactive queries at the same time, thanks to the low resource consumption of each query.
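The idea behind cube precalculation can be shown with a minimal sketch: aggregate the measure once for every combination of dimensions, so that a later query becomes a lookup instead of a scan. This illustrates the general technique only, not Kylin's engine or storage format; `build_cube`, `query`, and the sample column names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_cube(rows, dimensions, measure):
    """Pre-aggregate `measure` for every subset of `dimensions`."""
    dimensions = sorted(dimensions)  # fixed order so keys are predictable
    cube = defaultdict(float)
    for row in rows:
        for size in range(len(dimensions) + 1):
            for dims in combinations(dimensions, size):
                key = (dims, tuple(row[d] for d in dims))
                cube[key] += row[measure]
    return dict(cube)


def query(cube, **filters):
    """Answer SUM(measure) for equality filters with a single lookup."""
    dims = tuple(sorted(filters))
    return cube.get((dims, tuple(filters[d] for d in dims)), 0.0)


rows = [
    {"region": "EU", "year": 2023, "sales": 10.0},
    {"region": "EU", "year": 2024, "sales": 5.0},
    {"region": "US", "year": 2024, "sales": 7.0},
]
cube = build_cube(rows, ["region", "year"], "sales")
```

Query cost depends on the number of precomputed cells touched rather than on the number of source rows, which is why latency can stay roughly flat as data grows. The trade-off is build time and storage, since the number of dimension combinations grows exponentially with the number of dimensions.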

Apache Zeppelin

Apache Software Foundation

Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala, and more. The IPython interpreter provides a user experience comparable to Jupyter Notebook. This release includes note-level dynamic forms, a note revision comparator, and the ability to run paragraphs sequentially instead of the simultaneous paragraph execution of previous releases. The interpreter lifecycle manager automatically terminates an interpreter process on idle timeout, so resources are released when they're not in use.

Occubee

3SOFT

The Occubee platform automatically converts large amounts of receipt data, information on thousands of products, and dozens of retail-specific factors into valuable sales and demand forecasts. In stores, Occubee forecasts sales individually for each product and generates replenishment commands. In warehouses, Occubee optimizes the availability of goods and allocated capital, and generates orders for suppliers. In the head office, Occubee provides real-time monitoring of sales processes and generates anomaly alerts and reports. Modern technologies for data collection and processing ensure automation of key business processes in the retail industry. Occubee responds to the needs of modern retail and fits in with the global megatrends related to the use of data in business.

Apache Hudi

Apache Software Foundation

Hudi is a rich platform to build streaming data lakes with incremental data pipelines on a self-managing database layer, while being optimized for lake engines and regular batch processing. Hudi maintains a timeline of all actions performed on the table at different instants of time, which helps provide instantaneous views of the table while also efficiently supporting retrieval of data in the order of arrival. A Hudi instant consists of an action type, an instant time, and a state. Hudi provides efficient upserts by mapping a given hoodie key consistently to a file id via an indexing mechanism. This mapping between record key and file group/file id never changes once the first version of a record has been written to a file. In short, the mapped file group contains all versions of a group of records.
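The stable key-to-file mapping described above can be sketched as follows. This is a toy model of the idea, not Hudi's index or file layout: the `Table` class, the `fg-N` file ids, and the `max_keys_per_group` sizing rule are all hypothetical choices made for the example.

```python
class Table:
    """Toy upsert path: an index pins each record key to one file group."""

    def __init__(self, max_keys_per_group=2):
        self.max_keys_per_group = max_keys_per_group
        self.index = {}        # record key -> file id, fixed once assigned
        self.file_groups = {}  # file id -> {record key -> list of versions}

    def _pick_group_for_insert(self):
        # Reuse a group with spare capacity, else open a new one.
        for file_id, records in self.file_groups.items():
            if len(records) < self.max_keys_per_group:
                return file_id
        file_id = f"fg-{len(self.file_groups)}"
        self.file_groups[file_id] = {}
        return file_id

    def upsert(self, key, value):
        file_id = self.index.get(key)
        if file_id is None:
            # First write of this key: choose a group and remember it.
            file_id = self._pick_group_for_insert()
            self.index[key] = file_id
        # Updates always land in the group recorded by the index.
        self.file_groups[file_id].setdefault(key, []).append(value)
        return file_id
```

Because an update is routed by a lookup to the one file group that already holds the key, the writer never has to scan the whole table to find the record it is replacing.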

Cloudera Data Platform

Cloudera

Unlock the potential of private and public clouds with the only hybrid data platform for modern data architectures with data anywhere. Cloudera is a hybrid data platform designed for unmatched freedom to choose—any cloud, any analytics, any data. Cloudera delivers faster and easier data management and data analytics for data anywhere, with optimal performance, scalability, and security. With Cloudera you get all the advantages of private cloud and public cloud for faster time to value and increased IT control. Cloudera provides the freedom to securely move data, applications, and users bi-directionally between the data center and multiple data clouds, regardless of where your data lives.

Varada

Varada’s dynamic and adaptive big data indexing solution enables data teams to balance performance and cost with zero data-ops. Varada’s big data indexing technology serves as a smart acceleration layer on your data lake, which remains the single source of truth, and runs in the customer cloud environment (VPC). Varada enables data teams to democratize data by operationalizing the entire data lake while ensuring interactive performance, without the need to move data, model it, or manually optimize it. Its core capability is automatically and dynamically indexing relevant data at the structure and granularity of the source. Varada enables any query to meet continuously evolving performance and concurrency requirements for users and analytics API calls, while keeping costs predictable and under control. The platform chooses which queries to accelerate and which data to index. Varada elastically adjusts the cluster to meet demand and optimize cost and performance.

Amadea

ISoft

Amadea technology relies on what ISoft describes as the fastest real-time calculation and modeling engine on the market, rated at 10 million lines per second per core for standard calculations. Speed up the creation, deployment, and automation of your analytics projects within the same integrated environment. Data quality is the key to analytical projects. Thanks to this engine, Amadea allows companies to prepare and use massive and/or complex data in real time, regardless of the volume. ISoft started from a simple observation: successful analytical projects must involve the business users at every stage. Founded on a no-code interface accessible to all types of users, Amadea allows everyone involved in analytical projects to take part. The speed of the engine also lets you specify, prototype, and build your data applications simultaneously.

StreamFlux

Fractal

Data is crucial when it comes to building, streamlining, and growing your business. However, getting the full value out of data can be a challenge: many organizations are faced with poor access to data, incompatible tools, spiraling costs, and slow results. Simply put, leaders who can turn raw data into real results will thrive in today’s landscape. The key to this is empowering everyone across your business to analyze, build, and collaborate on end-to-end AI and machine learning solutions in one place, fast. StreamFlux is a one-stop shop to meet your data analytics and AI challenges. Our self-serve platform gives you the freedom to build end-to-end data solutions, use models to answer complex questions, and assess user behaviors. Whether you’re predicting customer churn and future revenue or generating recommendations, you can go from raw data to genuine business impact in days, not months.

Apache Hive Integrations

Apache Software Foundation

120 Integrations with Apache Hive

Oracle Machine Learning

Lyftrondata

LT Browser

IRI Voracity

IRI Data Protector Suite

Xtendlabs

SAS Federation Server

WEBDEV

WINDEV

SQL

OpenText Structured Data Manager

Progress DataDirect

jethro

Baidu Sugar

Foundational

IBM watsonx.data

TapData

eQube®-DaaS

Adobe Real-Time CDP

Airtool

Cloudera Data Warehouse

CelerData Cloud

Data Virtuality

Mode

Astro by Astronomer

Nucleon Database Master

ActionIQ

Apache Spark

Amazon EMR

Nightfall

TiMi

Truedat

Privacera

Microsoft Power Query

Apache Knox

Mage Static Data Masking

Mage Dynamic Data Masking

Okera

Acceldata

Apache Sentry

lakeFS

Amundsen

Apache Kylin

Apache Zeppelin

Occubee

Apache Hudi

Cloudera Data Platform

Varada

Amadea

StreamFlux

Related Categories That Integrate With Apache Hive