Business Software for Apache Hive - Page 2

Top Software that integrates with Apache Hive as of July 2025 - Page 2

  • 1
    Apache Doris

    Apache Doris

    The Apache Software Foundation

    Apache Doris is a modern data warehouse for real-time analytics. It delivers lightning-fast analytics on real-time data at scale. Push-based micro-batch and pull-based streaming data ingestion within a second. Storage engine with real-time upsert, append and pre-aggregation. Optimize for high-concurrency and high-throughput queries with columnar storage engine, MPP architecture, cost based query optimizer, vectorized execution engine. Federated querying of data lakes such as Hive, Iceberg and Hudi, and databases such as MySQL and PostgreSQL. Compound data types such as Array, Map and JSON. Variant data type to support auto data type inference of JSON data. NGram bloomfilter and inverted index for text searches. Distributed design for linear scalability. Workload isolation and tiered storage for efficient resource management. Supports shared-nothing clusters as well as separation of storage and compute.
    Starting Price: Free
  • 2
    Hue

    Hue

    Hue

    Hue brings the best querying experience with the most intelligent autocomplete and query editor components. The tables and storage browsers leverage your existing data catalog knowledge transparently. Help users find the correct data among thousands of databases and self-document it. Assist users with their SQL queries and leverage rich previews for links, sharing from the editor directly in Slack. Several apps, each one specialized in a certain type of querying are available. Data sources can be explored first via the browsers. The editor shines for SQL queries. It comes with an intelligent autocomplete, risk alerts, and self-service troubleshooting. Dashboards focus on visualizing indexed data but can also query SQL databases. You can now search for certain cell values in the table and the results are highlighted. To make your SQL editing experience, Hue comes with one of the best SQL autocomplete on the planet.
    Starting Price: Free
  • 3
    Yandex Data Proc
    You select the size of the cluster, node capacity, and a set of services, and Yandex Data Proc automatically creates and configures Spark and Hadoop clusters and other components. Collaborate by using Zeppelin notebooks and other web apps via a UI proxy. You get full control of your cluster with root permissions for each VM. Install your own applications and libraries on running clusters without having to restart them. Yandex Data Proc uses instance groups to automatically increase or decrease computing resources of compute subclusters based on CPU usage indicators. Data Proc allows you to create managed Hive clusters, which can reduce the probability of failures and losses caused by metadata unavailability. Save time on building ETL pipelines and pipelines for training and developing models, as well as describing other iterative tasks. The Data Proc operator is already built into Apache Airflow.
    Starting Price: $0.19 per hour
  • 4
    Apache Impala
    Impala provides low latency and high concurrency for BI/analytic queries on the Hadoop ecosystem, including Iceberg, open data formats, and most cloud storage options. Impala also scales linearly, even in multitenant environments. Impala is integrated with native Hadoop security and Kerberos for authentication, and via the Ranger module, you can ensure that the right users and applications are authorized for the right data. Utilize the same file and data formats and metadata, security, and resource management frameworks as your Hadoop deployment, with no redundant infrastructure or data conversion/duplication. For Apache Hive users, Impala utilizes the same metadata and ODBC driver. Like Hive, Impala supports SQL, so you don't have to worry about reinventing the implementation wheel. With Impala, more users, whether using SQL queries or BI applications, can interact with more data through a single repository and metadata stored from source through analysis.
    Starting Price: Free
  • 5
    StarRocks

    StarRocks

    StarRocks

    Whether you're working with a single table or multiple, you'll experience at least 300% better performance on StarRocks compared to other popular solutions. From streaming data to data capture, with a rich set of connectors, you can ingest data into StarRocks in real time for the freshest insights. A query engine that adapts to your use cases. Without moving your data or rewriting SQL, StarRocks provides the flexibility to scale your analytics on demand with ease. StarRocks enables a rapid journey from data to insight. StarRocks' performance is unmatched and provides a unified OLAP solution covering the most popular data analytics scenarios. Whether you're working with a single table or multiple, you'll experience at least 300% better performance on StarRocks compared to other popular solutions. StarRocks' built-in memory-and-disk-based caching framework is specifically designed to minimize the I/O overhead of fetching data from external storage to accelerate query performance.
    Starting Price: Free
  • 6
    Apache Phoenix

    Apache Phoenix

    Apache Software Foundation

    Apache Phoenix enables OLTP and operational analytics in Hadoop for low-latency applications by combining the best of both worlds. The power of standard SQL and JDBC APIs with full ACID transaction capabilities and the flexibility of late-bound, schema-on-read capabilities from the NoSQL world by leveraging HBase as its backing store. Apache Phoenix is fully integrated with other Hadoop products such as Spark, Hive, Pig, Flume, and Map Reduce. Become the trusted data platform for OLTP and operational analytics for Hadoop through well-defined, industry-standard APIs. Apache Phoenix takes your SQL query, compiles it into a series of HBase scans, and orchestrates the running of those scans to produce regular JDBC result sets. Direct use of the HBase API, along with coprocessors and custom filters, results in performance on the order of milliseconds for small queries, or seconds for tens of millions of rows.
    Starting Price: Free
  • 7
    Stackable

    Stackable

    Stackable

    The Stackable data platform was designed with openness and flexibility in mind. It provides you with a curated selection of the best open source data apps like Apache Kafka, Apache Druid, Trino, and Apache Spark. While other current offerings either push their proprietary solutions or deepen vendor lock-in, Stackable takes a different approach. All data apps work together seamlessly and can be added or removed in no time. Based on Kubernetes, it runs everywhere, on-prem or in the cloud. stackablectl and a Kubernetes cluster are all you need to run your first stackable data platform. Within minutes, you will be ready to start working with your data. Configure your one-line startup command right here. Similar to kubectl, stackablectl is designed to easily interface with the Stackable Data Platform. Use the command line utility to deploy and manage stackable data apps on Kubernetes. With stackablectl, you can create, delete, and update components.
    Starting Price: Free
  • 8
    Inferyx

    Inferyx

    Inferyx

    Move past application silos, cost overrun, and skill obsolescence to scale faster with our intelligent data and analytics platform. An intelligent platform built to perform data management and advanced analytics. Helps you scale across the technology landscape. Our architecture understands how data flows and transforms throughout its lifecycle. Enabling the development of future-proof enterprise AI applications. A highly modular and extensible platform that enables the handling of multifold components. Designed to scale with a multi-tenant architecture. Analyzing complex data structures is made easy using advanced data visualization. Resulting in enhanced enterprise AI app development in an intuitive and low-code predictive platform. Our unique hybrid multi-cloud platform is built using open source community software which makes it immensely adaptive, highly secure, and essentially low-cost.
    Starting Price: Free
  • 9
    DataHub

    DataHub

    DataHub

    DataHub is an open source metadata platform designed to streamline data discovery, observability, and governance across diverse data ecosystems. It enables organizations to effortlessly discover trustworthy data, with experiences tailored for each person and eliminates breaking changes with detailed cross-platform and column-level lineage. DataHub builds confidence in your data by providing a comprehensive view of business, operational, and technical context, all in one place. The platform offers automated data quality checks and AI-driven anomaly detection, notifying teams when issues arise and centralizing incident tracking. With detailed lineage, documentation, and ownership information, DataHub facilitates swift issue resolution. It also automates governance programs by classifying assets as they evolve, minimizing manual work through GenAI documentation, AI-driven classification, and smart propagation. DataHub's extensible architecture supports over 70 native integrations.
    Starting Price: Free
  • 10
    Alteryx

    Alteryx

    Alteryx

    Step into a new era of analytics with the Alteryx AI Platform. Empower your organization with automated data preparation, AI-powered analytics, and approachable machine learning — all with embedded governance and security. Welcome to the future of data-driven decisions for every user, every team, every step of the way. Empower your teams with an easy, intuitive user experience allowing everyone to create analytic solutions that improve productivity, efficiency, and the bottom line. Build an analytics culture with an end-to-end cloud analytics platform and transform data into insights with self-service data prep, machine learning, and AI-generated insights. Reduce risk and ensure your data is fully protected with the latest security standards and certifications. Connect to your data and applications with open API standards.
  • 11
    Protegrity

    Protegrity

    Protegrity

    Our platform allows businesses to use data—including its application in advanced analytics, machine learning, and AI—to do great things without worrying about putting customers, employees, or intellectual property at risk. The Protegrity Data Protection Platform doesn't just secure data—it simultaneously classifies and discovers data while protecting it. You can't protect what you don't know you have. Our platform first classifies data, allowing users to categorize the type of data that can mostly be in the public domain. With those classifications established, the platform then leverages machine learning algorithms to discover that type of data. Classification and discovery finds the data that needs to be protected. Whether encrypting, tokenizing, or applying privacy methods, the platform secures the data behind the many operational systems that drive the day-to-day functions of business, as well as the analytical systems behind decision-making.
  • 12
    Algonomy

    Algonomy

    Algonomy

    Algonomy’s real-time CDP helps marketers provide consistent personalized customer engagement in the moment. It enables real-time activation of audiences by unifying customer identities across online and offline channels for a single view. It is built for retail with the ability to track customer behavior using 1,200+ measures and dimensions out of the box. It uses ML algorithms to create micro-segments, throw up deep insights and uncover marketing opportunities throughout the customer lifecycle.
  • 13
    Ataccama ONE
    Ataccama reinvents the way data is managed to create value on an enterprise scale. Unifying Data Governance, Data Quality, and Master Data Management into a single, AI-powered fabric across hybrid and Cloud environments, Ataccama gives your business and data teams the ability to innovate with unprecedented speed while maintaining trust, security, and governance of your data.
  • 14
    IBM Cloud Mass Data Migration
    IBM Cloud® Mass Data Migration uses storage devices with 120 TB of usable capacity to accelerate moving data to the cloud and overcome common transfer challenges like high costs, long transfer times and security concerns — all in a single service. Using a single IBM Cloud Mass Data Migration device, you can migrate up to 120 TB of data (at RAID-6) in just days, as opposed to weeks or months using traditional data-transfer methods. Whether you need to migrate a few terabytes or many petabytes of data, you have the flexibility to request one or multiple devices to accommodate your workload. Moving large data sets can be an expensive and time-consuming process. Use an IBM Cloud Mass Data Migration device at your location for just 50 USD per day. IBM sends you a preconfigured device for you to simply connect, ingest data and then ship back to IBM for offload into IBM Cloud Object Storage. Once offloaded, enjoy immediate access to your data in the cloud while IBM securely wipes the device.
    Starting Price: $50 per day
  • 15
    E-MapReduce
    EMR is an all-in-one enterprise-ready big data platform that provides cluster, job, and data management services based on open-source ecosystems, such as Hadoop, Spark, Kafka, Flink, and Storm. Alibaba Cloud Elastic MapReduce (EMR) is a big data processing solution that runs on the Alibaba Cloud platform. EMR is built on Alibaba Cloud ECS instances and is based on open-source Apache Hadoop and Apache Spark. EMR allows you to use the Hadoop and Spark ecosystem components, such as Apache Hive, Apache Kafka, Flink, Druid, and TensorFlow, to analyze and process data. You can use EMR to process data stored on different Alibaba Cloud data storage service, such as Object Storage Service (OSS), Log Service (SLS), and Relational Database Service (RDS). You can quickly create clusters without the need to configure hardware and software. All maintenance operations are completed on its Web interface.
  • 16
    Apache Ranger

    Apache Ranger

    The Apache Software Foundation

    Apache Ranger™ is a framework to enable, monitor and manage comprehensive data security across the Hadoop platform. The vision with Ranger is to provide comprehensive security across the Apache Hadoop ecosystem. With the advent of Apache YARN, the Hadoop platform can now support a true data lake architecture. Enterprises can potentially run multiple workloads, in a multi tenant environment. Data security within Hadoop needs to evolve to support multiple use cases for data access, while also providing a framework for central administration of security policies and monitoring of user access. Centralized security administration to manage all security related tasks in a central UI or using REST APIs. Fine grained authorization to do a specific action and/or operation with Hadoop component/tool and managed through a central administration tool. Standardize authorization method across all Hadoop components. Enhanced support for different authorization methods - Role based access control etc.
  • 17
    Vaultspeed

    Vaultspeed

    VaultSpeed

    Experience faster data warehouse automation. The Vaultspeed automation tool is built on the Data Vault 2.0 standard and a decade of hands-on experience in data integration projects. Get support for all Data Vault 2.0 objects and implementation options. Generate quality code fast for all scenarios in a Data Vault 2.0 integration system. Plug Vaultspeed into your current set-up and leverage your investments in tools and knowledge. Get guaranteed compliance with the latest Data Vault 2.0 standard. We are in continuous interaction with Scalefree, the body of knowledge for the Data Vault 2.0 community. The Data Vault 2.0 modelling approach strips the model components to their bare minimum so they can be loaded through the same loading pattern (repeatable pattern) and have the same database structure. Vaultspeed works with a template system, which understands the structure of the object types, and easy-to-set configuration parameters.
    Starting Price: €600 per user per month
  • 18
    PHEMI Health DataLab
    The PHEMI Trustworthy Health DataLab is a unique, cloud-based, integrated big data management system that allows healthcare organizations to enhance innovation and generate value from healthcare data by simplifying the ingestion and de-identification of data with NSA/military-grade governance, privacy, and security built-in. Conventional products simply lock down data, PHEMI goes further, solving privacy and security challenges and addressing the urgent need to secure, govern, curate, and control access to privacy-sensitive personal healthcare information (PHI). This improves data sharing and collaboration inside and outside of an enterprise—without compromising the privacy of sensitive information or increasing administrative burden. PHEMI Trustworthy Health DataLab can scale to any size of organization, is easy to deploy and manage, connects to hundreds of data sources, and integrates with popular data science and business analysis tools.
  • 19
    SQLyog

    SQLyog

    Webyog

    A Powerful MySQL Development and Administration Solution. SQLyog Ultimate enables database developers, administrators, and architects to visually compare, optimize, and document schemas. SQLyog Ultimate includes a power tool to automate and schedule the synchronization of data between two MySQL hosts. Create the job definition file using the interactive wizard. The tool does not require any installation on the MySQL hosts. You can use any host to run the tool. SQLyog Ultimate includes a power tool to interactively synchronize data. Run the tool in attended mode to compare data from source and target before taking action. Using the intuitive display, compare data on source and target for every row to decide whether it should be synchronized in which direction. SQLyog Ultimate includes a power tool to interactively compare and synchronize schema. View the differences between tables, indexes, columns, and routines of two databases.
  • 20
    Apache Avro

    Apache Avro

    Apache Software Foundation

    Apache Avro™ is a data serialization system. Avro provides rich data structures, a compact, fast, binary data format, a container file, to store persistent data, remote procedure call (RPC). Also, it provides simple integration with dynamic languages. Code generation is not required to read or write data files nor to use or implement RPC protocols. Code generation as an optional optimization, only worth implementing for statically typed languages. Avro relies on schemas. When Avro data is read, the schema used when writing it is always present. This permits each datum to be written with no per-value overheads, making serialization both fast and small. This also facilitates use with dynamic, scripting languages, since data, together with its schema, is fully self-describing. When Avro data is stored in a file, its schema is stored with it, so that files may be processed later by any program. If the program reading the data expects a different schema this can be easily resolved.
  • 21
    Oracle Machine Learning
    Machine learning uncovers hidden patterns and insights in enterprise data, generating new value for the business. Oracle Machine Learning accelerates the creation and deployment of machine learning models for data scientists using reduced data movement, AutoML technology, and simplified deployment. Increase data scientist and developer productivity and reduce their learning curve with familiar open source-based Apache Zeppelin notebook technology. Notebooks support SQL, PL/SQL, Python, and markdown interpreters for Oracle Autonomous Database so users can work with their language of choice when developing models. A no-code user interface supporting AutoML on Autonomous Database to improve both data scientist productivity and non-expert user access to powerful in-database algorithms for classification and regression. Data scientists gain integrated model deployment from the Oracle Machine Learning AutoML User Interface.
  • 22
    Lyftrondata

    Lyftrondata

    Lyftrondata

    Whether you want to build a governed delta lake, data warehouse, or simply want to migrate from your traditional database to a modern cloud data warehouse, do it all with Lyftrondata. Simply create and manage all of your data workloads on one platform by automatically building your pipeline and warehouse. Analyze it instantly with ANSI SQL, BI/ML tools, and share it without worrying about writing any custom code. Boost the productivity of your data professionals and shorten your time to value. Define, categorize, and find all data sets in one place. Share these data sets with other experts with zero codings and drive data-driven insights. This data sharing ability is perfect for companies that want to store their data once, share it with other experts, and use it multiple times, now and in the future. Define dataset, apply SQL transformations or simply migrate your SQL data processing logic to any cloud data warehouse.
  • 23
    LT Browser

    LT Browser

    LambdaTest

    Next-gen browser to build, test & debug mobile websites. Test website on different pre-installed mobile device view ports. See mobile view of website on android and iOS resolutions with LT Browser, a dev friendly browser for mobile view debugging. Can’t find your favorite device? With LT Browser, you can create your own custom device view port and save it for future uses. Create new mobile, tablet or desktop devices and test website on various devices, screen resolution and perform screen resolution test for website on different screen sizes. You don’t have to switch between two devices to perform mobile website test. Test on two devices simultaneously with LT Browser and perform mobile website test on different tablet and desktop sizes and inspect website on different resolutions simultaneously. LT Browser comes with DevTools to debug multiple device sizes while performing responsiveness test simultaneously. Test website on various device resolutions with separate DevTools for each.
    Starting Price: $15 per month
  • 24
    IRI Voracity

    IRI Voracity

    IRI, The CoSort Company

    Voracity is the only high-performance, all-in-one data management platform accelerating AND consolidating the key activities of data discovery, integration, migration, governance, and analytics. Voracity helps you control your data in every stage of the lifecycle, and extract maximum value from it. Only in Voracity can you: 1) CLASSIFY, profile and diagram enterprise data sources 2) Speed or LEAVE legacy sort and ETL tools 3) MIGRATE data to modernize and WRANGLE data to analyze 4) FIND PII everywhere and consistently MASK it for referential integrity 5) Score re-ID risk and ANONYMIZE quasi-identifiers 6) Create and manage DB subsets or intelligently synthesize TEST data 7) Package, protect and provision BIG data 8) Validate, scrub, enrich and unify data to improve its QUALITY 9) Manage metadata and MASTER data. Use Voracity to comply with data privacy laws, de-muck and govern the data lake, improve the reliability of your analytics, and create safe, smart test data
  • 25
    IRI Data Protector Suite

    IRI Data Protector Suite

    IRI, The CoSort Company

    The IRI Data Protector suite contains multiple data masking products which can be licensed standalone or in a discounted bundle to profile, classify, search, mask, and audit PII and other sensitive information in structured, semi-structured, and unstructured data sources. Apply their many masking functions consistently for referential integrity: ​IRI FieldShield® Structured Data Masking FieldShield classifies, finds, de-identifies, risk-scores, and audits PII in databases, flat files, JSON, etc. IRI DarkShield® Semi & Unstructured Data Masking DarkShield classifies, finds, and deletes PII in text, pdf, Parquet, C/BLOBs, MS documents, logs, NoSQL DBs, images, and faces. IRI CellShield® Excel® Data Masking CellShield finds, reports on, masks, and audits changes to PII in Excel columns and values LAN-wide or in the cloud. IRI Data Masking as a Service IRI DMaaS engineers in the US and abroad do the work of classifying, finding, masking, and risk-scoring PII for you.
  • 26
    Xtendlabs

    Xtendlabs

    Xtendlabs

    Installing, and configuring today’s complex software technology platforms takes an extraordinary investment in time and resources. Not with Xtendlabs. Xtendlabs Emerging Technology Platform-as-a-Services provides immediate access to emerging Big Data, Data Sciences, and Database technology platforms online, from any device and location, 24/7. Xtendlabs are available on-demand, any time, from any location, including home, office or the road. Xtendlabs scale to meet your needs on-demand, so you can focus on your business problem and learning rather than struggling to find and set up infrastructure . Just sign-in to get immediate access to your virtual lab environment. Xtendlabs requires no virtual machine installation, system setup or configuration, saving valuable time and resources. Pay as you go monthly. With Xtendlabs there are no upfront investments in software or hardware.
  • 27
    SAS Federation Server
    Create federated source data names to enable users to access multiple data sources via the same connection. Use the web-based administrative console for simplified maintenance of user access, privileges and authorizations. Apply data quality functions such as match-code generation, parsing and other tasks inside the view. Improved performance with in-memory data caches & scheduling. Secured information with data masking & encryption. Lets you keep application queries current and available to users, and reduce loads on operational systems. Enables you to define access permissions for a user or group at the catalog, schema, table, column and row levels. Advanced data masking and encryption capabilities let you determine not only who’s authorized to view your data, but also what they see on an extremely granular level. It all helps ensure sensitive data doesn’t fall into the wrong hands.
  • 28
    WEBDEV

    WEBDEV

    Windev

    Responsive web design, WEBDEV allows you to easily develop Internet and Intranet sites and applications (WEB & SaaS) to manage data and processes. WEBDEV also generates PHP. WINDEV supports all databases. WEBDEV also supports all the databases that use ODBC drivers or OLEDB providers. The WINDEV, WEBDEV and WINDEV Mobile environments are compatible and share project elements. It has never been easier to build multi-target applications. The developer can focus on key business requirements, and not on the code, applications can finally meet your needs. Up to 20 times less code, develop applications in no time! Shorter time to market, allows you to gain market share. Software is easier to develop and improved reliability. Complete application RAD generator for PC, web, and mobile, template creation (patterns, inheritance & MVP). The ease of use and speed that allow you to develop and realize even your most ambitious projects.
    Starting Price: $1,703 one-time payment
  • 29
    WINDEV

    WINDEV

    Windev

    Thanks to its full integration, legendary ease of use, and advanced technology, WINDEV allows you to easily develop large-scale projects in Windows, Linux, .NET, Java, and much more! (Full compatibility with web, mobile, Android, iOS, etc.) WINDEV creates applications for Windows, Linux, and Mac. WEBDEV recompiles them for the Internet. WINDEV Mobile recompiles them for tablets or smartphones. Use the same project, interfaces, objects, elements, and source code regardless of the target. Make the most of your developments, and deploy faster on all devices. The ability to simply recompile an application for different targets is a decisive advantage. It guarantees continuity and the ability to respond to changes. Several automatic features are available. Portable code and objects (for Web browsers and Mobile code). WINDEV supports all the databases that use ODBC drivers or OLEDB providers.
    Starting Price: $1,768 one-time payment
  • 30
    OpenText Voltage Structured Data Manager
    SDM manages structured data over its entire lifecycle. Providing data discovery, insight, protection, and management while reducing TCO of application infrastructure. Discover data, document it, and act on it. Structured Data Manager offers the out-of-the-box discovery of sensitive data. Manage and protect privacy throughout the lifecycle of data. Protection doesn’t have to limit accessibility. Structured Data Manager discovers and protects sensitive data, preserves its business value, and controls database growth. Scan for personal and sensitive data in databases, classify your data, and generate remediation processes.