Business Software for Hadoop - Page 2

Top Software that integrates with Hadoop as of August 2025 - Page 2

  • 1
    Hostmaster

    Hostmaster

    First-class reliable web hosting at affordable prices. Experience our speedy, robust servers, our feature-filled packages and our helpful support team 24/7, 365 days a year, all at a price you'll never believe! Host your personal or business website on our robust servers with our feature-packed shared hosting plans. Run your very own web hosting business with our all-inclusive reseller web hosting plans. Feel the benefit of our powerful servers, redundant network and our professional management team, keeping your data secure. All accounts are remotely backed up, every day. Manage every aspect of your client's web hosting experience with ease using cPanel's intuitive WebHostManager. Install advanced web scripts with the click of a button. Design a professional website in minutes, with 100+ fully customizable templates and our SiteBuilder. Our professional support team is available around the clock, every day of the year.
    Starting Price: $4.95 per month
  • 2
    IBM Analytics Engine
    IBM Analytics Engine provides an architecture for Hadoop clusters that decouples the compute and storage tiers. Instead of a permanent cluster formed of dual-purpose nodes, the Analytics Engine allows users to store data in an object storage layer such as IBM Cloud Object Storage and spins up clusters of compute nodes when needed. Separating compute from storage helps to transform the flexibility, scalability and maintainability of big data analytics platforms. Build on an ODPi-compliant stack with pioneering data science tools and the broader Apache Hadoop and Apache Spark ecosystem. Define clusters based on your application's requirements. Choose the appropriate software pack, version, and size of the cluster. Use it as long as required and delete it as soon as an application finishes its jobs. Configure clusters with third-party analytics libraries and packages. Deploy workloads from IBM Cloud services like machine learning.
    Starting Price: $0.014 per hour
  • 3
    Elastic Observability
    Rely on the most widely deployed observability platform available, built on the proven Elastic Stack (also known as the ELK Stack) to converge silos, delivering unified visibility and actionable insights. To effectively monitor and gain insights across your distributed systems, you need to have all your observability data in one stack. Break down silos by bringing together the application, infrastructure, and user data into a unified solution for end-to-end observability and alerting. Combine limitless telemetry data collection and search-powered problem resolution in a unified solution for optimal operational and business results. Converge data silos by ingesting all your telemetry data (metrics, logs, and traces) from any source in an open, extensible, and scalable platform. Accelerate problem resolution with automatic anomaly detection powered by machine learning and rich data analytics.
    Starting Price: $16 per month
  • 4
    Dataplane

    Dataplane

    The concept behind Dataplane is to make it quicker and easier to construct a data mesh with robust data pipelines and automated workflows for businesses and teams of all sizes. In addition to being more user friendly, there has been an emphasis on scaling, resilience, performance and security.
    Starting Price: Free
  • 5
    Normalyze

    Normalyze

    Our agentless data discovery and scanning platform is easy to connect to any cloud account (AWS, Azure and GCP). There is nothing for you to deploy or manage. We support all native cloud data stores, structured or unstructured, across all three clouds. Normalyze scans both structured and unstructured data within your cloud accounts and only collects metadata to add to the Normalyze graph. No sensitive data is collected at any point during scanning. Display a graph of access and trust relationships that includes deep context with fine-grained process names, data store fingerprints, IAM roles and policies in real time. Quickly locate all data stores containing sensitive data, find all access paths, and score potential breach paths based on sensitivity, volume, and permissions to show all breaches waiting to happen. Categorize and identify sensitive data based on industry profiles such as PCI, HIPAA, and GDPR.
    Starting Price: $14,995 per year
  • 6
    Superblocks

    Superblocks

    Superblocks is a programmable IDE for developers to build any internal app, workflow, or scheduled job at a fraction of the time and cost. Ship next month's roadmap this week. Quickly build apps, workflows & jobs connected to your data. Secure with granular permissions (RBAC), SSO, audit logs, and secret management in seconds. Deploy with Git and monitor production. Extend anything with code. No need to learn React, HTML, or CSS. Drag and drop components, connect them to data and make your app dynamic by triggering APIs. Build custom KYC, Compliance, AML, and credit approval tools to enable robust support processes to increase the velocity of your support team. Stop wrestling with CLIs. Quickly build admin panels for your datastores to read, write, and update your customer data with tables, forms, and charts. Clark is an AI-powered app builder that helps teams quickly create secure internal enterprise applications using their own design systems, permissions, and integrations.
    Starting Price: $0 per month
  • 7
    Dialogic OnDemand Voicemail
    Dialogic OnDemand Voicemail is all software and can run in virtualized environments, allowing you to share resources and reduce service delivery costs. It minimizes the number of mailboxes needed by creating temporary resources that can be shared across subscribers while maintaining the same privacy and security standards as permanent mailboxes. Legacy platforms are also expensive to maintain and need extra space and power. Upgrading to a fully virtualized, on-demand platform will lower your operational costs without compromising service. And with an easy-to-use interface that is designed to enhance your subscribers’ self-service abilities, your customer care costs will be reduced too. Enable dynamic and temporary voice mailboxes. Assign a mailbox to a customer only when needed. Reduce the number of voice mailboxes and their cost. Access anywhere and on any device. Give your voicemail service a new look by making it visual, and give customers the latest features at the same time.
    Starting Price: Free
  • 8
    muCommander

    muCommander

    muCommander is an open-source, lightweight, dual-pane file manager available on all major operating systems. Copy, move, rename, batch rename, and email files. Multiple tabs and universal bookmarks. Credentials manager. Configurable keyboard shortcuts. Cloud storage: Dropbox and Google Drive. Virtual filesystem with support for local volumes, FTP, SFTP, SMB, NFS, HTTP, Amazon S3, Hadoop HDFS, and Bonjour. Archives: ZIP, RAR, 7z, TAR, GZip, BZip2, ISO/NRG, AR/Deb, LST. Checksum calculation. Fully customizable user interface, configurable toolbars, and themes. Available in many languages. Java 11 or later is required to run muCommander. Report bugs, suggest new features, answer questions, write documentation, create video tutorials or translate the user interface. To start Open Office, open the document "natively" (mapped to Shift+Enter by default).
    Starting Price: Free
  • 9
    ELCA Smart Data Lake Builder
    Classical Data Lakes are often reduced to basic but cheap raw data storage, neglecting significant aspects like transformation, data quality and security. These topics are left to data scientists, who end up spending up to 80% of their time acquiring, understanding and cleaning data before they can start using their core competencies. In addition, classical Data Lakes are often implemented by separate departments using different standards and tools, which makes it harder to implement comprehensive analytical use cases. Smart Data Lakes solve these various issues by providing architectural and methodical guidelines, together with an efficient tool to build a strong high-quality data foundation. Smart Data Lakes are at the core of any modern analytics platform. Their structure easily integrates prevalent Data Science tools and open source technologies, as well as AI and ML. Their storage is cheap and scalable, supporting both unstructured data and complex data structures.
    Starting Price: Free
  • 10
    Akira AI

    Akira AI

    Akira.ai provides businesses with Agentic AI, a set of specialized AI agents designed to optimize and automate complex workflows across various industries. These AI agents collaborate with human teams, enhancing productivity, making real-time decisions, and automating repetitive tasks, such as data analysis, incident management, and HR processes. The platform integrates smoothly with existing systems, including CRMs and ERPs, ensuring a disruption-free transition to AI-enhanced operations. Akira’s AI agents help businesses streamline their operations, increase decision-making speed, and boost overall efficiency, driving innovation across sectors like manufacturing, finance, and IT.
    Starting Price: $15 per month
  • 11
    Indexima Data Hub
    Reshape your perception of time in data analytics. Instantly access your business’s data and work directly on your dashboard without going back and forth with the IT team. Meet Indexima DataHub, a new space-time where operational and functional users gain instant access to their data. With a combination of its unique indexing engine and machine learning, Indexima allows businesses to access all their data to simplify and speed up analytics. Robust and scalable, the solution allows organizations to query all their data directly at the source, in volumes of tens of billions of rows in just a few milliseconds. Our Indexima platform allows users to implement instant analytics on all their data in just one click. Thanks to Indexima’s new ROI and TCO calculator, find out in 30 seconds the ROI of your data platform. Cut infrastructure costs, project deployment time, and data engineering costs while boosting your analytical performance.
    Starting Price: $3,290 per month
  • 12
    Yandex Data Proc
    You select the size of the cluster, node capacity, and a set of services, and Yandex Data Proc automatically creates and configures Spark and Hadoop clusters and other components. Collaborate by using Zeppelin notebooks and other web apps via a UI proxy. You get full control of your cluster with root permissions for each VM. Install your own applications and libraries on running clusters without having to restart them. Yandex Data Proc uses instance groups to automatically increase or decrease computing resources of compute subclusters based on CPU usage indicators. Data Proc allows you to create managed Hive clusters, which can reduce the probability of failures and losses caused by metadata unavailability. Save time on building ETL pipelines and pipelines for training and developing models, as well as describing other iterative tasks. The Data Proc operator is already built into Apache Airflow.
    Starting Price: $0.19 per hour
  • 13
    Apache Impala
    Impala provides low latency and high concurrency for BI/analytic queries on the Hadoop ecosystem, including Iceberg, open data formats, and most cloud storage options. Impala also scales linearly, even in multitenant environments. Impala is integrated with native Hadoop security and Kerberos for authentication, and via the Ranger module, you can ensure that the right users and applications are authorized for the right data. Utilize the same file and data formats and metadata, security, and resource management frameworks as your Hadoop deployment, with no redundant infrastructure or data conversion/duplication. For Apache Hive users, Impala utilizes the same metadata and ODBC driver. Like Hive, Impala supports SQL, so you don't have to worry about reinventing the implementation wheel. With Impala, more users, whether using SQL queries or BI applications, can interact with more data through a single repository and metadata stored from source through analysis.
    Starting Price: Free
  • 14
    Apache Phoenix

    Apache Software Foundation

    Apache Phoenix enables OLTP and operational analytics in Hadoop for low-latency applications by combining the best of both worlds: the power of standard SQL and JDBC APIs with full ACID transaction capabilities, and the flexibility of late-bound, schema-on-read capabilities from the NoSQL world by leveraging HBase as its backing store. Apache Phoenix is fully integrated with other Hadoop products such as Spark, Hive, Pig, Flume, and MapReduce. Become the trusted data platform for OLTP and operational analytics for Hadoop through well-defined, industry-standard APIs. Apache Phoenix takes your SQL query, compiles it into a series of HBase scans, and orchestrates the running of those scans to produce regular JDBC result sets. Direct use of the HBase API, along with coprocessors and custom filters, results in performance on the order of milliseconds for small queries, or seconds for tens of millions of rows.
    Starting Price: Free
  • 15
    Inferyx

    Inferyx

    Move past application silos, cost overruns, and skill obsolescence to scale faster with our intelligent data and analytics platform. It is built to perform data management and advanced analytics, and helps you scale across the technology landscape. Our architecture understands how data flows and transforms throughout its lifecycle, enabling the development of future-proof enterprise AI applications. A highly modular and extensible platform that enables the handling of multifold components. Designed to scale with a multi-tenant architecture. Analyzing complex data structures is made easy using advanced data visualization, resulting in enhanced enterprise AI app development in an intuitive and low-code predictive platform. Our unique hybrid multi-cloud platform is built using open source community software, which makes it immensely adaptive, highly secure, and essentially low-cost.
    Starting Price: Free
  • 16
    Apache Trafodion

    Apache Software Foundation

    Apache Trafodion is a webscale SQL-on-Hadoop solution enabling transactional or operational workloads on Apache Hadoop. Trafodion builds on the scalability, elasticity, and flexibility of Hadoop. Trafodion extends Hadoop to provide guaranteed transactional integrity, enabling new kinds of big data applications to run on Hadoop. Full-functioned ANSI SQL language support. JDBC/ODBC connectivity for Linux/Windows clients. Distributed ACID transaction protection across multiple statements, tables, and rows. Performance improvements for OLTP workloads with compile-time and run-time optimizations. Support for large data sets using a parallel-aware query optimizer. Reuse existing SQL skills and improve developer productivity. Distributed ACID transactions guarantee data consistency across multiple rows and tables. Interoperability with existing tools and applications. Hadoop and Linux distribution neutral. Easy to add to your existing Hadoop infrastructure.
    Starting Price: Free
  • 17
    Alteryx

    Alteryx

    Step into a new era of analytics with the Alteryx AI Platform. Empower your organization with automated data preparation, AI-powered analytics, and approachable machine learning — all with embedded governance and security. Welcome to the future of data-driven decisions for every user, every team, every step of the way. Empower your teams with an easy, intuitive user experience allowing everyone to create analytic solutions that improve productivity, efficiency, and the bottom line. Build an analytics culture with an end-to-end cloud analytics platform and transform data into insights with self-service data prep, machine learning, and AI-generated insights. Reduce risk and ensure your data is fully protected with the latest security standards and certifications. Connect to your data and applications with open API standards.
  • 18
    OpenText Analytics Database (Vertica)
    OpenText Analytics Database is a high-performance, scalable analytics platform that enables organizations to analyze massive data sets quickly and cost-effectively. It supports real-time analytics and in-database machine learning to deliver actionable business insights. The platform can be deployed flexibly across hybrid, multi-cloud, and on-premises environments to optimize infrastructure and reduce total cost of ownership. Its massively parallel processing (MPP) architecture handles complex queries efficiently, regardless of data size. OpenText Analytics Database also features compatibility with data lakehouse architectures, supporting formats like Parquet and ORC. With built-in machine learning and broad language support, it empowers users from SQL experts to Python developers to derive predictive insights.
  • 19
    BigID

    BigID

    BigID is data visibility and control for all types of data, everywhere. Reimagine data management for privacy, security, and governance across your entire data landscape. With BigID, you can automatically discover and manage personal and sensitive data – and take action for privacy, protection, and perspective. BigID uses advanced machine learning and data intelligence to help enterprises better manage and protect their customer & sensitive data, meet data privacy and protection regulations, and leverage unmatched coverage for all data across all data stores.
  • 20
    Ataccama ONE
    Ataccama reinvents the way data is managed to create value on an enterprise scale. Unifying Data Governance, Data Quality, and Master Data Management into a single, AI-powered fabric across hybrid and Cloud environments, Ataccama gives your business and data teams the ability to innovate with unprecedented speed while maintaining trust, security, and governance of your data.
  • 21
    Quorso

    Quorso

    Powering management to drive business performance. Management is slow, in-person and fragmented, making rapid, data-driven collaboration difficult. Quorso joins up management in a single tool – connecting your KPIs to your data, team and actions to power business performance. Create KPIs in seconds, then sit back as Quorso hunts through your data, finding actionable insights for each team member. Quorso helps your team deliver each action, then measures the impact so everyone learns what works. Quorso helps you remotely manage, engage and collaborate with your team – so it feels like you are on site every day. Quorso shows you how every action by every team member is improving your KPIs. Quorso boosts management productivity across every area of your business.
  • 22
    Fluentd

    Fluentd Project

    A single, unified logging layer is key to making log data accessible and usable. However, existing tools fall short: legacy tools are not built with new cloud APIs and microservice-oriented architecture in mind and are not innovating quickly enough. Fluentd, created by Treasure Data, solves the challenges of building a unified logging layer with a modular architecture, an extensible plugin model, and a performance-optimized engine. In addition to these features, Fluentd Enterprise addresses enterprise requirements such as Trusted Packaging, Security, Certified Enterprise Connectors, Management / Monitoring, Enterprise SLA-Based Support, Assurance, and Enterprise Consulting Services.
  • 23
    Greenovative

    Greenovative Energy

    Greenovative Energy is a smart sustainability platform enabling industries to manage energy, water, and emissions using AI, IoT, and real-time analytics. Based in Pune, we help businesses cut costs, meet compliance, and accelerate their net-zero goals. Our unified, AI-powered system integrates with enterprise tools to offer predictive insights, automated workflows, and intuitive dashboards. Our solutions span energy optimization, water tracking, asset lifecycle management, and a Net Zero Transition Program, designed for industrial setups like manufacturing plants and for ESG-focused teams. Recognized with ISO 50001 and ISO 27001, and featured in LinkedIn Top Startups and Microsoft for Startups, Greenovative is your trusted partner for achieving sustainability with impact.
  • 24
    Greenplum

    Greenplum Database

    Greenplum Database® is an advanced, fully featured, open source data warehouse. It provides powerful and rapid analytics on petabyte-scale data volumes. Uniquely geared toward big data analytics, Greenplum Database is powered by the world’s most advanced cost-based query optimizer, delivering high analytical query performance on large data volumes. The Greenplum Database® project is released under the Apache 2 license. We want to thank all our current community contributors and are interested in all new potential contributions. For the Greenplum Database community, no contribution is too small; we encourage all types of contributions. An open-source massively parallel data platform for analytics, machine learning and AI. Rapidly create and deploy models for complex applications in cybersecurity, predictive maintenance, risk management, fraud detection, and many other areas. Experience the fully featured, integrated, open source analytics platform.
  • 25
    HugeGraph

    HugeGraph

    HugeGraph is a fast and highly scalable graph database. Billions of vertices and edges can be easily stored in and queried from HugeGraph thanks to its excellent OLTP ability. Because it complies with the Apache TinkerPop 3 framework, various complicated graph queries can be accomplished through Gremlin (a powerful graph traversal language). Its features include: compliance with Apache TinkerPop 3, supporting Gremlin. Schema metadata management, including VertexLabel, EdgeLabel, PropertyKey and IndexLabel. Multi-type indexes, supporting exact queries, range queries and queries combining complex conditions. A plug-in backend store driver framework, currently supporting RocksDB, Cassandra, ScyllaDB, HBase and MySQL, with other backend store drivers easy to add if needed. Integration with Hadoop/Spark. HugeGraph relies on the TinkerPop framework and draws on the storage structure of Titan and the schema definition of DataStax.
  • 26
    Apache Ranger

    The Apache Software Foundation

    Apache Ranger™ is a framework to enable, monitor and manage comprehensive data security across the Hadoop platform. The vision with Ranger is to provide comprehensive security across the Apache Hadoop ecosystem. With the advent of Apache YARN, the Hadoop platform can now support a true data lake architecture. Enterprises can potentially run multiple workloads in a multi-tenant environment. Data security within Hadoop needs to evolve to support multiple use cases for data access, while also providing a framework for central administration of security policies and monitoring of user access. Centralized security administration to manage all security-related tasks in a central UI or using REST APIs. Fine-grained authorization to perform a specific action or operation with a Hadoop component or tool, managed through a central administration tool. Standardized authorization method across all Hadoop components. Enhanced support for different authorization methods, such as role-based access control.
  • 27
    PHEMI Health DataLab
    The PHEMI Trustworthy Health DataLab is a unique, cloud-based, integrated big data management system that allows healthcare organizations to enhance innovation and generate value from healthcare data by simplifying the ingestion and de-identification of data with NSA/military-grade governance, privacy, and security built in. Conventional products simply lock down data. PHEMI goes further, solving privacy and security challenges and addressing the urgent need to secure, govern, curate, and control access to privacy-sensitive personal healthcare information (PHI). This improves data sharing and collaboration inside and outside of an enterprise, without compromising the privacy of sensitive information or increasing administrative burden. PHEMI Trustworthy Health DataLab can scale to any size of organization, is easy to deploy and manage, connects to hundreds of data sources, and integrates with popular data science and business analysis tools.
  • 28
    Informatica Persistent Data Masking
    Retain context, form, and integrity while preserving privacy. Enhance data protection by de-sensitizing and de-identifying sensitive data, and pseudonymize data for privacy compliance and analytics. Obscured data retains its context, and referential integrity remains consistent, so the masked data can be used in testing, analytics, or support environments. As a highly scalable, high-performance data masking solution, Informatica Persistent Data Masking shields confidential data—such as credit card numbers, addresses, and phone numbers—from unintended exposure by creating realistic, de-identified data that can be shared safely internally or externally. It also allows you to reduce the risk of data breaches in nonproduction environments, produce higher-quality test data and streamline development projects, and ensure compliance with data-privacy mandates and regulations.
  • 29
    Actian Avalanche
    Actian Avalanche is a fully managed hybrid cloud data warehouse service designed from the ground up to deliver high performance and scale across all dimensions – data volume, concurrent users, and query complexity – at a fraction of the cost of alternative solutions. It is a true hybrid platform that can be deployed on-premises as well as on multiple clouds, including AWS, Azure, and Google Cloud, enabling you to migrate or offload applications and data to the cloud at your own pace. Actian Avalanche delivers the best price-performance in the industry out of the box, without DBA tuning and optimization techniques. For the same cost as alternative solutions, you can benefit from substantially better performance or choose the same performance for significantly lower cost. For example, Avalanche provides up to 6x the price-performance advantage over Snowflake as measured by GigaOm’s TPC-H industry standard benchmark and even more against many of the appliance vendors.
  • 30
    Toad

    Quest

    Toad Software is a database management toolset from Quest that database developers, database administrators and data analysts use to manage both relational and non-relational databases using SQL. Take a proactive approach to database management. Re-focus your teams on more strategic initiatives, and move your business forward in today’s data-driven economy. Toad solutions enable you to maximize your investment in data technology by empowering data professionals to automate processes, minimize risks and cut project delivery timelines by nearly half. Lower the total cost of ownership for new applications by reducing the impact of inefficient code on productivity, future development cycles, performance and availability. See why millions of users trust Toad for their most critical systems and data environments. It’s time to gain the competitive edge. Work smarter and meet the demands of today’s database environments.