Compare the Top Big Data Platforms for Linux as of January 2026 - Page 2

  • 1
    Oracle Big Data Preparation
    Oracle Big Data Preparation Cloud Service is a managed Platform as a Service (PaaS) cloud-based offering that enables you to rapidly ingest, repair, enrich, and publish large data sets with end-to-end visibility in an interactive environment. You can integrate your data with other Oracle Cloud Services, such as Oracle Business Intelligence Cloud Service, for down-stream analysis. Profile metrics and visualizations are important features of Oracle Big Data Preparation Cloud Service. When a data set is ingested, you have visual access to the profile results and summary of each column that was profiled, and the results of duplicate entity analysis completed on your entire data set. Visualize governance tasks on the service Home page with easily understood runtime metrics, data health reports, and alerts. Keep track of your transforms and ensure that files are processed correctly. See the entire data pipeline, from ingestion to enrichment and publishing.
  • 2
    IRI Voracity

    IRI Voracity

    IRI, The CoSort Company

    Voracity is the only high-performance, all-in-one data management platform accelerating AND consolidating the key activities of data discovery, integration, migration, governance, and analytics. Voracity helps you control your data in every stage of the lifecycle, and extract maximum value from it. Only in Voracity can you: 1) CLASSIFY, profile and diagram enterprise data sources 2) Speed or LEAVE legacy sort and ETL tools 3) MIGRATE data to modernize and WRANGLE data to analyze 4) FIND PII everywhere and consistently MASK it for referential integrity 5) Score re-ID risk and ANONYMIZE quasi-identifiers 6) Create and manage DB subsets or intelligently synthesize TEST data 7) Package, protect and provision BIG data 8) Validate, scrub, enrich and unify data to improve its QUALITY 9) Manage metadata and MASTER data. Use Voracity to comply with data privacy laws, de-muck and govern the data lake, improve the reliability of your analytics, and create safe, smart test data
  • 3
    jethro

    jethro

    jethro

    Data-driven decision-making has unleashed a surge of business data and a rise in user demand to analyze it. This trend drives IT departments to migrate off expensive Enterprise Data Warehouses (EDW) toward cost-effective Big Data platforms like Hadoop or AWS. These new platforms come with a Total Cost of Ownership (TCO) that is about 10 times lower. They are not ideal for interactive BI applications, however, as they fail to match the high performance and user concurrency of legacy EDWs. For this exact reason, we developed Jethro. Customers use Jethro for interactive BI on Big Data. Jethro is a transparent middle tier that requires no changes to existing apps or data. It is self-driving with no maintenance required. Jethro is compatible with BI tools like Tableau, Qlik, and Microstrategy and is data source agnostic. Jethro delivers on the demands of business users allowing for thousands of concurrent users to run complicated queries over billions of records.
  • 4
    Analance
    Combining Data Science, Business Intelligence, and Data Management Capabilities in One Integrated, Self-Serve Platform. Analance is a robust, salable end-to-end platform that combines Data Science, Advanced Analytics, Business Intelligence, and Data Management into one integrated self-serve platform. It is built to deliver core analytical processing power to ensure data insights are accessible to everyone, performance remains consistent as the system grows, and business objectives are continuously met within a single platform. Analance is focused on turning quality data into accurate predictions allowing both data scientists and citizen data scientists with point and click pre-built algorithms and an environment for custom coding. Company – Overview Ducen IT helps Business and IT users of Fortune 1000 companies with advanced analytics, business intelligence and data management through its unique end-to-end data science platform called Analance.
  • 5
    Centralpoint
    Centralpoint is a Digital Experience Platform, and in Gartner's Magic Quadrant. It is used by over 350 clients worldwide going beyond Enterprise Content Management, securely authenticating (AD/SAML,OpenID, oAuth) all users for self service interaction. Centralpoint automatically aggregates your information from disparate sources, applying rich metadata against your rules, yielding true Knowledge Management; allowing you to search and relate disparate sets of data from anywhere. Centralpoint offers the most robust Module Gallery, out of the box, and can be installed on premise or in the Cloud. Be sure to see our solutions for Automating Metadata, Automating retention Policy Management, and simplifying the mash up of disparate data for the benefit of AI (Artificial Intelligence). Centralpoint is often used as an intelligent altternative to Sharepoint, allowing easy Migration tools. It can also be used for any secure portal solution for your public sites, Intranets, Members or Extranets.
  • 6
    Astro by Astronomer
    For data teams looking to increase the availability of trusted data, Astronomer provides Astro, a modern data orchestration platform, powered by Apache Airflow, that enables the entire data team to build, run, and observe data pipelines-as-code. Astronomer is the commercial developer of Airflow, the de facto standard for expressing data flows as code, used by hundreds of thousands of teams across the world.
  • 7
    SAP HANA
    SAP HANA in-memory database is for transactional and analytical workloads with any data type — on a single data copy. It breaks down the transactional and analytical silos in organizations, for quick decision-making, on premise and in the cloud. Innovate without boundaries on a database management system, where you can develop intelligent and live solutions for quick decision-making on a single data copy. And with advanced analytics, you can support next-generation transactional processing. Build data solutions with cloud-native scalability, speed, and performance. With the SAP HANA Cloud database, you can gain trusted, business-ready information from a single solution, while enabling security, privacy, and anonymization with proven enterprise reliability. An intelligent enterprise runs on insight from data – and more than ever, this insight must be delivered in real time.
  • 8
    Sigma

    Sigma

    Sigma Computing

    Sigma is a modern business intelligence (BI) and analytics application built for the cloud. Trusted by data-first companies, Sigma provides live access to cloud data warehouses using an intuitive spreadsheet interface empowering business experts to ask more of their data without writing a single line of code. With the full power of SQL, the cloud, and a familiar interface, business users have the freedom to analyze data in real time without limits. Sigma is self-service analytics as it was meant to be.
  • 9
    HEAVY.AI

    HEAVY.AI

    HEAVY.AI

    HEAVY.AI is the pioneer in accelerated analytics. The HEAVY.AI platform is used in business and government to find insights in data beyond the limits of mainstream analytics tools. Harnessing the massive parallelism of modern CPU and GPU hardware, the platform is available in the cloud and on-premise. HEAVY.AI originated from research at Harvard and MIT Computer Science and Artificial Intelligence Laboratory (CSAIL). Expand beyond the limitations of traditional BI and GIS by leveraging the full power of modern GPU and CPU hardware so you can extract decision-quality information from your massive datasets without lag. Unify and explore your largest geospatial and time-series datasets to get the complete picture of the what, when, and where. Combine interactive visual analytics, hardware-accelerated SQL, and an advanced analytics & data science framework to find opportunity and risk hidden in your enterprise when you need to most.
  • 10
    TiMi

    TiMi

    TIMi

    With TIMi, companies can capitalize on their corporate data to develop new ideas and make critical business decisions faster and easier than ever before. The heart of TIMi’s Integrated Platform. TIMi’s ultimate real-time AUTO-ML engine. 3D VR segmentation and visualization. Unlimited self service business Intelligence. TIMi is several orders of magnitude faster than any other solution to do the 2 most important analytical tasks: the handling of datasets (data cleaning, feature engineering, creation of KPIs) and predictive modeling. TIMi is an “ethical solution”: no “lock-in” situation, just excellence. We guarantee you a work in all serenity and without unexpected extra costs. Thanks to an original & unique software infrastructure, TIMi is optimized to offer you the greatest flexibility for the exploration phase and the highest reliability during the production phase. TIMi is the ultimate “playground” that allows your analysts to test the craziest ideas!
  • 11
    Hypertable

    Hypertable

    Hypertable

    Hypertable delivers scalable database capacity at maximum performance to speed up your big data application and reduce your hardware footprint. Hypertable delivers maximum efficiency and superior performance over the competition which translates into major cost savings. A proven scalable design that powers hundreds of Google services. All the benefits of open source with a strong and thriving community. C++ implementation for optimum performance. 24/7/365 support for your business-critical big data application. Unparalleled access to Hypertable brain power by the employer of all core Hypertable developers. Hypertable was designed for the express purpose of solving the scalability problem, a problem that is not handled well by a traditional RDBMS. Hypertable is based on a design developed by Google to meet their scalability requirements and solves the scale problem better than any of the other NoSQL solutions out there.
  • 12
    Apache Gobblin

    Apache Gobblin

    Apache Software Foundation

    A distributed data integration framework that simplifies common aspects of Big Data integration such as data ingestion, replication, organization, and lifecycle management for both streaming and batch data ecosystems. Runs as a standalone application on a single box. Also supports embedded mode. Runs as an mapreduce application on multiple Hadoop versions. Also supports Azkaban for launching mapreduce jobs. Runs as a standalone cluster with primary and worker nodes. This mode supports high availability and can run on bare metals as well. Runs as an elastic cluster on public cloud. This mode supports high availability. Gobblin as it exists today is a framework that can be used to build different data integration applications like ingest, replication, etc. Each of these applications is typically configured as a separate job and executed through a scheduler like Azkaban.
  • 13
    DataSort

    DataSort

    Inventale

    A portal based on mobile- and enriched third-party data that allows one to: — reconstruct users’ sociodemographic (gender, age) — develop user segments (eg., young parents, frequent travellers, blue collars, university students, wealthy residents, etc.) — provide analytics according to clients’ requirements (places with users’ concentrations, customers’ loyalty, trends and variances, comparison with competitors, etc.) — determine the best location for opening a new kindergarten/supermarket/mall based on users' concentration, interests and sociodemographic factors. The solution started as a custom project for one of our UAE clients, but due to high demand further developed into a full-scale product that helps different businesses to answer important questions and solve principal tasks such as: — launch of granular targeted ad campaigns; — finding the best location for opening a business unit; — identification of best locations for placing outdoor banners and so on.
    Starting Price: $50,000
  • 14
    WhereScape

    WhereScape

    WhereScape Software

    WhereScape helps IT organizations of all sizes leverage automation to design, develop, deploy, and operate data infrastructure faster. More than 700 customers worldwide rely on WhereScape automation to eliminate hand-coding and other repetitive, time-intensive aspects of data infrastructure projects to deliver data warehouses, vaults, lakes and marts in days or weeks rather than in months or years. From data warehouses and vaults to data lakes and marts, deliver data infrastructure and big data integration fast. Quickly and easily plan, model and design all types of data infrastructure projects. Use sophisticated data discovery and profiling capabilities to bulletproof design and rapid prototyping to collaborate earlier with business users. Fast-track the development, deployment and operation of your data infrastructure projects. Dramatically reduce the delivery time, effort, cost and risk of new projects, and better position projects for future business change.
  • 15
    Toustone

    Toustone

    Toustone

    Toustone are data people with a passion for making every business data-driven. Located in regional Australia, Albury-Wodonga, Toustone have developed a BI solution for several different industries to support in addressing challenges across productivity, data visibility, and cohesion along with data trust that will enable you to make better more informed data-driven decisions. Tailored solutions that offer: • Hosting & Data Warehousing • Data Modelling & Visualisation • AI & Data Science • Fully Managed Service This is a fully integrated and customisable solution based on real industry experience to enable any business to produce automated daily KPI reports and visual dashboards. This will give you the ability to quickly and easily drill into the ‘why’ behind the numbers and get meaningful, actionable insights, creating more value from the data you already have! Contact Toustone today to begin your journey to becoming a data-driven company.