DataHub
We help organizations of all sizes design, develop and scale solutions to manage their data and unleash its potential. DataHub offers thousands of free datasets, plus a Premium Data Service for additional or customised data with guaranteed updates. DataHub provides important, commonly used data as high-quality, easy-to-use open data packages. Securely share and elegantly put data online with quality checks, versioning, data APIs, notifications and integrations. Combining power and simplicity, DataHub is the fastest way for individuals, teams and organizations to publish, deploy and share structured data. Automate your data processes with our open source framework. Store, share and showcase your data with the world or keep it private. Completely open source, with professional maintenance and support. An end-to-end solution with all parts fully integrated. Not just tools, but a standardized approach and pattern for working with your data.
StarTree
StarTree, powered by Apache Pinot™, is a fully managed real-time analytics platform built for customer-facing applications that demand instant insights on the freshest data. Unlike traditional data warehouses or OLTP databases—optimized for back-office reporting or transactions—StarTree is engineered for real-time OLAP at true scale, meaning:
- Data Volume: query performance sustained at petabyte scale
- Ingest Rates: millions of events per second, continuously indexed for freshness
- Concurrency: thousands to millions of simultaneous users served with sub-second latency
With StarTree, businesses deliver always-fresh insights at interactive speed, enabling applications that personalize, monitor, and act in real time.
IRI FieldShield
IRI FieldShield® is powerful and affordable data discovery and masking software for PII in structured and semi-structured sources, big and small. Use FieldShield utilities in Eclipse to profile, search and mask data at rest (static data masking), and the FieldShield SDK to mask (or unmask) data in motion (dynamic data masking).
Classify PII centrally, find it globally, and mask it consistently. Preserve realism and referential integrity via encryption, pseudonymization, redaction, and other rules for production and test environments.
Delete, deliver, or anonymize data subject to DPA, FERPA, GDPR, GLBA, HIPAA, PCI, POPI, SOX, etc. Verify compliance via human- and machine-readable search reports, job audit logs, and re-identification risk scores.
Optionally mask data as you map it. Apply FieldShield functions in IRI Voracity ETL, federation, migration, replication, subsetting, or analytic jobs. Or, run FieldShield from Actifio, Commvault or Windocks to mask DB clones.
Cloudera Data Warehouse
Cloudera Data Warehouse is a cloud-native, self-service analytics solution that lets IT rapidly deliver query capabilities to BI analysts, enabling users to go from zero to query in minutes. It supports all data types (structured, semi-structured, unstructured, real-time, and batch) and scales cost-effectively from gigabytes to petabytes. It is fully integrated with streaming, data engineering, and AI services, and enforces a unified security, governance, and metadata framework across private, public, or hybrid cloud deployments. Each virtual warehouse (data warehouse or mart) is isolated and automatically configured and optimized, ensuring that workloads do not interfere with each other. Cloudera leverages open source engines such as Hive, Impala, Kudu, and Druid, along with tools like Hue, to handle diverse analytics, from dashboards and operational analytics to research and discovery over vast event or time-series data.