Apache Iceberg
Iceberg is a high-performance format for huge analytic tables. Iceberg brings the reliability and simplicity of SQL tables to big data, while making it possible for engines like Spark, Trino, Flink, Presto, Hive and Impala to safely work with the same tables, at the same time. Iceberg supports flexible SQL commands to merge new data, update existing rows, and perform targeted deletes. Iceberg can eagerly rewrite data files for read performance, or it can use delete deltas for faster updates. Iceberg handles the tedious and error-prone task of producing partition values for rows in a table and skips unnecessary partitions and files automatically. No extra filters are needed for fast queries, and the table layout can be updated as data or queries change.
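The partition handling described above can be pictured with a small toy model. This is a minimal sketch in plain Python, not Iceberg's API: the names `day_transform`, `DataFile`, `write_rows`, `plan_scan`, and `scan`, the `event_ts` column, and the file path layout are all hypothetical and chosen only for illustration. It shows a partition value being derived from each row at write time, and a query on the original timestamp column skipping files without any extra partition filter.

```python
from dataclasses import dataclass
from datetime import date, datetime


def day_transform(ts: datetime) -> date:
    """Derive a partition value from a row's timestamp (a 'day' style transform)."""
    return ts.date()


@dataclass
class DataFile:
    path: str        # hypothetical path layout, for illustration only
    partition: date  # partition value shared by every row in this file
    rows: list


def write_rows(rows: list) -> list:
    """Group rows into one file per derived partition value."""
    by_day = {}
    for row in rows:
        by_day.setdefault(day_transform(row["event_ts"]), []).append(row)
    return [
        DataFile(f"data/event_day={d.isoformat()}/part-0.parquet", d, rs)
        for d, rs in sorted(by_day.items())
    ]


def plan_scan(files: list, start: datetime, end: datetime) -> list:
    """Keep only files whose partition may hold rows with start <= event_ts < end."""
    lo, hi = day_transform(start), day_transform(end)
    return [f for f in files if lo <= f.partition <= hi]


def scan(files: list, start: datetime, end: datetime) -> list:
    """Read the surviving files and apply the original row-level filter."""
    return [
        row
        for f in plan_scan(files, start, end)
        for row in f.rows
        if start <= row["event_ts"] < end
    ]


rows = [
    {"id": 1, "event_ts": datetime(2024, 1, 1, 9, 30)},
    {"id": 2, "event_ts": datetime(2024, 1, 2, 14, 0)},
    {"id": 3, "event_ts": datetime(2024, 1, 2, 23, 59)},
    {"id": 4, "event_ts": datetime(2024, 1, 3, 0, 5)},
]
files = write_rows(rows)
start, end = datetime(2024, 1, 2, 12, 0), datetime(2024, 1, 2, 18, 0)
selected = plan_scan(files, start, end)
print([f.path for f in selected])
print([row["id"] for row in scan(files, start, end)])
```

The caller filters only on `event_ts`. The partition value is computed by the writer and used by the planner, which is the idea behind not needing extra filters for fast queries. The pruning here is deliberately conservative: a file is kept whenever its day could overlap the requested range, and the row-level filter removes any remaining non-matching rows.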
Cloudflare R2
Cloudflare R2 is a global object storage service that allows developers to store large amounts of unstructured data without the costly egress bandwidth fees associated with typical cloud storage services. It supports multiple scenarios, including storage for cloud-native applications, web content, podcast episodes, data lakes, and outputs for large batch processes such as machine learning model artifacts or datasets. R2 offers features like location hints to optimize data access, CORS configuration for interacting with objects, public buckets to expose contents directly to the Internet, and bucket-scoped tokens for granular access control. It integrates with Cloudflare Workers, enabling developers to perform authentication, route requests, and deploy edge functions across a network of over 330 data centers. Additionally, R2 supports Apache Iceberg through its data catalog, transforming object storage into a fully functional data warehouse without management overhead.
Tabular
Tabular is an open table store from the creators of Apache Iceberg. Connect any query engine or framework, including Athena, BigQuery, Redshift, Snowflake, Databricks, Trino, Spark, and Python, and work with multiple “best of breed” compute engines based on their strengths. Smart compaction, clustering, and other automated data services reduce storage costs and query times by up to 50%. Enforcement of role-based access control (RBAC) policies is centralized: privileges can be assigned at the data warehouse, database, table, or column level, and the controls are simple to manage, consistently enforced, and easy to audit. Tabular is easy to use, with high-powered ingestion, performance, and RBAC under the hood.
StarTree
StarTree, powered by Apache Pinot™, is a fully managed real-time analytics platform built for customer-facing applications that demand instant insights on the freshest data. Unlike traditional data warehouses or OLTP databases—optimized for back-office reporting or transactions—StarTree is engineered for real-time OLAP at true scale, meaning:
- Data Volume: query performance sustained at petabyte scale
- Ingest Rates: millions of events per second, continuously indexed for freshness
- Concurrency: thousands to millions of simultaneous users served with sub-second latency
With StarTree, businesses deliver always-fresh insights at interactive speed, enabling applications that personalize, monitor, and act in real time.