IBM Cloud SQL Query
Serverless, interactive querying for analyzing data in IBM Cloud Object Storage. Query your data directly where it is stored: there is no ETL, no databases, and no infrastructure to manage. IBM Cloud SQL Query uses Apache Spark, an open-source, fast, extensible, in-memory data processing engine optimized for low latency and ad hoc analysis of data. No schema definition is needed to enable SQL queries. Analyze data where it sits in IBM Cloud Object Storage using our query editor and REST API. Run as many queries as you need; with pay-per-query pricing, you pay only for the data scanned. Compress or partition data to drive savings and performance. IBM Cloud SQL Query is highly available and executes queries using compute resources across multiple facilities. It supports a variety of data formats such as CSV, JSON, and Parquet, and allows for standard ANSI SQL.
Amazon Redshift
More customers pick Amazon Redshift than any other cloud data warehouse. Redshift powers analytical workloads for Fortune 500 companies, startups, and everything in between. Companies like Lyft have grown with Redshift from startups to multi-billion dollar enterprises. No other data warehouse makes it as easy to gain new insights from all your data. With Redshift you can query petabytes of structured and semi-structured data across your data warehouse, operational database, and data lake using standard SQL. Redshift lets you easily save the results of your queries back to your S3 data lake in open formats like Apache Parquet, so you can analyze them further with other analytics services like Amazon EMR, Amazon Athena, and Amazon SageMaker. Redshift is the world’s fastest cloud data warehouse and gets faster every year. For performance-intensive workloads you can use the new RA3 instances to get up to 3x the performance of any cloud data warehouse.
OpenObserve
OpenObserve is an open source observability platform for logs, metrics, and traces that emphasizes high performance, scalability, and dramatically lower cost. It supports petabyte-scale observability thanks to features like data compression using columnar storage and the ability to use “bring your own bucket” storage (local disk, S3, GCS, Azure Blob, etc.). It is written in Rust, uses the DataFusion query engine to directly query Parquet files, and provides a stateless, horizontally scalable architecture with caching (both result and disk) to maintain speed under heavy load. It embraces open standards (OpenTelemetry compatibility, vendor-neutral APIs), so it fits into existing monitoring/logging workflows. Key modules include logs, metrics, traces, frontend monitoring, pipelines, alerts, and dashboards/visualizations.
SDF
SDF is a developer platform for data that enhances SQL comprehension across organizations, enabling data teams to unlock the full potential of their data. It provides a transformation layer to streamline query writing and management, an analytical database engine for local execution, and an accelerator for improved transformation processes. SDF also offers proactive quality and governance features, including reports, contracts, and impact analysis, to ensure data integrity and compliance. By representing business logic as code, SDF facilitates the classification and management of data types, enhancing the clarity and maintainability of data models. It integrates seamlessly with existing data workflows, supporting various SQL dialects and cloud environments, and is designed to scale with the growing needs of data teams. SDF's open-core architecture, built on Apache DataFusion, allows for customization and extension, fostering a collaborative ecosystem for data development.