Best Key-Value Databases for Apache Spark

Compare the Top Key-Value Databases that integrate with Apache Spark as of December 2025

This is a list of Key-Value Databases that integrate with Apache Spark. Use the filters on the left to narrow the list to products that have integrations with Apache Spark. View the products that work with Apache Spark in the table below.

What are Key-Value Databases for Apache Spark?

Key-value databases are a type of NoSQL database that store data as pairs, where each unique key is associated with a value. This structure is simple and highly flexible, making key-value databases ideal for scenarios requiring fast access to data, such as caching, session management, and real-time applications. In these databases, the key acts as a unique identifier for retrieving or storing the value, which can be any type of data—strings, numbers, objects, or even binary data. Key-value stores are known for their scalability, performance, and ability to handle high volumes of read and write operations with low latency. These databases are particularly useful for applications that require quick lookups or high availability, such as online retail platforms, social networks, and recommendation systems. Compare and read user reviews of the best Key-Value Databases for Apache Spark currently available using the table below. This list is updated regularly.
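To make the key-value model described above concrete, here is a minimal in-memory sketch in Python. The `KeyValueStore` class and its `put`/`get`/`delete` methods are hypothetical names chosen for illustration. They are not the API of any product listed here, and a real key-value database adds persistence, replication, and distribution on top of this basic idea.

```python
from typing import Any, Dict, Optional


class KeyValueStore:
    """Toy in-memory key-value store (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        # A plain dict gives average O(1) lookups by key.
        self._data: Dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        # Writing to an existing key overwrites the previous value.
        self._data[key] = value

    def get(self, key: str, default: Optional[Any] = None) -> Any:
        # Return the stored value, or `default` if the key is absent.
        return self._data.get(key, default)

    def delete(self, key: str) -> bool:
        # Return True if the key existed and was removed, False otherwise.
        if key in self._data:
            del self._data[key]
            return True
        return False


store = KeyValueStore()
# Values can be any type: here a nested object and raw bytes.
store.put("session:42", {"user": "alice", "cart": ["sku-1", "sku-2"]})
store.put("avatar:alice", b"\x89PNG")

print(store.get("session:42")["user"])
print(store.get("missing"))
```

The key is the only access path in this sketch: there is no query by value, which is the trade-off that lets key-value stores keep lookups simple and fast.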

1

Apache Cassandra

The Apache Software Foundation

The Apache Cassandra database is the right choice when you need scalability and high availability without compromising performance. Linear scalability and proven fault-tolerance on commodity hardware or cloud infrastructure make it the perfect platform for mission-critical data. Cassandra's support for replicating across multiple datacenters is best-in-class, providing lower latency for your users and the peace of mind of knowing that you can survive regional outages.

1 Rating

2

Speedb

Speedb

The next-generation key-value storage engine. Speedb is 100% RocksDB compatible, enhancing stability, efficiency, and overall performance. Join the Hive, Speedb’s open-source community, to interact, improve, and share knowledge and best practices on RocksDB. Speedb is a compatible alternative for LevelDB and RocksDB users who would like to take their application to the next level. When using event streaming platforms like Kafka, Flink, Spark, Splunk, Elastic, or others, consider using Speedb to enhance their performance. The increase in metadata in modern data sets is causing significant performance issues for many applications. With Speedb you can keep costs low and ensure your applications continue to run smoothly even under heavy loads. When it comes to making a choice to upgrade or deploy a new key-value store with your platform, Speedb is up for the challenge. By seamlessly integrating Speedb's advanced key-value storage engine with your projects, you'll experience immediate relief.

Starting Price: Free

3

Apache HBase

The Apache Software Foundation

Use Apache HBase™ when you need random, realtime read/write access to your Big Data. This project's goal is the hosting of very large tables (billions of rows by millions of columns) atop clusters of commodity hardware. Features include automatic failover support between RegionServers, an easy-to-use Java API for client access, and a Thrift gateway and a RESTful web service that support XML, Protobuf, and binary data encoding options. Metrics can be exported via the Hadoop metrics subsystem to files or Ganglia, or via JMX.

4

Google Cloud Bigtable

Google

Google Cloud Bigtable is a fully managed, scalable NoSQL database service for large analytical and operational workloads. Fast and performant: Use Cloud Bigtable as the storage engine that grows with you from your first gigabyte to petabyte-scale for low-latency applications as well as high-throughput data processing and analytics. Seamless scaling and replication: Start with a single node per cluster, and seamlessly scale to hundreds of nodes dynamically supporting peak demand. Replication also adds high availability and workload isolation for live serving apps. Simple and integrated: Fully managed service that integrates easily with big data tools like Hadoop, Dataflow, and Dataproc. Plus, support for the open source HBase API standard makes it easy for development teams to get started.
