Page 2 | Best Data Management Software for Docker

Kedro

Kedro is the foundation for clean data science code. It borrows concepts from software engineering and applies them to machine-learning projects. A Kedro project provides scaffolding for complex data and machine-learning pipelines. You spend less time on tedious "plumbing" and focus instead on solving new problems. Kedro standardizes how data science code is created and ensures teams collaborate to solve problems easily. Make a seamless transition from development to production with exploratory code that you can transition to reproducible, maintainable, and modular experiments. A series of lightweight data connectors is used to save and load data across many different file formats and file systems.

Starting Price: Free

Marqo

Marqo is more than a vector database; it's an end-to-end vector search engine. Vector generation, storage, and retrieval are handled out of the box through a single API. No need to bring your own embeddings. Accelerate your development cycle with Marqo. Index documents and begin searching in just a few lines of code. Create multimodal indexes and search combinations of images and text with ease. Choose from a range of open source models or bring your own. Build complex queries by composing multiple weighted components. Input pre-processing, machine learning inference, and storage are all included out of the box. Run Marqo in a Docker image on your laptop or scale it up to dozens of GPU inference nodes in the cloud. Marqo can be scaled to provide low-latency searches against multi-terabyte indexes. Marqo helps you configure deep-learning models like CLIP to pull semantic meaning from images.
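To make the idea of a query with multiple weighted components concrete, here is a minimal conceptual sketch in plain Python. It is not Marqo's API: the tiny hand-written vectors, the `combine` and `cosine` helpers, and the document names are all assumptions for illustration. It assumes each component already has an embedding and combines them as a weighted sum before ranking by cosine similarity, which is one common way to approach this.

```python
import math

def combine(components):
    """Weighted sum of (vector, weight) pairs; a negative weight pushes results away."""
    dim = len(components[0][0])
    return [sum(w * vec[i] for vec, w in components) for i in range(dim)]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Toy 3-dimensional "embeddings" (invented values, not model output).
docs = {
    "red sneakers": [0.9, 0.1, 0.0],
    "red dress": [0.1, 0.9, 0.0],
    "blue sneakers": [0.8, 0.0, 0.6],
}

# Query: mostly "sneakers", a little "red", and explicitly less "blue".
query = combine([
    ([1.0, 0.0, 0.0], 1.0),   # "sneakers"
    ([0.5, 0.5, 0.0], 0.3),   # "red"
    ([0.0, 0.0, 1.0], -0.5),  # "blue" (negative weight)
])

# Rank documents from most to least similar to the combined query vector.
ranked = sorted(docs, key=lambda name: cosine(query, docs[name]), reverse=True)
print(ranked)
```

In this toy setup the negative weight on the "blue" component lowers the score of the blue item relative to the red one, even though both are sneakers.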

Starting Price: $86.58 per month

STRM

Creating and managing data policies is slow and painful. With PACE by STRM, you can make sure data is used securely. Apply data policies through code, wherever it lives. Say farewell to long waits and costly meetings, and meet your new open source data security engine. Data policies aren't just about controlling access; they are about extracting value from data with the right guardrails. PACE lets you collaborate on the why and when while automating the how through code. With PACE you can programmatically define and apply data policies across platforms. It integrates into your data platform and, optionally, your catalog, leveraging the native capabilities of the stack you already have. PACE enables automated policy application across key data platforms and catalogs to ease your governance processes. Ease the process of policy creation and implementation, centralize control, and decentralize execution. Fulfill auditing obligations by simply showing how controls are implemented.

Starting Price: Free

YDB

Entrust YDB with keeping your application state regardless of how large or frequently modified it is. Handling petabytes of data and millions of transactions per second is not an issue. Build analytical reports based on data you store in YDB with performance comparable to database management systems purpose-built for this use case. No compromises on consistency and availability are necessary. Use the YDB topics feature to reliably send data between your applications or consume a change data capture feed from regular tables. Exactly-once and at-least-once semantics are available to choose from. YDB is designed to work in three availability zones, ensuring availability even if an entire availability zone goes offline. It recovers automatically after a disk, server, or data center failure with minimum latency disruptions for applications.
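The practical difference between the two delivery guarantees can be illustrated with a small, self-contained sketch. It does not use YDB or its topic API: the `(sequence number, payload)` message shape and both helper functions are assumptions for illustration, and it only shows consumer-side deduplication. Real exactly-once processing also depends on cooperation between the broker and the storage layer.

```python
def process_at_least_once(messages, apply):
    """Apply every delivered message, duplicates included."""
    for _, payload in messages:
        apply(payload)

def process_deduplicated(messages, apply):
    """Skip messages whose sequence number was already seen."""
    seen = set()
    for seq, payload in messages:
        if seq in seen:
            continue  # redelivered message: ignore it
        seen.add(seq)
        apply(payload)

# A redelivery after a simulated retry: message 2 arrives twice.
delivered = [(1, 10), (2, 20), (2, 20), (3, 30)]

naive_total = []
process_at_least_once(delivered, naive_total.append)

dedup_total = []
process_deduplicated(delivered, dedup_total.append)

print(sum(naive_total), sum(dedup_total))  # prints: 80 60
```

With at-least-once delivery the duplicate is counted twice, so the consumer must either be idempotent or deduplicate as shown.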

Starting Price: Free

PuppyGraph

PuppyGraph empowers you to seamlessly query one or multiple data stores as a unified graph model. Graph databases are expensive, take months to set up, and need a dedicated team. Traditional graph databases can take hours to run multi-hop queries and struggle beyond 100GB of data. A separate graph database complicates your architecture with brittle ETLs and inflates your total cost of ownership (TCO). Connect to any data source anywhere. Cross-cloud and cross-region graph analytics. No complex ETLs or data replication is required. PuppyGraph enables you to query your data as a graph by directly connecting to your data warehouses and lakes. This eliminates the need to build and maintain time-consuming ETL pipelines needed with a traditional graph database setup. No more waiting for data and failed ETL processes. PuppyGraph eradicates graph scalability issues by separating computation and storage.

Starting Price: Free

Timeplus

Timeplus is a simple, powerful, and cost-efficient stream processing platform. All in a single binary, easily deployed anywhere. We help data teams process streaming and historical data quickly and intuitively, in organizations of all sizes and industries. Lightweight, single binary, without dependencies. End-to-end analytic streaming and historical functionalities. 1/10 the cost of similar open source frameworks. Turn real-time market and transaction data into real-time insights. Leverage append-only streams and key-value streams to monitor financial data. Implement real-time feature pipelines using Timeplus. One platform for all infrastructure logs, metrics, and traces, the three pillars supporting observability. In Timeplus, we support a wide range of data sources in our web console UI. You can also push data via REST API, or create external streams without copying data into Timeplus.

Starting Price: $199 per month

PandaETL

Upload PDFs, spreadsheets, and other documents. No complex setup is required, just drag, drop, and start working. Choose your tasks and let the platform extract the precise data you need. Review and get organized, actionable data in a format you know and trust. Whether it’s contracts, invoices, images, websites, or reports, the platform helps you extract valuable information and organize it efficiently. Explore your files with an intuitive chat interface. Dialogue with your data to uncover insights in PDFs, spreadsheets, and more. Generate detailed reports quickly. Create overviews and summaries with references in minutes. Open the extraction tables, click on each cell, and immediately look at the source, in the context. Download highlighted files in batch. Ideal for businesses looking to enhance efficiency and reduce costs in document-intensive operations. Ensure automation is optimized to specific industries thanks to our plug-and-play modules or request your own customization.

Starting Price: Free

ApertureDB

Build your competitive edge with the power of vector search. Streamline your AI/ML pipeline workflows, reduce infrastructure costs, and stay ahead of the curve with up to 10x faster time-to-market. Break free of data silos with ApertureDB's unified multimodal data management, freeing your AI teams to innovate. Set up and scale complex multimodal data infrastructure for billions of objects across your entire enterprise in days, not months. Unifying multimodal data, advanced vector search, and innovative knowledge graph with a powerful query engine to build AI applications faster at enterprise scale. ApertureDB can enhance the productivity of your AI/ML teams and accelerate returns from AI investment with all your data. Try it for free or schedule a demo to see it in action. Find relevant images based on labels, geolocation, and regions of interest. Prepare large-scale multi-modal medical scans for ML and clinical studies.

Starting Price: $0.33 per hour

Starcounter

Our ACID in-memory technology and application server enable you to build lightning-fast enterprise software. Without custom tooling or new syntax. Starcounter applications let you achieve 50 to 1000 times better performance without adding complexity. Applications are written in regular C#, LINQ, and SQL. Even the ACID transactions are written in regular C# code. Full Visual Studio support including IntelliSense, debugger, and performance profiler. All the things you like, minus the headache. Write regular C# syntax with MVVM pattern to leverage ACID in-memory technology and thin client UI for extreme performance. Starcounter technology adds business value from day one. We leverage technology that’s already developed and in production, processing millions of business transactions for high-demand customers. Starcounter combines ACID in-memory database and application server into a single platform unmatched in performance, simplicity, and price.

Starting Price: Free

Stackable

The Stackable data platform was designed with openness and flexibility in mind. It provides you with a curated selection of the best open source data apps like Apache Kafka, Apache Druid, Trino, and Apache Spark. While other current offerings either push their proprietary solutions or deepen vendor lock-in, Stackable takes a different approach. All data apps work together seamlessly and can be added or removed in no time. Based on Kubernetes, it runs everywhere, on-prem or in the cloud. stackablectl and a Kubernetes cluster are all you need to run your first stackable data platform. Within minutes, you will be ready to start working with your data. Configure your one-line startup command right here. Similar to kubectl, stackablectl is designed to easily interface with the Stackable Data Platform. Use the command line utility to deploy and manage stackable data apps on Kubernetes. With stackablectl, you can create, delete, and update components.

Starting Price: Free

HarperDB

HarperDB is a distributed systems platform that combines database, caching, application, and streaming functions into a single technology. With it, you can start delivering global-scale back-end services with less effort, higher performance, and lower cost than ever before. Deploy user-programmed applications and pre-built add-ons on top of the data they depend on for a high throughput, ultra-low latency back end. Lightning-fast distributed database delivers orders of magnitude more throughput per second than popular NoSQL alternatives while providing limitless horizontal scale. Native real-time pub/sub communication and data processing via MQTT, WebSocket, and HTTP interfaces. HarperDB delivers powerful data-in-motion capabilities without layering in additional services like Kafka. Focus on features that move your business forward, not fighting complex infrastructure. You can't change the speed of light, but you can put less light between your users and their data.

Starting Price: Free

GlassFlow

GlassFlow is a serverless, event-driven data pipeline platform designed for Python developers. It enables users to build real-time data pipelines without the need for complex infrastructure like Kafka or Flink. By writing Python functions, developers can define data transformations, and GlassFlow manages the underlying infrastructure, offering auto-scaling, low latency, and optimal data retention. The platform supports integration with various data sources and destinations, including Google Pub/Sub, AWS Kinesis, and OpenAI, through its Python SDK and managed connectors. GlassFlow provides a low-code interface for quick pipeline setup, allowing users to create and deploy pipelines within minutes. It also offers features such as serverless function execution, real-time API connections, and alerting and reprocessing capabilities. The platform is designed to simplify the creation and management of event-driven data pipelines, making it accessible for Python developers.
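The idea of expressing a transformation as a plain Python function can be sketched as follows. This is a local illustration only: the function name `transform`, the event fields, and the `run_pipeline` driver are hypothetical and do not reflect GlassFlow's SDK, which would supply the events and the execution environment.

```python
from typing import Iterable, Iterator, Optional

def transform(event: dict) -> Optional[dict]:
    """Enrich one event; return None to drop it from the stream."""
    if event.get("amount") is None:
        return None  # filter out malformed events
    enriched = dict(event)  # copy so the input event is not mutated
    enriched["is_large"] = event["amount"] >= 1000
    return enriched

def run_pipeline(events: Iterable[dict]) -> Iterator[dict]:
    """Stand-in for the platform: apply the transform to each incoming event."""
    for event in events:
        result = transform(event)
        if result is not None:
            yield result

# Invented sample events for the sketch.
incoming = [
    {"id": 1, "amount": 250},
    {"id": 2},                  # missing amount, will be dropped
    {"id": 3, "amount": 4200},
]

processed = list(run_pipeline(incoming))
print(processed)
```

Keeping the transformation a pure function of a single event makes it easy to test locally before handing it to any managed runtime.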

Starting Price: $350 per month

txtai

NeuML

txtai is an all-in-one open source embeddings database designed for semantic search, large language model orchestration, and language model workflows. It unifies vector indexes (both sparse and dense), graph networks, and relational databases, providing a robust foundation for vector search and serving as a powerful knowledge source for LLM applications. With txtai, users can build autonomous agents, implement retrieval augmented generation processes, and develop multi-modal workflows. Key features include vector search with SQL support, object storage integration, topic modeling, graph analysis, and multimodal indexing capabilities. It supports the creation of embeddings for various data types, including text, documents, audio, images, and video. Additionally, txtai offers pipelines powered by language models that handle tasks such as LLM prompting, question-answering, labeling, transcription, translation, and summarization.

Starting Price: Free

Lightstreamer

Lightstreamer is an event broker optimized for the internet, ensuring seamless real-time data delivery across the web. Unlike traditional brokers, Lightstreamer automatically handles proxies, firewalls, disconnections, network congestion, and the general unpredictability of the internet. With its intelligent streaming feature, Lightstreamer guarantees real-time data transmission, always finding a way to deliver your data reliably and efficiently, ensuring robust last-mile messaging. Lightstreamer offers technology that is both mature and cutting-edge, continuously evolving to stay at the forefront of innovation. With a proven track record and years of field-tested performance, Lightstreamer ensures your data is delivered reliably and efficiently. Experience unparalleled reliability in any scenario with Lightstreamer.

Starting Price: Free

Enov8

End-to-end “Business Intelligence” for your IT organization. Promoting transparency, control, and productivity across environments, releases, and data. Promote scaled agility across your IT fabric. A complete environment and release picture supporting collaboration across teams and providing the insight that organizations require today to drive competitive innovation. Improve visibility of your complex IT fabric, allowing better collaboration and decision making. Manage complex computer systems and the end-to-end IT fabric through a centralized portal. Measure test environment usage to reduce IT spend and increase project productivity. Eliminate chaotic and non-repeatable operations by establishing control via centralized runbooks and automating regular and time-consuming tasks. Manage change and contention effectively whilst providing real-time health status and powerful analytics to determine business impact.

Starting Price: $8 per month

AllegroGraph

Franz Inc.

AllegroGraph is a breakthrough solution that allows infinite data integration through a patented approach unifying all data and siloed knowledge into an Entity-Event Knowledge Graph solution that can support massive big data analytics. AllegroGraph utilizes unique federated sharding capabilities that drive 360-degree insights and enable complex reasoning across a distributed Knowledge Graph. AllegroGraph provides users with an integrated version of Gruff, a unique browser-based graph visualization software tool for exploring and discovering connections within enterprise Knowledge Graphs. Franz’s Knowledge Graph Solution includes both technology and services for building industrial strength Entity-Event Knowledge Graphs based on best-of-class tools, products, knowledge, skills and experience.

Fluentd

Fluentd Project

A single, unified logging layer is key to making log data accessible and usable. However, existing tools fall short: legacy tools were not built with new cloud APIs and microservice-oriented architectures in mind and are not innovating quickly enough. Fluentd, created by Treasure Data, solves the challenges of building a unified logging layer with a modular architecture, an extensible plugin model, and a performance-optimized engine. In addition to these features, Fluentd Enterprise addresses enterprise requirements such as trusted packaging, security, certified enterprise connectors, management and monitoring, enterprise SLA-based support, assurance, and enterprise consulting services.

Greenplum

Greenplum Database

Greenplum Database® is an advanced, fully featured, open source data warehouse. It provides powerful and rapid analytics on petabyte-scale data volumes. Uniquely geared toward big data analytics, Greenplum Database is powered by the world’s most advanced cost-based query optimizer, delivering high analytical query performance on large data volumes. The Greenplum Database® project is released under the Apache 2 license. We want to thank all our current community contributors and are interested in all new potential contributions. For the Greenplum Database community, no contribution is too small; we encourage all types of contributions. An open-source massively parallel data platform for analytics, machine learning and AI. Rapidly create and deploy models for complex applications in cybersecurity, predictive maintenance, risk management, fraud detection, and many other areas. Experience the fully featured, integrated, open source analytics platform.

Trustgrid

Trustgrid is the SD-WAN for application providers. The Trustgrid platform uniquely addresses the needs of SaaS application providers who rely on remote systems. By combining an SD-WAN 2.0, edge computing, and zero trust remote access into a single platform we allow software providers to manage and support distributed application environments from the cloud to the edge. With the Trustgrid platform you can:
• Build cloud to on-premise networks at scale
• Manage and support 100s of networks from a single pane of glass
• Control on-premise apps and appliances as if they were in the cloud
• Run and support Docker containers in any cloud or on-premise
• Provide your support teams secure access to edge application environments
Simplify connectivity, enhance security, and guarantee network availability with Trustgrid.

Actian Zen

Actian

Actian Zen is an embedded, high-performance, and low-maintenance database management system designed for edge applications, mobile devices, and IoT environments. It offers a seamless integration of SQL and NoSQL data models, providing flexibility for developers working with structured and unstructured data. Actian Zen is known for its small footprint, scalability, and high reliability, making it ideal for resource-constrained environments where consistent performance and minimal administrative overhead are essential. With built-in security features and a self-tuning architecture, it supports real-time data processing and analytics without the need for constant monitoring or maintenance. Actian Zen is widely used in industries like healthcare, retail, and manufacturing, where edge computing and distributed data environments are critical for business operations.

SQL Server Data Tools (SSDT)

Microsoft

SQL Server Data Tools (SSDT) transforms database development by introducing a ubiquitous, declarative model that spans all the phases of database development inside Visual Studio. You can use SSDT Transact-SQL design capabilities to build, debug, maintain, and refactor databases. You can work with a database project, or directly with a connected database instance on or off premises. Developers can use familiar Visual Studio tools for database development, such as code navigation, IntelliSense, language support that parallels what is available for C# and Visual Basic, platform-specific validation, debugging, and declarative editing in the Transact-SQL editor. SSDT also provides a visual Table Designer for creating and editing tables in either database projects or connected database instances. While you are working on your database projects in a team-based environment, you can use version control for all the files.

IRI Voracity

IRI, The CoSort Company

Voracity is the only high-performance, all-in-one data management platform accelerating AND consolidating the key activities of data discovery, integration, migration, governance, and analytics. Voracity helps you control your data in every stage of the lifecycle, and extract maximum value from it. Only in Voracity can you:
1) CLASSIFY, profile and diagram enterprise data sources
2) Speed or LEAVE legacy sort and ETL tools
3) MIGRATE data to modernize and WRANGLE data to analyze
4) FIND PII everywhere and consistently MASK it for referential integrity
5) Score re-ID risk and ANONYMIZE quasi-identifiers
6) Create and manage DB subsets or intelligently synthesize TEST data
7) Package, protect and provision BIG data
8) Validate, scrub, enrich and unify data to improve its QUALITY
9) Manage metadata and MASTER data
Use Voracity to comply with data privacy laws, de-muck and govern the data lake, improve the reliability of your analytics, and create safe, smart test data.

Oracle Big Data SQL Cloud Service

Oracle

Oracle Big Data SQL Cloud Service enables organizations to immediately analyze data across Apache Hadoop, NoSQL and Oracle Database, leveraging their existing SQL skills, security policies and applications with extreme performance. From simplifying data science efforts to unlocking data lakes, Big Data SQL makes the benefits of Big Data available to the largest group of end users possible. Big Data SQL gives users a single location to catalog and secure data in Hadoop, NoSQL systems, and Oracle Database. It offers seamless metadata integration and queries that join data from Oracle Database with data from Hadoop and NoSQL databases. Utilities and conversion routines support automatic mappings from metadata stored in HCatalog (or the Hive Metastore) to Oracle Tables. Enhanced access parameters give administrators the flexibility to control column mapping and data access behavior. Multiple cluster support enables one Oracle Database to query multiple Hadoop clusters and/or NoSQL systems.

IBM Cloud Content Delivery Network

IBM

Your users expect fast load times for your web apps. But content delivery can be slow and inconsistent. IBM® Content Delivery Network provides content caching on the Akamai network, so content is delivered at record speed. Serve non-cacheable dynamic content with optimized performance, and automatically scale your service with pay-as-you-go pricing. IBM Cloud® has partnered with Akamai to offer the most comprehensive features, while ensuring affordability. This partnership combines Akamai's presence, nearly 1,700 networks in 136 countries, with IBM's global cloud footprint of 60-plus data centers in 19 countries, to bring the content closest to where you need it most, your users. Host and serve website assets, images, videos and documents, and user-generated content in cloud object storage. Ensure faster, more secure delivery to users worldwide. Deliver content at record speed to meet and exceed customer expectations.

Starting Price: $0.025 per GB per month

Supabase

Create a backend in less than 2 minutes. Start your project with a Postgres database, authentication, instant APIs, real-time subscriptions and storage. Build faster and focus on your products. Every project is a full Postgres database, the world's most trusted relational database. Add user sign-ups and logins, securing your data with Row Level Security. Store, organize and serve large files. Any media, including videos and images. Write custom code and cron jobs without deploying or scaling servers. There are many example apps and starter projects to get going. We introspect your database to provide APIs instantly. Stop building repetitive CRUD endpoints and focus on your product. Type definitions built directly from your database schema. Use Supabase in the browser without a build process. Develop locally and push to production when you're ready. Manage Supabase projects from your local machine.

Starting Price: $25 per month

jethro

Data-driven decision-making has unleashed a surge of business data and a rise in user demand to analyze it. This trend drives IT departments to migrate off expensive Enterprise Data Warehouses (EDW) toward cost-effective Big Data platforms like Hadoop or AWS. These new platforms come with a Total Cost of Ownership (TCO) that is about 10 times lower. They are not ideal for interactive BI applications, however, as they fail to match the high performance and user concurrency of legacy EDWs. For this exact reason, we developed Jethro. Customers use Jethro for interactive BI on Big Data. Jethro is a transparent middle tier that requires no changes to existing apps or data. It is self-driving with no maintenance required. Jethro is compatible with BI tools like Tableau, Qlik, and Microstrategy and is data source agnostic. Jethro delivers on the demands of business users allowing for thousands of concurrent users to run complicated queries over billions of records.

Nextflow

Seqera Labs

Data-driven computational pipelines. Nextflow enables scalable and reproducible scientific workflows using software containers. It allows the adaptation of pipelines written in the most common scripting languages. Its fluent DSL simplifies the implementation and deployment of complex parallel and reactive workflows on clouds and clusters. Nextflow is built around the idea that Linux is the lingua franca of data science. Nextflow allows you to write a computational pipeline by making it simpler to put together many different tasks. You may reuse your existing scripts and tools and you don't need to learn a new language or API to start using it. Nextflow supports Docker and Singularity containers technology. This, along with the integration of the GitHub code-sharing platform, allows you to write self-contained pipelines, manage versions, and rapidly reproduce any former configuration. Nextflow provides an abstraction layer between your pipeline's logic and the execution layer.

Starting Price: Free

MLReef

MLReef enables domain experts and data scientists to securely collaborate via a hybrid of pro-code and no-code development approaches. 75% increase in productivity due to distributed workloads. This enables teams to complete more ML projects faster. Domain experts and data scientists collaborate on the same platform, eliminating unnecessary communication ping-pong. MLReef works on your premises and uniquely enables 100% reproducibility and continuity. Rebuild all work at any time. You can use already well-known and established git repositories to create explorable, interoperable, and versioned AI modules. AI Modules created by your data scientists become drag-and-drop elements. These are adjustable by parameters, versioned, interoperable, and explorable within your entire organization. Data handling often requires expert knowledge that a single data scientist lacks. MLReef enables your field experts to take on data processing tasks, reducing complexity.

DataOps.live

DataOps.live, the Data Products company, delivers productivity and governance breakthroughs for data developers and teams through environment automation, pipeline orchestration, continuous testing and unified observability. We bring agile DevOps automation and a powerful unified cloud Developer Experience (DX) to modern cloud data platforms like Snowflake. DataOps.live, a global cloud-native company, is used by Global 2000 enterprises including Roche Diagnostics and OneWeb to deliver 1000s of Data Product releases per month with the speed and governance the business demands.

JetBrains DataSpell

JetBrains

Switch between command and editor modes with a single keystroke. Navigate over cells with arrow keys. Use all of the standard Jupyter shortcuts. Enjoy fully interactive outputs – right under the cell. When editing code cells, enjoy smart code completion, on-the-fly error checking and quick-fixes, easy navigation, and much more. Work with local Jupyter notebooks or connect easily to remote Jupyter, JupyterHub, or JupyterLab servers right from the IDE. Run Python scripts or arbitrary expressions interactively in a Python Console. See the outputs and the state of variables in real-time. Split Python scripts into code cells with the #%% separator and run them individually as you would in a Jupyter notebook. Browse DataFrames and visualizations right in place via interactive controls. All popular Python scientific libraries are supported, including Plotly, Bokeh, Altair, ipywidgets, and others.
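As a rough illustration of the cell convention mentioned above, the sketch below is an ordinary Python script split into three cells with `#%%` separator lines. The sample data and the function name `mean` are made up for the example. Because the separators are plain comments, the file also runs unchanged as a normal script.

```python
#%% Cell 1: set up some sample data (values are invented for illustration)
readings = [12.5, 14.0, 13.2, 15.8]

#%% Cell 2: define a helper that later cells can reuse
def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)

#%% Cell 3: compute and display a result
average = mean(readings)
print(f"average reading: {average:.2f}")
```

Each cell can be re-run on its own while the variables defined by earlier cells stay available, which is the same workflow a notebook offers.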

Starting Price: $229

Compare the Top Data Management Software that integrates with Docker as of December 2025 - Page 2
