DuckDB Integrations

9 Integrations with DuckDB

View a list of DuckDB integrations and software that integrates with DuckDB below. Compare the best DuckDB integrations as well as features, ratings, user reviews, and pricing of software that integrates with DuckDB. Here are the current DuckDB integrations in 2024:

1

Union Cloud

Union.ai

Union.ai is an award-winning, Flyte-based data and ML orchestrator for scalable, reproducible ML pipelines. With Union.ai, you can write your code locally and easily deploy pipelines to remote Kubernetes clusters. “Flyte’s scalability, data lineage, and caching capabilities enable us to train hundreds of models on petabytes of geospatial data, giving us an edge in our business.” — Arno, CTO at Blackshark.ai “With Flyte, we want to give the power back to biologists. We want to stand up something that they can play around with different parameters for their models because not every … parameter is fixed. We want to make sure we are giving them the power to run the analyses.” — Krishna Yeramsetty, Principal Data Scientist at Infinome “Flyte plays a vital role as a key component of Gojek's ML Platform by providing exactly that." — Pradithya Aria Pura, Principal Engineer at Goj

Starting Price: Free (Flyte)

View Software
2

Quary

Quary

Connect to your data warehouse, and SSO authenticates your team in seconds. Organize your business intelligence as SQL. Make changes with confidence, automated testing validates every update. Deploy models confidently and travel back in time when things go wrong. Quary connects to your sensitive data store. In order to protect it, Quary is built from the ground up with security in mind. By default, data does not leave your estate, data exchanges happen between your Quary client and your data store. Transform data together, model, test, and deploy as a team. SSO is not an extra, we include it in our base plan and even help set it up. No one should share credentials. Quary builds upon your data store's access management systems and helps you manage it. We are in the process of implementing SOC2 and have security credentials (CISSP) on the team to ensure your data stays secure.

Starting Price: Free

View Software
3

Flyte

Union.ai

The workflow automation platform for complex, mission-critical data and ML processes at scale. Flyte makes it easy to create concurrent, scalable, and maintainable workflows for machine learning and data processing. Flyte is used in production at Lyft, Spotify, Freenome, and others. At Lyft, Flyte has been serving production model training and data processing for over four years, becoming the de-facto platform for teams like pricing, locations, ETA, mapping, autonomous, and more. In fact, Flyte manages over 10,000 unique workflows at Lyft, totaling over 1,000,000 executions every month, 20 million tasks, and 40 million containers. Flyte has been battle-tested at Lyft, Spotify, Freenome, and others. It is entirely open-source with an Apache 2.0 license under the Linux Foundation with a cross-industry overseeing committee. Configuring machine learning and data workflows can get complex and error-prone with YAML.

Starting Price: Free

View Software
4

LanceDB

LanceDB

LanceDB is a developer-friendly, open source database for AI. From hyperscalable vector search and advanced retrieval for RAG to streaming training data and interactive exploration of large-scale AI datasets, LanceDB is the best foundation for your AI application. Installs in seconds and fits seamlessly into your existing data and AI toolchain. An embedded database (think SQLite or DuckDB) with native object storage integration, LanceDB can be deployed anywhere and easily scales to zero when not in use. From rapid prototyping to hyper-scale production, LanceDB delivers blazing-fast performance for search, analytics, and training for multimodal AI data. Leading AI companies have indexed billions of vectors and petabytes of text, images, and videos, at a fraction of the cost of other vector databases. More than just embedding. Filter, select, and stream training data directly from object storage to keep GPU utilization high.

Starting Price: $16.03 per month

View Software
5

PuppyGraph

PuppyGraph

PuppyGraph empowers you to seamlessly query one or multiple data stores as a unified graph model. Graph databases are expensive, take months to set up, and need a dedicated team. Traditional graph databases can take hours to run multi-hop queries and struggle beyond 100GB of data. A separate graph database complicates your architecture with brittle ETLs and inflates your total cost of ownership (TCO). Connect to any data source anywhere. Cross-cloud and cross-region graph analytics. No complex ETLs or data replication is required. PuppyGraph enables you to query your data as a graph by directly connecting to your data warehouses and lakes. This eliminates the need to build and maintain time-consuming ETL pipelines needed with a traditional graph database setup. No more waiting for data and failed ETL processes. PuppyGraph eradicates graph scalability issues by separating computation and storage.

Starting Price: Free

View Software
6

Databricks Data Intelligence Platform

Databricks

The Databricks Data Intelligence Platform allows your entire organization to use data and AI. It’s built on a lakehouse to provide an open, unified foundation for all data and governance, and is powered by a Data Intelligence Engine that understands the uniqueness of your data. The winners in every industry will be data and AI companies. From ETL to data warehousing to generative AI, Databricks helps you simplify and accelerate your data and AI goals. Databricks combines generative AI with the unification benefits of a lakehouse to power a Data Intelligence Engine that understands the unique semantics of your data. This allows the Databricks Platform to automatically optimize performance and manage infrastructure in ways unique to your business. The Data Intelligence Engine understands your organization’s language, so search and discovery of new data is as easy as asking a question like you would to a coworker.

View Software
7

MotherDuck

MotherDuck

We’re MotherDuck, a software company founded by a passionate flock of experienced data geeks. We’ve worked as leaders for some of the greatest companies in data. Scale-out is expensive and slow, let’s scale up. Big Data is dead, long live easy data. Your laptop is faster than your data warehouse. Why wait for the cloud? DuckDB slaps, so let’s supercharge it. When we founded MotherDuck we recognized that DuckDB might just be the next major game changer thanks to its ease of use, portability, lightning-fast performance, and rapid pace of community-driven innovation. At MotherDuck, we want to help the community, the DuckDB Foundation, and DuckDB Labs build greater awareness and adoption of DuckDB, whether users are working locally or want a serverless always-on way to execute their SQL. We are a world-class team of engineers and leaders with experience working on databases and cloud services at AWS, Databricks, Elastic, Facebook, Firebolt, Google BigQuery, Neo4j, SingleStore, and more.

View Software
8

Kestra

Kestra

Kestra is an open-source, event-driven orchestrator that simplifies data operations and improves collaboration between engineers and business users. By bringing Infrastructure as Code best practices to data pipelines, Kestra allows you to build reliable workflows and manage them with confidence. Thanks to the declarative YAML interface for defining orchestration logic, everyone who benefits from analytics can participate in the data pipeline creation process. The UI automatically adjusts the YAML definition any time you make changes to a workflow from the UI or via an API call. Therefore, the orchestration logic is defined declaratively in code, even if some workflow components are modified in other ways.

View Software
9

SQL

SQL

SQL is a domain-specific programming language used for accessing, managing, and manipulating relational databases and relational database management systems.

View Software