With CueLake, you can use SQL to build ELT (Extract, Load, Transform) pipelines on a data lakehouse.

You write Spark SQL statements in Zeppelin notebooks. You then schedule these notebooks using workflows (DAGs).

To extract and load incremental data, you write simple select statements. CueLake executes these statements against your databases and then merges incremental data into your data lakehouse (powered by Apache Iceberg).
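As a rough illustration of the extract-and-merge step, the sketch below builds the two kinds of statement involved: an incremental select and an Iceberg `MERGE INTO` upsert in Spark SQL. The helper names, the table and column names, and the timestamp-based watermark are assumptions made for this example. They are not CueLake's own API, and the text above does not say how CueLake generates its merge statements internally.

```python
from typing import Sequence


def build_incremental_select(table: str, timestamp_column: str, last_loaded: str) -> str:
    """Return a select that picks up only rows changed since the last load.

    Hypothetical helper: values are interpolated directly, so pass trusted input only.
    """
    return f"SELECT * FROM {table} WHERE {timestamp_column} > '{last_loaded}'"


def build_merge_sql(target: str, source: str, key_columns: Sequence[str]) -> str:
    """Return a Spark SQL MERGE INTO statement that upserts source rows into target.

    Rows are matched on key_columns: matches are updated, the rest are inserted.
    """
    if not key_columns:
        raise ValueError("at least one key column is required")
    on_clause = " AND ".join(f"t.{col} = s.{col}" for col in key_columns)
    return (
        f"MERGE INTO {target} t\n"
        f"USING {source} s\n"
        f"ON {on_clause}\n"
        "WHEN MATCHED THEN UPDATE SET *\n"
        "WHEN NOT MATCHED THEN INSERT *"
    )


if __name__ == "__main__":
    # Illustrative names only: a source table "orders" and an Iceberg table "lake.db.orders".
    print(build_incremental_select("orders", "updated_at", "2021-06-01 00:00:00"))
    print(build_merge_sql("lake.db.orders", "orders_increment", ["order_id"]))
```

Matching on a primary key means that re-running the same increment updates existing rows instead of duplicating them.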

To transform data, you write SQL statements to create views and tables in your data lakehouse.

CueLake uses Celery as the executor and celery-beat as the scheduler. Celery jobs trigger Zeppelin notebooks. Zeppelin automatically starts and stops the Spark cluster for each scheduled notebook run.

Features

  • Create DAGs. Group notebooks into workflows and create DAGs of these workflows.
  • Data Security. Your data always stays within your cloud account.
  • Create Views in data lakehouse. CueLake enables you to create views over Iceberg tables.
  • Upsert Incremental data. CueLake uses Iceberg’s MERGE INTO query to automatically merge incremental data.
  • Elastically Scale Cloud Infrastructure. CueLake uses Zeppelin to automatically create and delete the Kubernetes resources required to run data pipelines.
  • Built-in Scheduler. Schedule your pipelines without an external scheduler.
  • Monitoring. Get Slack alerts when a pipeline fails. CueLake maintains detailed logs.
  • Versioning in GitHub. Commit and maintain versions of your Zeppelin notebooks in GitHub.
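
To make the "Create DAGs" feature concrete, here is a minimal sketch of how dependencies between workflows determine a valid run order. It uses Python's standard-library `graphlib` rather than anything from CueLake, and the workflow names are hypothetical. CueLake's actual scheduling runs on Celery, as described above.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical workflows: each key maps to the set of workflows that must finish first.
dependencies = {
    "load_orders": set(),
    "load_customers": set(),
    "build_sales_view": {"load_orders", "load_customers"},
    "refresh_report": {"build_sales_view"},
}


def execution_order(deps):
    """Return one valid run order for a DAG of workflows.

    Raises graphlib.CycleError if the dependencies contain a cycle.
    """
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    try:
        for name in execution_order(dependencies):
            print(name)
    except CycleError as exc:
        print(f"not a DAG: {exc}")
```

The two load workflows have no dependency on each other, so they may run in either order, but both always run before the view that reads from them.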

Categories

ETL, Data Pipeline

Additional Project Details

Registered

2021-06-26