Showing 29 open source projects for "lake"

  • 1
    Deep Lake

    Data Lake for Deep Learning. Build, manage, and query datasets

    Deep Lake (or Deeplake, formerly known as Activeloop Hub) is a data lake for deep learning applications. Our open-source dataset format is optimized for rapid streaming and querying of data while training models at scale, and it includes a simple API for creating, storing, and collaborating on AI datasets of any size. It can be deployed locally or in the cloud, and it enables you to store all of your data in one place, ranging from simple annotations to large videos.
    Downloads: 0 This Week
  • 2
    AWS SDK for pandas

    Easy integration with Athena, Glue, Redshift, Timestream, Neptune

    ...With a few lines of code, you can read from and write to Amazon S3 in Parquet/CSV/JSON/ORC, register tables in the AWS Glue Data Catalog, and query with Amazon Athena directly into pandas. The library abstracts efficient patterns like partitioning, compression, and vectorized I/O so you get performant data lake operations without hand-rolling boilerplate. It also supports Redshift, OpenSearch, and other services, enabling ETL tasks that blend SQL engines and Python transformations. Operational helpers handle IAM, sessions, and concurrency while exposing knobs for encryption, versioning, and catalog consistency. The result is a productive workflow that keeps your analytics in Python while leveraging AWS-native storage and query engines at scale.
    Downloads: 3 This Week
  • 3
    Sanity

    Rapidly configure content workspaces powered by structured content

    ...Instead of using predefined content templates, Sanity allows developers to define schemas in code that determine how content is structured and stored. The platform stores data in a real-time backend called the Content Lake, enabling collaborative editing and instant updates across connected applications. Because the system separates content management from presentation, developers can use any front-end framework to display the data. Sanity also includes APIs and query tools that allow developers to retrieve content dynamically and integrate it into websites, mobile apps, and other digital services.
    Downloads: 1 This Week
  • 4
    lakeFS

    lakeFS - Git-like capabilities for your object storage

    ...It enables zero-copy Dev / Test isolated environments, continuous quality validation, atomic rollback on bad data, reproducibility, and more. Data is dynamic and changes over time; dealing with that without a data version control system is error-prone and labor-intensive. With lakeFS, your data lake is version controlled and you can easily time-travel between consistent snapshots of the lake. Easier ETL testing: test your ETLs on top of production data, in isolation, without copying anything. Safely experiment and test on full production data. Easily collaborate on production data with your team. Automate data quality checks within data pipelines.
    Downloads: 2 This Week
  • 5
    SeaweedFS

    Distributed storage system for blobs, objects, files, and data lake

    SeaweedFS is a distributed storage system for blobs, objects, files, and data lake, to store and serve billions of files fast! Blob store has O(1) disk seek, local tiering, cloud tiering. Filer supports cross-cluster active-active replication, Kubernetes, POSIX, S3 API, encryption, Erasure Coding for warm storage, FUSE mount, Hadoop, WebDAV. SeaweedFS is an independent Apache-licensed open source project with its ongoing development made possible because of the community.
    Downloads: 2 This Week
  • 6
    Apache Hudi

    Upserts, Deletes And Incremental Processing on Big Data

    Apache Hudi (pronounced Hoodie) stands for Hadoop Upserts Deletes and Incrementals. Hudi manages the storage of large analytical datasets on DFS (Cloud stores, HDFS or any Hadoop FileSystem compatible storage). Apache Hudi is a transactional data lake platform that brings database and data warehouse capabilities to the data lake. Hudi reimagines slow old-school batch data processing with a powerful new incremental processing framework for low latency minute-level analytics. Hudi provides efficient upserts, by mapping a given hoodie key (record key + partition path) consistently to a file id, via an indexing mechanism. ...
    Downloads: 0 This Week
  • 7
    pg_analytics

    DuckDB-powered analytics for Postgres

    pg_analytics (formerly named pg_lakehouse) puts DuckDB inside Postgres. With pg_analytics installed, Postgres can query foreign object stores like AWS S3 and table formats like Iceberg or Delta Lake. Queries are pushed down to DuckDB, a high-performance analytical query engine. By transforming Postgres into a performant search and analytics engine, ParadeDB frees your team from the pain of scaling and syncing Elasticsearch.
    Downloads: 0 This Week
  • 8
    LakeSoul

    An end-to-end, realtime and cloud native Lakehouse framework

    ...Built on top of Apache Spark and leveraging Apache Arrow and Parquet, LakeSoul provides ACID transactions, schema evolution, and time travel. It is designed for large-scale data lake architectures that require consistency, efficiency, and easy integration with modern data stacks.
    Downloads: 0 This Week
  • 9
    DataEase

    Data visualization analysis tool

    ...Supports rich chart types (Apache ECharts / AntV) and drag-and-drop dashboard creation. Supports direct connection mode and local mode (based on Apache Doris / Kettle). Supports various data sources such as data warehouses and data lakes, OLAP databases, OLTP databases, Excel data files, and APIs. Open source and open: zero threshold, quick access and installation online; quick access to user feedback, new versions released monthly. Supports multiple data sharing methods to ensure data security.
    Downloads: 7 This Week
  • 10
    Baritone

    Google maps for block game

    A Minecraft pathfinder bot. Baritone is the pathfinding system used in Impact since 4.4. How to immediately get started: Type #goto 1000 500 in chat to go to x=1000 z=500. Type #mine diamond_ore to mine diamond ore. Type #stop to stop. For more, read the usage page and/or watch this tutorial playlist. For other versions of Minecraft or more complicated situations or for development, see Installation & setup. Also consider just installing Impact, which comes with Baritone and is easier to...
    Downloads: 76 This Week
  • 11
    Brim

    Application to efficiently search and analyze super-structured data

    ...Zed is a system that makes data easier by utilizing our new super-structured data model. Brim is a desktop app to explore, query, and shape the data in your super-structured data lake. Brim is an open source desktop application for security and network specialists. Brim makes it easy to search and analyze data from packet captures, like those created by Wireshark, and structured logs, especially from the Zeek network analysis framework. Brim is especially useful to security and network operators that need to handle large packet captures, especially those that are cumbersome for Wireshark, tshark, or other packet analyzers. ...
    Downloads: 10 This Week
  • 12
    Foxglove Studio

    Robotics visualization and debugging

    ...Visualize images and point clouds, overlay bounding boxes, add classification labels and planned movements, and drill down into your data with plots or raw message views. Upload recordings to your private data lake for easy storage, searching, and analysis. Stream recorded data directly into Foxglove Studio to get insights into your robots' behavior. We're long-time fans and beneficiaries of open source software. Join our community on Github and Slack to contribute bug reports, feature requests, or pull requests.
    Downloads: 5 This Week
  • 13
    Rill

    Fast SQL-based BI tool for real-time dashboards and analytics

    Rill is an operational BI tool that turns raw datasets into fast, interactive dashboards using SQL and a code-first approach. It helps data teams move from data lake to insight quickly, without the complexity of traditional BI systems. With an embedded in-memory database powered by DuckDB or ClickHouse, queries run in milliseconds, enabling real-time exploration and analysis. Rill supports local and remote data sources such as CSV, Parquet, S3, and GCS, making it flexible across environments. ...
    Downloads: 4 This Week
  • 14
    Zingg

    Scalable master data management and identity resolution

    ...The project is designed for data engineering and analytics teams working on customer 360, supplier 360, deduplication, fuzzy matching, data quality, and golden record workflows. Zingg runs on Apache Spark and can scale to large data lake, warehouse, and cloud platform environments. It supports configuration-driven pipelines where users define input data, match fields, training data, models, and output destinations. Its main value is helping organizations unify fragmented records into reliable entity clusters while keeping the process trainable, explainable, and repeatable.
    Downloads: 0 This Week
  • 15
    Spice.ai OSS

    A self-hostable CDN for databases

    Spice is a portable runtime offering developers a unified SQL interface to materialize, accelerate, and query data from any database, data warehouse, or data lake. Spice connects, fuses, and delivers data to applications, machine-learning models, and AI backends, functioning as an application-specific, tier-optimized Database CDN. The Spice runtime, written in Rust, is built with industry-leading technologies such as Apache DataFusion, Apache Arrow, Apache Arrow Flight, SQLite, and DuckDB. ...
    Downloads: 0 This Week
  • 16
    Apache Impala

    Apache Impala

    Impala provides low latency and high concurrency for BI/analytic queries on the Hadoop ecosystem, including Iceberg, open data formats, and most cloud storage options. Impala also scales linearly, even in multitenant environments. Impala is integrated with native Hadoop security and Kerberos for authentication, and via the Ranger module, you can ensure that the right users and applications are authorized for the right data. Utilize the same file and data formats and metadata, security, and...
    Downloads: 0 This Week
  • 17
    gravitino

    Unified metadata lake for data & AI assets.

    Apache Gravitino is a high-performance, geo-distributed, and federated metadata lake. It manages metadata directly in different sources, types, and regions, providing users with unified metadata access for data and AI assets.
    Downloads: 1 This Week
  • 18
    PI-Based Image Encoder / Converter

    Python code able to convert / compress image to PI (3.14, π) Indexes

    ...The ZIP also includes a 16 MB file with 16.7 million digits of PI.
    Benchmark (single-thread), hardware & environment:
      Apple Silicon: Apple M2 (Mac mini/MacBook)
      x86_64 platform: Intel Core Ultra 5 225F (Arrow Lake, 10 cores)
      OS 1: Fedora 43 (GNOME)
      OS 2: Windows 11 Pro (23H2/24H2)
      Software: Python 3.14.3 + Numba JIT (latest)
    Results (lower is better), listed as platform / OS, CPU, time in seconds:
      macOS (native), Apple M2: 52.151311 s (in default setup)
      Fedora Linux, Intel Core Ultra 5 225F: 58.536457 s (in default Power Management: Balanced)
      Windows 11, Intel Core Ultra 5 225F: 59.681427 s (important! ...
    Downloads: 5 This Week
  • 19
    Downloads: 0 This Week
  • 20
    BitSail

    BitSail is a distributed high-performance data integration engine

    BitSail is ByteDance's open source data integration engine which is based on distributed architecture and provides high performance. It supports data synchronization between multiple heterogeneous data sources, and provides global data integration solutions in batch, streaming, and incremental scenarios. At present, it serves almost all business lines in ByteDance, such as Douyin, Toutiao, etc., and synchronizes hundreds of trillions of data every day. BitSail has been widely used and...
    Downloads: 0 This Week
  • 21
    Arduino Primary Avionics Module (A-PAM)

    A foundational avionics system for model rockets

    If you do a search for model rocket electronic payload, you will see a lot of entries for medium to high power, large diameter rockets. These types of rockets are large, typically 80mm (3.1 inches) in diameter or larger. These rockets are flown in large open areas, often deserts and dry lake beds. These larger rockets also use larger propellants, typically in the High Power Rocketry (HPR) range of “H” and above impulse level. Not only do these motors require you to be certified in HPR, the motors themselves tend to be rather expensive. My goal was to develop a system that could be used by high school students. It would involve proven rocket designs that could be flown on school yards. ...
    Downloads: 0 This Week
  • 22
    Music Lake PC

    Electron Music Lake PC

    Electronic cross-platform music player; can search Netease Cloud, QQ Music, Xiami Music; support QQ, Weibo, Github login, cloud playlist; support one-click import of music platform playlist. Song Api covers NetEase Cloud, QQ Music, Xiami. The interface imitates QQ music. Mac > Windows > Linux will gradually adapt. For Android client, see caiyonglong/MusicLake. The process of login, collection, and playback is basically no problem, and it can be used as a daily work program to listen to songs...
    Downloads: 0 This Week
  • 23
    CloverDX

    Design, automate, operate and publish data pipelines at scale

    Please visit www.cloverdx.com for the latest product versions. Data integration platform; can be used to transform/map/manipulate data in batch and near-realtime modes. Supports various input/output formats (CSV, FIXLEN, Excel, XML, JSON, Parquet, Avro, EDI/X12, HL7, COBOL, LOTUS, etc.). Connects to RDBMS/JMS/Kafka/SOAP/REST/LDAP/S3/HTTP/FTP/ZIP/TAR. CloverDX offers 100+ specialized components which can be further extended by creation of "macros" - subgraphs - and libraries, shareable with 3rd...
    Downloads: 17 This Week
  • 24

    phpBathymetry

    Using PHP-CLI to parse NMEA data to generate depth maps with GD

    Using PHP-CLI to parse NMEA data to generate depth maps with GD. Fishing is a passion of mine, and it got me to consider using the NMEA-formatted data output from my Lowrance 5-DSI Elite to generate a bathymetric map of the lake at my cottage. It is currently very crude, just a bunch of CLI scripts: one records NMEA data, and the other parses the stored data and builds a depth, water temperature, and sample frequency map.
    Downloads: 0 This Week
  • 25

    APS SLC Layout

    Evaluation of APS accessibility in downtown Salt Lake City

    ...The effectiveness of these signals, however, is potentially undermined by their placement if planning considerations do not maximize their use. One hundred and twelve blocks of the downtown Salt Lake City area were inventoried in the months of November and December of 2013. Over 500 possible paths were then measured and compared to the shortest possible paths to determine the area's connectivity.
    Downloads: 0 This Week