Showing 39 open source projects for "lake"

  • 1
    Deep Lake

    Data Lake for Deep Learning. Build, manage, and query datasets

    Deep Lake (formerly known as Activeloop Hub) is a data lake for deep learning applications. Our open-source dataset format is optimized for rapid streaming and querying of data while training models at scale, and it includes a simple API for creating, storing, and collaborating on AI datasets of any size. It can be deployed locally or in the cloud, and it enables you to store all of your data in one place, ranging from simple annotations to large videos.
    Downloads: 0 This Week
  • 2
    XiaoMi Pro Hackintosh

    XiaoMi NoteBook Pro Hackintosh

    XiaoMi NoteBook Pro Hackintosh. If you are using a XiaoMi-Pro with an 8th Gen CPU, it is a KBL (Kaby Lake, strictly speaking Kaby Lake Refresh) machine. If you are using a XiaoMi-Pro with a 10th Gen CPU, it is a CML (Comet Lake) machine.
    Downloads: 0 This Week
  • 3
    Sanity

    Rapidly configure content workspaces powered by structured content

    ...Instead of using predefined content templates, Sanity allows developers to define schemas in code that determine how content is structured and stored. The platform stores data in a real-time backend called the Content Lake, enabling collaborative editing and instant updates across connected applications. Because the system separates content management from presentation, developers can use any front-end framework to display the data. Sanity also includes APIs and query tools that allow developers to retrieve content dynamically and integrate it into websites, mobile apps, and other digital services.
    Downloads: 2 This Week
  • 4
    SeaweedFS

    Distributed storage system for blobs, objects, files, and data lake

    SeaweedFS is a distributed storage system for blobs, objects, files, and data lake, to store and serve billions of files fast! Blob store has O(1) disk seek, local tiering, cloud tiering. Filer supports cross-cluster active-active replication, Kubernetes, POSIX, S3 API, encryption, Erasure Coding for warm storage, FUSE mount, Hadoop, WebDAV. SeaweedFS is an independent Apache-licensed open source project with its ongoing development made possible because of the community.
    Downloads: 1 This Week
  • 5
    Apache Hudi

    Upserts, Deletes And Incremental Processing on Big Data

    Apache Hudi (pronounced Hoodie) stands for Hadoop Upserts Deletes and Incrementals. Hudi manages the storage of large analytical datasets on DFS (Cloud stores, HDFS or any Hadoop FileSystem compatible storage). Apache Hudi is a transactional data lake platform that brings database and data warehouse capabilities to the data lake. Hudi reimagines slow old-school batch data processing with a powerful new incremental processing framework for low latency minute-level analytics. Hudi provides efficient upserts, by mapping a given hoodie key (record key + partition path) consistently to a file id, via an indexing mechanism. ...
    Downloads: 0 This Week
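
    The key-to-file mapping described for Hudi above can be illustrated with a minimal sketch. This is a toy in-memory index, not Hudi's API or storage layout: the class `ToyUpsertIndex`, its fields, and the file-id naming scheme are all hypothetical, and giving every new key its own file is a simplification made only to keep the example short.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ToyUpsertIndex:
    """Toy index mapping (record key, partition path) to a file id.

    Hypothetical illustration only; not Hudi's API or on-disk format.
    """
    key_to_file: dict = field(default_factory=dict)  # hoodie key -> file id
    files: dict = field(default_factory=dict)        # file id -> {hoodie key: record}
    _ids: count = field(default_factory=count)       # source of new file numbers

    def upsert(self, record_key, partition_path, record):
        hoodie_key = (record_key, partition_path)
        file_id = self.key_to_file.get(hoodie_key)
        if file_id is None:
            # First write for this key: assign a file id once and remember it.
            file_id = f"{partition_path}/file-{next(self._ids)}"
            self.key_to_file[hoodie_key] = file_id
        # Insert or overwrite the record in the file the key maps to.
        self.files.setdefault(file_id, {})[hoodie_key] = record
        return file_id


index = ToyUpsertIndex()
first = index.upsert("user-1", "2024/01/01", {"name": "Ada"})
second = index.upsert("user-1", "2024/01/01", {"name": "Ada L."})
other = index.upsert("user-2", "2024/01/01", {"name": "Bob"})
```

    Because the second write for `user-1` looks up the same key, it lands in the same file and replaces the earlier record rather than appending a duplicate, which is the property that makes an upsert cheap to locate.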
  • 6
    AWS SDK for pandas

    Easy integration with Athena, Glue, Redshift, Timestream, Neptune

    ...With a few lines of code, you can read from and write to Amazon S3 in Parquet/CSV/JSON/ORC, register tables in the AWS Glue Data Catalog, and query with Amazon Athena directly into pandas. The library abstracts efficient patterns like partitioning, compression, and vectorized I/O so you get performant data lake operations without hand-rolling boilerplate. It also supports Redshift, OpenSearch, and other services, enabling ETL tasks that blend SQL engines and Python transformations. Operational helpers handle IAM, sessions, and concurrency while exposing knobs for encryption, versioning, and catalog consistency. The result is a productive workflow that keeps your analytics in Python while leveraging AWS-native storage and query engines at scale.
    Downloads: 0 This Week
  • 7
    lakeFS

    lakeFS - Git-like capabilities for your object storage

    ...It enables zero-copy Dev / Test isolated environments, continuous quality validation, atomic rollback on bad data, reproducibility, and more. Data is dynamic and changes over time. Dealing with that without a data version control system is error-prone and labor-intensive. With lakeFS, your data lake is version controlled and you can easily time-travel between consistent snapshots of the lake. Easier ETL testing: test your ETLs on top of production data, in isolation, without copying anything. Safely experiment and test on full production data. Easily collaborate on production data with your team. Automate data quality checks within data pipelines.
    Downloads: 0 This Week
  • 8
    pg_analytics

    DuckDB-powered analytics for Postgres

    pg_analytics (formerly named pg_lakehouse) puts DuckDB inside Postgres. With pg_analytics installed, Postgres can query foreign object stores like AWS S3 and table formats like Iceberg or Delta Lake. Queries are pushed down to DuckDB, a high-performance analytical query engine. By transforming Postgres into a performant search and analytics engine, ParadeDB frees your team from the pain of scaling and syncing Elasticsearch.
    Downloads: 0 This Week
  • 9
    LakeSoul

    An end-to-end, realtime and cloud native Lakehouse framework

    ...Built on top of Apache Spark and leveraging Apache Arrow and Parquet, LakeSoul provides ACID transactions, schema evolution, and time travel. It is designed for large-scale data lake architectures that require consistency, efficiency, and easy integration with modern data stacks.
    Downloads: 0 This Week
  • 10
    Brim

    Application to efficiently search and analyze super-structured data

    ...Zed is a system that makes data easier by utilizing our new super-structured data model. Brim is a desktop app to explore, query, and shape the data in your super-structured data lake. Brim is an open source desktop application for security and network specialists. Brim makes it easy to search and analyze data from packet captures, like those created by Wireshark, and structured logs, especially from the Zeek network analysis framework. Brim is especially useful to security and network operators that need to handle large packet captures, especially those that are cumbersome for Wireshark, tshark, or other packet analyzers. ...
    Downloads: 19 This Week
  • 11
    Baritone

    Google maps for block game

    A Minecraft pathfinder bot. Baritone is the pathfinding system used in Impact since 4.4. How to immediately get started: Type #goto 1000 500 in chat to go to x=1000 z=500. Type #mine diamond_ore to mine diamond ore. Type #stop to stop. For more, read the usage page and/or watch this tutorial playlist. For other versions of Minecraft or more complicated situations or for development, see Installation & setup. Also consider just installing Impact, which comes with Baritone and is easier to...
    Downloads: 114 This Week
  • 12
    DataEase

    Data visualization analysis tool

    ...Supports rich chart types (Apache ECharts / AntV) and drag-and-drop dashboard creation. Supports direct connection mode and local mode (based on Apache Doris / Kettle). Supports various data sources such as data warehouses/data lakes, OLAP databases, OLTP databases, Excel data files, APIs, etc. Open source and open: zero threshold, quick access and installation online; quick access to user feedback, new versions released monthly. Supports multiple data sharing methods to ensure data security.
    Downloads: 6 This Week
  • 13
    Foxglove Studio

    Robotics visualization and debugging

    ...Visualize images and point clouds, overlay bounding boxes, add classification labels and planned movements, and drill down into your data with plots or raw message views. Upload recordings to your private data lake for easy storage, searching, and analysis. Stream recorded data directly into Foxglove Studio to get insights into your robots' behavior. We're long-time fans and beneficiaries of open source software. Join our community on GitHub and Slack to contribute bug reports, feature requests, or pull requests.
    Downloads: 1 This Week
  • 14
    Spice.ai OSS

    A self-hostable CDN for databases

    Spice is a portable runtime offering developers a unified SQL interface to materialize, accelerate, and query data from any database, data warehouse, or data lake. Spice connects, fuses, and delivers data to applications, machine-learning models, and AI backends, functioning as an application-specific, tier-optimized Database CDN. The Spice runtime, written in Rust, is built with industry-leading technologies such as Apache DataFusion, Apache Arrow, Apache Arrow Flight, SQLite, and DuckDB. ...
    Downloads: 0 This Week
  • 15
    Rill

    Fast SQL-based BI tool for real-time dashboards and analytics

    Rill is an operational BI tool that turns raw datasets into fast, interactive dashboards using SQL and a code-first approach. It helps data teams move from data lake to insight quickly, without the complexity of traditional BI systems. With an embedded in-memory database powered by DuckDB or ClickHouse, queries run in milliseconds, enabling real-time exploration and analysis. Rill supports local and remote data sources such as CSV, Parquet, S3, and GCS, making it flexible across environments. ...
    Downloads: 2 This Week
  • 16
    Apache Impala

    Apache Impala

    Impala provides low latency and high concurrency for BI/analytic queries on the Hadoop ecosystem, including Iceberg, open data formats, and most cloud storage options. Impala also scales linearly, even in multitenant environments. Impala is integrated with native Hadoop security and Kerberos for authentication, and via the Ranger module, you can ensure that the right users and applications are authorized for the right data. Utilize the same file and data formats and metadata, security, and...
    Downloads: 0 This Week
  • 17
    Downloads: 7 This Week
  • 18
    gravitino

    Unified metadata lake for data & AI assets.

    Apache Gravitino is a high-performance, geo-distributed, and federated metadata lake. It manages metadata directly in different sources, types, and regions, providing users with unified metadata access for data and AI assets.
    Downloads: 0 This Week
  • 19
    PI-Based Image Encoder / Converter

    Python code able to convert / compress image to PI (3.14, π) Indexes

    ...The ZIP also includes a 16 MB file with 16.7 million digits of PI.

    Benchmark (single-thread)

    Hardware & Environment:
      Apple Silicon: Apple M2 (Mac mini/MacBook)
      x86_64 Platform: Intel Core Ultra 5 225F (Arrow Lake, 10 Cores)
      OS 1: Fedora 43 (GNOME)
      OS 2: Windows 11 Pro (23H2/24H2)
      Software: Python 3.14.3 + Numba JIT (latest)

    Results (lower is better), as Platform / OS, CPU, Time (Seconds):
      macOS (Native), Apple M2, 52.151311 s (in default setup)
      Fedora Linux, Intel Core Ultra 5 225F, 58.536457 s (in default Power Management: Balanced)
      Windows 11, Intel Core Ultra 5 225F, 59.681427 s (important! ...
    Downloads: 1 This Week
  • 20
    Downloads: 0 This Week
  • 21
    BitSail

    BitSail is a distributed high-performance data integration engine

    BitSail is ByteDance's open source data integration engine which is based on distributed architecture and provides high performance. It supports data synchronization between multiple heterogeneous data sources, and provides global data integration solutions in batch, streaming, and incremental scenarios. At present, it serves almost all business lines in ByteDance, such as Douyin, Toutiao, etc., and synchronizes hundreds of trillions of data every day. BitSail has been widely used and...
    Downloads: 0 This Week
  • 22
    Arduino Primary Avionics Module (A-PAM)

    A foundational avionics system for model rockets

    If you search for model rocket electronic payloads, you will see a lot of entries for medium to high power, large diameter rockets. These rockets are large, typically 80mm (3.1 inches) in diameter or more, and are flown in large open areas, often deserts and dry lake beds. They also use larger motors, typically in the High Power Rocketry (HPR) range of “H” and above impulse level. Not only do these motors require you to be certified in HPR, the motors themselves tend to be rather expensive. My goal was to develop a system that could be used by high school students. It would involve proven rocket designs that could be flown in school yards. ...
    Downloads: 0 This Week
  • 23
    BISuporte

    Data Lake and Data Warehouse Company

    Data Lake and Data Warehouse Company
    Downloads: 0 This Week
  • 24
    Music Lake PC

    Electron Music Lake PC

    Electron-based cross-platform music player. It can search NetEase Cloud, QQ Music, and Xiami Music; supports QQ, Weibo, and GitHub login and cloud playlists; and supports one-click import of playlists from music platforms. The song API covers NetEase Cloud, QQ Music, and Xiami. The interface imitates QQ Music. Platforms will be adapted gradually, in the order Mac > Windows > Linux. For the Android client, see caiyonglong/MusicLake. Login, collection, and playback basically work without problems, and it can be used as a daily program for listening to songs while working...
    Downloads: 1 This Week
  • 25

    Lake Robotics Embedded Software Platform

    Platform/Framework for embedded software

    The Lake Robotics Embedded Software Platform is a toolkit/framework for developing embedded software for different microcontrollers (at the moment only ARM and Cortex-M3 are supported). The platform provides libraries, including other open source projects, for a realtime OS (ChibiOS, FreeRTOS), TCP/IP networking, the C runtime, and the C standard libraries.
    Downloads: 0 This Week