Showing 31 open source projects for "big data"

  • 1
    Apache HBase

    Get random, realtime read/write access to your Big Data

    Use Apache HBase™ when you need random, realtime read/write access to your Big Data. This project's goal is the hosting of very large tables, billions of rows by millions of columns, atop clusters of commodity hardware. Apache HBase is an open-source, distributed, versioned, non-relational database modeled after Google's Bigtable: A Distributed Storage System for Structured Data by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, Apache HBase provides Bigtable-like capabilities on top of Hadoop and HDFS. ...
    Downloads: 0 This Week
  • 2
    ODD Platform

    First open-source data discovery and observability platform

    Unlock the power of big data with OpenDataDiscovery Platform. Experience seamless end-to-end insights, powered by unprecedented observability and trust - from ingestion to production - while building your ideal tech stack! Democratize data and accelerate insights. Find data that fits your use case and discover hints left by your peers to leverage existing knowledge.
    Downloads: 2 This Week
  • 3
    Fluid

    Fluid, elastic data abstraction and acceleration for BigData/AI apps

    Fluid provides elastic data abstraction and acceleration for BigData/AI applications in the cloud. It offers a DataSet abstraction for underlying heterogeneous data sources with multidimensional management in a cloud environment, and enables dataset warmup and acceleration for data-intensive applications by using a distributed cache in Kubernetes with observability, portability, and scalability. Taking characteristics of application and data into consideration for cloud application/dataset scheduling to...
    Downloads: 0 This Week
  • 4
    Apache RocketMQ

    Distributed messaging and streaming platform with low latency

    ...A variety of cross-language clients, such as Java, C/C++, Python, Go. Pluggable transport protocols, such as TCP, SSL, AIO. Built-in message tracing capability, which also supports OpenTracing. Versatile big-data and streaming ecosystem integration. Message retroactivity by time or offset. Reliable FIFO and strictly ordered messaging in the same queue. Efficient pull and push consumption model. Million-level message accumulation capacity in a single queue. Multiple messaging protocols like JMS and OpenMessaging. Flexible distributed scale-out deployment architecture. ...
    Downloads: 0 This Week
  • 5
    BFG Repo-Cleaner

    Remove large or troublesome blobs

    The BFG is a simpler, faster alternative to git-filter-branch for cleansing bad data out of your Git repository history. You can use it for removing crazy big files, and for removing passwords, credentials and other private data. The git-filter-branch command is enormously powerful and can do things that the BFG can't, but the BFG is much better for the tasks above, because it is faster and simpler. The BFG isn't particularly clever, but is focused on making the above tasks easy. ...
    Downloads: 4 This Week
  • 6
    Grafana Alloy

    OpenTelemetry Collector distribution with programmable pipelines

    Grafana Alloy is an open source OpenTelemetry Collector distribution with built-in Prometheus pipelines and support for metrics, logs, traces, and profiles. It is Grafana Labs’ distribution of the OpenTelemetry Collector: an OTLP-compatible collector with built-in Prometheus optimizations that also supports signals across metrics, logs, traces, and profiles. Alloy was started at Grafana Labs and announced at GrafanaCON in 2024. The mission of the project is to create the best...
    Downloads: 8 This Week
  • 7
    LakeSoul

    An end-to-end, realtime and cloud native Lakehouse framework

    LakeSoul is a high-performance, unified table storage framework for big data lakes, supporting both streaming and batch data in a single format. Built on top of Apache Spark and leveraging Apache Arrow and Parquet, LakeSoul provides ACID transactions, schema evolution, and time travel. It is designed for large-scale data lake architectures that require consistency, efficiency, and easy integration with modern data stacks.
    Downloads: 0 This Week
  • 8
    manticoresearch

    Easy to use open source fast database for search

    ...Modern MPP architecture and smart query parallelization capabilities allow it to fully utilize all your CPU cores to lower response time as much as possible, when needed. Powerful and fast full-text searching which works well for small and big datasets. Columnar storage support via the Manticore Columnar Library for bigger datasets (much bigger than can fit in RAM). SQL-first: Manticore's native syntax is SQL. It speaks SQL over HTTP and uses the MySQL protocol (you can use your preferred MySQL client). JSON over HTTP: to provide a more programmatic way to manage your data and schemas, Manticore provides an HTTP JSON protocol. ...
    Downloads: 1 This Week
  • 9
    StartOS

    Linux server OS optimized for self-hosting

    StartOS is a sovereign, self-hosted operating system built by Start9 Labs to empower individuals with digital independence. Designed to run on personal servers, it provides a privacy-first interface for installing, managing, and running decentralized applications without needing technical expertise. StartOS includes services like Bitcoin nodes, messaging platforms, file hosting, and password managers, all running locally and without third-party control. With a user-friendly UI and strong...
    Downloads: 9 This Week
  • 10
    MobX

    Simple, scalable state management

    MobX is a battle-tested library that makes state management simple and scalable by transparently applying functional reactive programming (TFRP). Write minimalistic, boilerplate-free code that captures your intent. Trying to update a record field? Use the good old JavaScript assignment. Updating data in an asynchronous process? No special tools are required; the reactivity system will detect all your changes and propagate them out to where they are being used. All changes to and uses of...
    Downloads: 0 This Week
  • 11
    IPFS

    IPFS implementation in Go

    A peer-to-peer hypermedia protocol designed to make the web faster, safer, and more open. HTTP downloads files from one computer at a time instead of getting pieces from multiple computers simultaneously. Peer-to-peer IPFS saves big on bandwidth, up to 60% for video, making it possible to efficiently distribute high volumes of data without duplication. The average lifespan of a web page is 100 days before it's gone forever. It's not good enough for the primary medium of our era to be this fragile. IPFS keeps every version of your files and makes it simple to set up resilient networks for mirroring data. ...
    Downloads: 0 This Week
  • 12
    Foundatio

    Pluggable foundation blocks for building distributed apps

    Pluggable foundation blocks for building loosely coupled distributed apps. Includes implementations in Redis, Azure, AWS, RabbitMQ and in memory (for development). When building several big cloud applications, we found a lack of great solutions (that's not to say there aren't solutions out there) for many key pieces of building scalable distributed applications while keeping the development experience simple. We wanted to build against abstract interfaces so that we could easily change...
    Downloads: 0 This Week
  • 13

    json-scada

    A portable SCADA/IoT platform centered on the MongoDB database server.

    ...MongoDB as the real-time core database, persistence layer, config store, SOE historian. Portability and interoperability over Linux, Windows, x86/64, ARM. Horizontal scalability, from a single computer to big clusters (MongoDB-sharding), Bare Metal, Docker containers, VM, cloud, or hybrid deployments. Unlimited tags, servers, and users. HTML5 Web interface. UTF-8/I18N. Protocols: IEC61850 Client, IEC60870-5-101/104 Client and Server, DNP3 Client, OPC-UA Client/Server, MQTT/Sparkplug-B, Telegraf (various data sources for monitoring like Modbus, SNMP, etc.) ...
    Downloads: 11 This Week
  • 14
    Old File Delete

    Clean up old files with a single click.

    OldFileDelete (OFD) is a lightweight and efficient utility designed for those who value minimalism and order. The app helps you instantly clear selected folders of accumulated digital clutter. Featuring a modern flat design, the interface is intuitive: simply select a folder, specify the number of days, and the program will find and remove outdated files. No complex settings—just cleanliness and speed.
    Downloads: 4 This Week
  • 15
    Curve

    Curve is a sandbox project hosted by the CNCF Foundation

    A cloud-native distributed storage system, hosted by the CNCF as a sandbox project. Curve is a modern storage system developed by NetEase, currently supporting file storage (CurveFS) and block storage (CurveBS). The performance, mixed, capacity cloud disk or persistent volume of virtual machine/container, and remote disks of physical machines. High-performance separation of storage and computation architecture: high-performance and low...
    Downloads: 0 This Week
  • 16
    Koha + DSpace 10.0 Live ISO (2026.06)

    Koha 26.05.00 + DSpace 10.0 Live ISO Installer based on Ubuntu 22.04.5

    Koha + DSpace Live is a live bootable and installer ISO based on Ubuntu 22.04.5 (Koha version: 26.05.00, DSpace version: 10.0). This ISO boots only in Legacy BIOS mode and not in Secure Boot mode. After booting from the Live DVD/USB, use the following login credentials. Login: library (displayed as 'Open Digital Library'). Password: library. This Live ISO contains additional desktop environments (DEs) providing different user interfaces. If you wish to choose a Lightweight...
    Downloads: 32 This Week
  • 17
    DataSophon

    The next generation of cloud-native big data management expert

    DataSophon aims to quickly deploy, manage, monitor and automate the operation and maintenance of Big Data service components and nodes, helping you quickly build stable, efficient Big Data cluster services. The Three-Body Problem, a Hugo Award-winning science fiction novel, is known for its stunning "hard science fiction" style, and its author Liu Cixin is credited with "single-handedly raising Chinese science fiction to a world-class level". ...
    Downloads: 0 This Week
  • 18
    Parveshdhull AutoTyper

    A Data Entry Tool for Windows and Linux

    Sometimes we have to write content in programs where copy-paste is not allowed, such as the data entry software Notepad RT. There are many tools available online, but almost all of them provide only trial versions and require a large payment for continued access. Even if they are free, it is not wise to give any third-party software complete access to the keyboard. So I wrote this short, simple Python script that reads content from a text file and then simulates keyboard typing. ...
    Downloads: 4 This Week
  • 19
    GnuCopy
    GnuCopy is an open-source tool to copy and archive all your important data. It supports all important archive types, like Zip and Tar, to guarantee an easy and secure exchange between all types of operating systems. Additionally, you can create profiles to blacklist or whitelist specific file types or folders to separate your big data stores for backups.
    Downloads: 0 This Week
  • 20
    Annoy

    Approximate Nearest Neighbors in C++/Python optimized for memory usage

    Annoy (Approximate Nearest Neighbors Oh Yeah) is a C++ library with Python bindings to search for points in space that are close to a given query point. It also creates large read-only file-based data structures that are mmapped into memory so that many processes may share the same data. There are some other libraries to do nearest neighbor search. Annoy is almost as fast as the fastest libraries (see below), but there is actually another feature that really sets Annoy apart: it has the...
    Downloads: 1 This Week
  • 21

    MC 6800 Emulator for Teensy 4

    6800 Emulator runs on Teensy 4 processor

    6800/ET-3400 Emulator, designed for the Teensy 4.0/4.1 processors. Build this with the Arduino IDE 2.0. Full Source and readme included in archive. Programs written in 6800 assembly can be run in the emulator. It supports two modes of operation, an internal monitor program and a mode that emulates the Heathkit ET3400 (and ET-6800) microprocessor trainers. Motorola released the MC6800 chip in 1974. It was a source of inspiration for the designers of the more popular (and cheaper) 6502 chip...
    Downloads: 3 This Week
  • 22
    MyCAT

    Active, high-performance open source database middleware

    ...Regarded as an enterprise-grade MySQL cluster, MyCAT can take the place of an expensive Oracle cluster. MyCAT is also a new type of database, which resembles a SQL Server integrated with memory cache technology, NoSQL technology and HDFS big data. As a modern enterprise database product, MyCAT combines the traditional database with the new distributed data warehouse. In short, MyCAT is a new kind of database middleware. MyCAT's objective is to smoothly migrate current stand-alone databases and applications to the cloud at low cost and to solve the bottleneck problem caused by the rapid growth of data storage and business scale.
    Downloads: 3 This Week
  • 23

    wzd

    Powerful storage server, designed for big data storage systems

    wZD is a server written in Go language that uses a modified version of the BoltDB database as a backend for saving and distributing any number of small and large files, NoSQL keys/values, in a compact form inside micro Bolt databases (archives), with distribution of files and values in BoltDB databases depending on the number of directories or subdirectories and the general structure of the directories. Using wZD can permanently solve the problem of a large number of files on any POSIX...
    Downloads: 0 This Week
  • 24
    pyFileSearcher

    simple searching tool for big fileservers

    pyFileSearcher was designed to be a lightweight, easy-to-use tool capable of handling a large volume of files. It is a tool that I personally could use on large corporate servers to find out which files have taken all my space in the last few days. It's free, it's open source, and it runs on Linux and Windows. The program is written in Python 3 using Qt5.
    Downloads: 0 This Week
  • 25

    KlipMan

    A ClipBoard Manager with some pretty unique copy paste features

    For both Windows and Linux: There are many clipboard managers available on the internet, some even better than this; however, I've tried to put some very unique features in this...
    * Secondary Clipboard
    * One after other mode: copy, copy, copy, copy; paste, paste, paste, paste mode (the contents get pasted in the same order you copied them)
    * CAGR calculator
    * Permanently save Klippings
    * Append mode
    On GitHub: https://github.com/hemanshukale/KlipMan I've also attached a PDF...
    Downloads: 0 This Week