Showing 68 open source projects for "java projects big"

  • 1
    Genie

    Distributed Big Data Orchestration Service

    Genie is a completely open source distributed job orchestration engine developed by Netflix. Genie provides RESTful APIs to run a variety of big data jobs such as Hadoop, Pig, Hive, Spark, Presto, Sqoop and more. It also provides APIs for managing the metadata of many distributed processing clusters and the commands and applications which run on them.
    Downloads: 1 This Week
  • 2
    Apache HBase

    Get random, realtime read/write access to your Big Data

    Use Apache HBase™ when you need random, realtime read/write access to your Big Data. This project's goal is the hosting of very large tables (billions of rows by millions of columns) atop clusters of commodity hardware. Apache HBase is an open-source, distributed, versioned, non-relational database modeled after Google's "Bigtable: A Distributed Storage System for Structured Data" by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, Apache...
    Downloads: 5 This Week
  • 3
    ElasticJob

    Distributed scheduled job framework

    ElasticJob is a distributed scheduling solution consisting of two separate projects, ElasticJob-Lite and ElasticJob-Cloud. ElasticJob-Lite is a lightweight, decentralized solution that provides distributed task sharding services. ElasticJob-Cloud uses Mesos to manage and isolate resources. Both projects share a unified job API, so developers only need to write a job once and can deploy it at will. Supports job sharding and high availability in distributed systems. Scale out for throughput and...
    Downloads: 0 This Week
  • 4
    HugeGraph

    A graph database that supports more than 100 billion data items

    HugeGraph is a convenient, efficient, and adaptable graph database compatible with the Apache TinkerPop3 framework and the Gremlin query language. HugeGraph supports fast import for graphs with more than 10 billion vertices and edges, offers millisecond-level OLTP query capability, and can be integrated into big data platforms like Hadoop or Spark for OLAP analysis. The main scenarios of HugeGraph include correlation search, fraud detection, and knowledge graph. Not only supports...
    Downloads: 2 This Week
  • 5
    Apache RocketMQ

    Distributed messaging and streaming platform with low latency

    ...Financial-grade transactional messages. Built-in fault tolerance and high availability configuration options based on DLedger. A variety of cross-language clients, such as Java, C/C++, Python, Go. Pluggable transport protocols, such as TCP, SSL, AIO. Built-in message tracing capability, with OpenTracing support. Versatile big-data and streaming ecosystem integration. Message retroactivity by time or offset. Reliable FIFO and strictly ordered messaging in the same queue. Efficient pull and push consumption model. ...
    Downloads: 3 This Week
  • 6
    Apache InLong

    Apache InLong - a one-stop integration framework for massive data

    Apache InLong is a one-stop integration framework for massive data that provides automatic, secure and reliable data transmission capabilities. InLong supports both batch and stream data processing, which makes it well suited to building data analysis, modeling and other real-time applications based on streaming data. InLong (应龙) is a divine beast in Chinese mythology who guides the river into the sea, and it is regarded as a metaphor for the InLong system for reporting data...
    Downloads: 0 This Week
  • 7
    Apache Hudi

    Upserts, Deletes And Incremental Processing on Big Data

    Apache Hudi (pronounced Hoodie) stands for Hadoop Upserts Deletes and Incrementals. Hudi manages the storage of large analytical datasets on DFS (Cloud stores, HDFS or any Hadoop FileSystem compatible storage). Apache Hudi is a transactional data lake platform that brings database and data warehouse capabilities to the data lake. Hudi reimagines slow old-school batch data processing with a powerful new incremental processing framework for low latency minute-level analytics. Hudi provides...
    Downloads: 1 This Week
  • 8
    Vespa

    The open big data serving engine

    Make AI-driven decisions using your data, in real-time. At any scale, with unbeatable performance. Vespa is a full-featured text search engine and supports both regular text search and fast approximate vector search (ANN). This makes it easy to create high-performing search applications at any scale, whether you want to use traditional techniques or a modern vector-based approach. You can even combine both approaches efficiently in the same query, something no other engine can do....
    Downloads: 2 This Week
  • 9
    ODD Platform

    First open-source data discovery and observability platform

    Unlock the power of big data with OpenDataDiscovery Platform. Experience seamless end-to-end insights, powered by unprecedented observability and trust, from ingestion to production, while building your ideal tech stack! Democratize data and accelerate insights. Find data that fits your use case and discover hints left by your peers to leverage existing knowledge. Explore tags, ownership details, links to other sources and other information to shorten and simplify the data discovery phase....
    Downloads: 0 This Week
  • 10
    Nebula Graph

    A distributed, fast open-source graph database

    The graph database built for super large-scale graphs with milliseconds of latency. Optimized SUBGRAPH and FIND PATH for better performance. Optimized query paths to reduce redundant paths and time complexity. Optimized the method to get properties for better performance of MATCH statements. Nebula Graph adopts the Apache 2.0 license, one of the most permissive free software licenses in the world. Free as in freedom, because, under the Apache 2.0 license, you can use, copy, modify and...
    Downloads: 1 This Week
  • 11
    MentDB Projects

    Generalized Interoperability and Strong AI

    MentDB is an open-source platform driving research into next-generation AI and universal data exchange. Our architecture is built around the revolutionary Mentalese Query Language (MQL). MentDB Weak (Generalized Interoperability): A unified data layer enabling seamless data exchange and application integration (SOA, ETL, Data Quality). We eliminate data silos through a single, generalized data language. MentDB Strong (Strong AI / AGI): The framework for exploring and building Machine...
    Downloads: 0 This Week
  • 12
    Apache Polaris

    Apache Polaris, the interoperable, open source catalog

    Apache Polaris is an open-source metadata catalog and data management service designed to manage Apache Iceberg tables in modern data lakehouse environments. It provides a centralized catalog that allows multiple compute engines and analytics systems to interact with the same datasets through a standardized interface. By implementing the Iceberg REST catalog API, Polaris enables distributed data platforms to access shared table metadata without tightly coupling storage systems and query...
    Downloads: 0 This Week
  • 13
    JavaParser

    Java 1-17 Parser and Abstract Syntax Tree for Java

    This project contains a set of libraries implementing a Java 1.0 - Java 17 Parser with advanced analysis functionalities. The project binaries are available in Maven Central. We strongly advise users to adopt Maven, Gradle or another build system for their projects. If you are not familiar with them we suggest taking a look at the maven quickstart projects. Since Version 3.5.10, the JavaParser project includes the JavaSymbolSolver.
    Downloads: 0 This Week
  • 14
    Zingg

    Scalable master data management and identity resolution

    Zingg is an open-source entity resolution and master data management platform for finding duplicate, related, or matching records across large datasets. It uses machine learning to learn how records should be compared, reducing the need for brittle hand-written matching rules. The project is designed for data engineering and analytics teams working on customer 360, supplier 360, deduplication, fuzzy matching, data quality, and golden record workflows. Zingg runs on Apache Spark and can scale...
    Downloads: 8 This Week
  • 15
    Emerge

    Browser-based interactive codebase and dependency visualization tool

    Emerge (or emerge-viz) is an interactive code analysis tool to gather insights about source code structure, metrics, dependencies, and complexity of software projects. You can scan the source code of a project, calculate metric results and statistics, generate an interactive web app with graph structures (e.g. a dependency graph or a filesystem graph), and export the results in some file formats. Emerge currently has parsing support for the following languages: C, C++, Groovy, Java, JavaScript, TypeScript, Kotlin, ObjC, Ruby, Swift, Python, and Go. ...
    Downloads: 0 This Week
  • 16
    MOA - Massive Online Analysis

    Big Data Stream Analytics Framework.

    A framework for learning from a continuous supply of examples, a data stream. Includes classification, regression, clustering, outlier detection and recommender systems. Related to the WEKA project, also written in Java, while scaling to adaptive large scale machine learning.
    Downloads: 45 This Week
  • 17
    Parkiet

    Parquet format file GUI editor

    Parquet file viewer and editor written in Java and SWT. It uses Apache Avro library for reading and writing edited parquet files. Only Parquet files with simple data type columns are supported.
    Downloads: 4 This Week
  • 18
    gravitino

    Unified metadata lake for data & AI assets.

    Apache Gravitino is a high-performance, geo-distributed, and federated metadata lake. It manages metadata directly in different sources, types, and regions, providing users with unified metadata access for data and AI assets.
    Downloads: 0 This Week
  • 19
    Pentaho

    Pentaho offers a comprehensive data integration and analytics platform.

    Pentaho couples data integration with business analytics in a modern platform to easily access, visualize and explore data that impacts business results. Use it as a full suite or as individual components that are accessible on-premise, in the cloud, or on-the-go (mobile). Pentaho enables IT and developers to access and integrate data from any source and deliver it to your applications all from within an intuitive and easy to use graphical tool. The Pentaho Enterprise Edition Free Trial...
    Downloads: 1,538 This Week
  • 20

    TabuVis

    An Interactive Visualisation for Tabular Data

    TabuVis is a comprehensive visual analysis tool that provides flexible, customizable and interactive visualization for tabular (or multidimensional) data. It uses a scatter-plot visualization approach to provide comprehensive and interactive views for different attribute mappings. It provides single and multiple scatter-plots together with Map (GIS) capability. The project is available at: http://staff.scem.uws.edu.au/~vinh/projects/TabuVis/ Related...
    Downloads: 2 This Week
  • 21
    Muse: Middleware Universal Scripting idE

    Automate: WebSphere; WebLogic; JBoss; Glassfish; Tomcat; Linux, WinRM

    ...Use Python / Jython to automate WebSphere, WebLogic, JBoss, Glassfish and Tomcat Middleware Estates over JMX (both SSL and non-SSL), plus Linux SSH (agent-less) and WinRM. Target all 5 servers, Linux and WinRM from the same workspace. Familiar Eclipse-based Jython and Python Development IDE, pre-configured and ready to go. 4-Click Installer. Win x64, Linux WINE x64. Built-In JVM. Java 8/9/10, Amazon Corretto, JETPack13/14/16, IBM SDK Compatible.
    Now with powerful JBoss / GlassFish / Tomcat / Linux Active Auditing Framework. Tomcat / Glassfish 2 Python - Configuration Snapshots. Infrastructure-as-Code, Code-Writing-Code.
    Designed to Run on JETPack: https://sourceforge.net/projects/jetpack
    Muse.2026.03.x - Win 10 / Win11 / Linux (WINE)
    Muse.2025.12.x - Win8 / Win 10 / Win11 / Linux (WINE)
    Downloads: 0 This Week
  • 22
    OSHMI - Open Substation HMI

    SCADA HMI for substations, IoT and automation applications

    Now with IEC61850 support! This project combines existing open source projects and tools to create a very capable, mobile and cloud-friendly HMI system that can rival proprietary software. This approach makes it possible to pool the strengths of each project (Chromium, SVG/HTML5, PHP, Lua, SQLite, Inkscape, Lib61850, OpenDNP3, Nginx, Vega, PostgreSQL, Grafana,…) to achieve a great set of open, evergreen, modular and customizable tools for building great HMIs for automation projects. This is not...
    Downloads: 21 This Week
  • 23
    DataGym.ai

    Open source annotation and labeling tool for image and video assets

    DATAGYM enables data scientists and machine learning experts to label images up to 10x faster. AI-assisted annotation tools reduce manual labeling effort, give you more time to fine-tune ML models and speed up the go-to-market of new products. Accelerate your computer vision projects by cutting data preparation time by up to 50%. A machine learning model is only as good as its training data. DATAGYM is an end-to-end workbench to create, annotate, manage, and export the right training data...
    Downloads: 0 This Week
  • 24
    Learn Julia the Hard Way

    Learn Julia the hard way

    ...R is a great language, but relatively slow, to the point that most people use it to rapidly prototype, and then implement the algorithm for production in Python or Java. Julia seeks to be as approachable as R but without the speed penalty.
    Downloads: 0 This Week
  • 25
    MyCAT

    Active, high-performance open source database middleware

    MyCAT is open-source software, “a large database cluster” aimed at enterprises. MyCAT is an enhanced database that can replace MySQL and supports transactions and ACID. Regarded as an enterprise-grade MySQL cluster, MyCAT can take the place of an expensive Oracle cluster. MyCAT is also a new type of database, which resembles a SQL server integrated with memory cache technology, NoSQL technology and HDFS big data. And as a new modern enterprise database product, MyCAT is...
    Downloads: 34 This Week