Showing 1131 open source projects for "java"

  • 1
    Java Tablesaw

    Java dataframe and visualization library

    Tablesaw is a dataframe and visualization library that supports loading, cleaning, transforming, filtering, and summarizing data. If you work with data in Java, it may save you time and effort. Tablesaw also supports descriptive statistics and can be used to prepare data for working with machine learning libraries like Smile, Tribuo, H2O.ai, and DL4J. Import data from RDBMS, Excel, CSV, TSV, JSON, HTML, or Fixed Width text files, whether they are local or remote (http, S3, etc.). Tablesaw supports data visualization by providing a wrapper for the Plot.ly JavaScript plotting library. ...
    Downloads: 1 This Week
  • 2
    PlantUML

    Generate diagrams from textual description

    Generate UML diagram from textual description. PlantUML is not affected by the log4j vulnerability. The easiest way to test PlantUML is in an online solution that has PlantUML embedded, such as our online server. After testing, you may want to install PlantUML locally. Run (or have your software call) PlantUML, using sequenceDiagram.txt as input. The output is an image, which either appears in the other software, or is written to an image file on disk. Diagrams are defined using a simple and...
    Downloads: 43 This Week
  • 3
    Apache HBase

    Get random, realtime read/write access to your Big Data

    Use Apache HBase™ when you need random, realtime read/write access to your Big Data. This project's goal is the hosting of very large tables, billions of rows X millions of columns, atop clusters of commodity hardware. Apache HBase is an open-source, distributed, versioned, non-relational database modeled after Google's Bigtable: A Distributed Storage System for Structured Data by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, Apache...
    Downloads: 11 This Week
  • 4
    RStudio

    RStudio is an integrated development environment (IDE) for R

    RStudio is a powerful, full-featured integrated development environment (IDE) tailored primarily for the R programming language but increasingly supportive of other languages like Python and Julia. It brings together console, editor, plotting, workspace, history, and file-management panes into a unified interface, helping data scientists, statisticians, and analysts to work more productively. The IDE is cross-platform: there are desktop versions for Windows, macOS and Linux, as well as a...
    Downloads: 25 This Week
  • 5
    DataEase

    Data visualization analysis tool

    An open source data visualization analysis tool available to everyone. DataEase is an open-source data visualization analysis tool that helps users quickly analyze data and gain insight into business trends, so as to achieve business improvement and optimization. DataEase supports rich data source connections, can quickly create charts by dragging and dropping, and can easily share with others. Supports rich chart types (Apache ECharts / AntV), supports drag-and-drop method to quickly create...
    Downloads: 15 This Week
  • 6
    Logstash

    Centralize, transform and stash your data

    Logstash is a server-side data processing pipeline that dynamically ingests data from numerous sources, transforms it, and ships it to your favorite “stash” regardless of format or complexity. It supports and ingests data of all shapes, sizes and sources, dynamically transforms and prepares this data, and transports it to the output of your choice. Logstash is extensible, with over 200 plugins available to let you create and configure your pipeline how you choose.
    Downloads: 6 This Week
  • 7
    Elasticsearch

    A Distributed RESTful Search Engine

    Elasticsearch is a distributed, RESTful search and analytics engine that lets you store, search and analyze with ease at scale. It lets you perform and combine many types of searches; it scales seamlessly, and offers answers incredibly fast with search results you can rank based on a variety of factors. Elasticsearch can be used for a wide variety of use cases, from maps and metrics to site search and workplace search, and with all data types.
    Downloads: 11 This Week
  • 8
    JavaParser

    Java 1-17 Parser and Abstract Syntax Tree for Java

    This project contains a set of libraries implementing a Java 1.0 - Java 17 Parser with advanced analysis functionalities. The project binaries are available in Maven Central. We strongly advise users to adopt Maven, Gradle or another build system for their projects. If you are not familiar with them we suggest taking a look at the maven quickstart projects. Since Version 3.5.10, the JavaParser project includes the JavaSymbolSolver.
    Downloads: 0 This Week
  • 9
    Redisson

    Valkey & Redis Java client. Real-Time Data Platform

    Redisson is a Java client library for Redis that offers distributed data structures, services, and frameworks to build scalable and reliable applications. It simplifies Redis usage by providing in-memory Java objects like maps, sets, locks, queues, and semaphores that are backed by Redis. Redisson supports advanced features like distributed locking, asynchronous APIs, and integrates with frameworks like Spring and Quarkus for reactive and cloud-native development.
    Downloads: 0 This Week
  • 10
    Semantic Type Detection

    Metadata/data identification Java library

    Metadata/data identification Java library. Identifies Base Type (e.g. Boolean, Double, Long, String, LocalDate, LocalTime, ...) and Semantic Type information (e.g. Gender, Age, Color, Country, ...). Extensive country/language support. Extensible via user-defined plugins. Comprehensive Profiling support. Large set of built-in Semantic Types (extensible via JSON defined plugins).
    Downloads: 3 This Week
  • 11
    OrientDB

    DBMS supporting graph, document, full-text and geospatial models

    OrientDB is an Open Source Multi-Model NoSQL DBMS with the support of Native Graphs, Documents, Full-Text search, Reactivity, Geo-Spatial and Object Oriented concepts. It's written in Java and it's amazingly fast. No expensive run-time JOINs, connections are managed as persistent pointers between records. You can traverse thousands of records in no time. Supports schema-less, schema-full and schema-mixed modes. Has a strong security profiling system based on user, roles and predicate security and supports SQL amongst the query languages. ...
    Downloads: 2 This Week
  • 12
    Reactor Core

    Non-Blocking Reactive Foundation for the JVM

    Reactor Core is a foundational library for building reactive applications in Java, providing a powerful API for asynchronous, non-blocking programming.
    Downloads: 0 This Week
  • 13
    Dolphin Scheduler

    A distributed and extensible workflow scheduler platform

    Apache DolphinScheduler is a distributed and extensible workflow scheduler platform with powerful DAG visual interfaces, dedicated to solving complex job dependencies in the data pipeline and providing various types of jobs available `out of the box`. Dedicated to solving the complex task dependencies in data processing, making the scheduler system out of the box for data processing. Decentralized multi-master and multi-worker, HA is supported by itself, overload processing. All process...
    Downloads: 3 This Week
  • 14
    Apache RocketMQ

    Distributed messaging and streaming platform with low latency

    ...Financial-grade transactional messages. Built-in fault tolerance and high availability configuration options based on DLedger. A variety of cross-language clients, such as Java, C/C++, Python, Go. Pluggable transport protocols, such as TCP, SSL, AIO. Built-in message tracing capability, also supports OpenTracing. Versatile big-data and streaming ecosystem integration. Message retroactivity by time or offset. Reliable FIFO and strict ordered messaging in the same queue. Efficient pull and push consumption model. ...
    Downloads: 1 This Week
  • 15
    Kestra

    Kestra is an infinitely scalable orchestration and scheduling platform

    Build reliable workflows, blazingly fast, deploy in just a few clicks. Kestra is an open-source, event-driven orchestrator that simplifies data operations and improves collaboration between engineers and business users. By bringing Infrastructure as Code best practices to data pipelines, Kestra allows you to build reliable workflows and manage them with confidence. Thanks to the declarative YAML interface for defining orchestration logic, everyone who benefits from analytics can participate...
    Downloads: 1 This Week
  • 16
    Vespa

    The open big data serving engine

    Make AI-driven decisions using your data, in real-time. At any scale, with unbeatable performance. Vespa is a full-featured text search engine and supports both regular text search and fast approximate vector search (ANN). This makes it easy to create high-performing search applications at any scale, whether you want to use traditional techniques or a modern vector-based approach. You can even combine both approaches efficiently in the same query, something no other engine can do....
    Downloads: 2 This Week
  • 17
    Apache SeaTunnel

    SeaTunnel is a distributed, high-performance data integration platform

    SeaTunnel is a very easy-to-use, ultra-high-performance distributed data integration platform that supports real-time synchronization of massive data. It can synchronize tens of billions of data records stably and efficiently every day, and has been used in production by nearly 100 companies. There are hundreds of commonly used data sources whose versions are incompatible. With the emergence of new technologies, more data sources are appearing. It is difficult for users to find a tool that can...
    Downloads: 1 This Week
  • 18
    IoTDB

    Apache IoTDB

    Apache IoTDB (Database for Internet of Things) is an IoT native database with high performance for data management and analysis, deployable on the edge and the cloud. Due to its light-weight architecture, high performance and rich feature set together with its deep integration with Apache Hadoop, Spark and Flink, Apache IoTDB can meet the requirements of massive data storage, high-speed data ingestion and complex data analysis in the IoT industrial fields. In the scene of factories, there...
    Downloads: 3 This Week
  • 19
    Apache InLong

    Apache InLong - a one-stop integration framework for massive data

    Apache InLong is a one-stop integration framework for massive data that provides automatic, secure and reliable data transmission capabilities. InLong supports both batch and stream data processing at the same time, which offers great power to build data analysis, modeling and other real-time applications based on streaming data. InLong (应龙) is a divine beast in Chinese mythology who guides the river into the sea, and it is regarded as a metaphor of the InLong system for reporting data...
    Downloads: 0 This Week
  • 20
    PULSAR

    Distributed pub-sub messaging system

    ...Supports isolation, authentication, authorization and quotas. Persistent message storage based on Apache BookKeeper. IO-level isolation between write and read operations. Flexible messaging models with high-level APIs for Java, Go, Python, C++, Node.js, WebSocket and C#.
    Downloads: 0 This Week
  • 21
    Java Treeview - An Open Source, Extensible Viewer for Microarray Data in the PCL or CDT format
    Downloads: 40 This Week
  • 22
    Genie

    Distributed Big Data Orchestration Service

    Genie is a completely open source distributed job orchestration engine developed by Netflix. Genie provides RESTful APIs to run a variety of big data jobs like Hadoop, Pig, Hive, Spark, Presto, Sqoop and more. It also provides APIs for managing the metadata of many distributed processing clusters and the commands and applications which run on them.
    Downloads: 0 This Week
  • 23
    Alluxio

    Open Source Data Orchestration for the Cloud

    Alluxio is the world’s first open source data orchestration technology for analytics and AI for the cloud. It bridges the gap between computation frameworks and storage systems, bringing data from the storage tier closer to the data driven applications. This enables applications to connect to numerous storage systems through a common interface. It makes data local, more accessible and as elastic as compute.
    Downloads: 0 This Week
  • 24
    Qualitis

    Qualitis is a one-stop data quality management platform

    Qualitis is a data quality management platform that supports quality verification, notification, and management for various data sources. It is used to solve various data quality problems caused by data processing. Based on Spring Boot, Qualitis submits quality model tasks to the Linkis platform. It provides functions such as data quality model construction, data quality model execution, data quality verification, data quality report generation and so on. At the same time, Qualitis provides...
    Downloads: 0 This Week
  • 25
    DQO Data Quality Operations Center

    Data Quality Operations Center

    DQO is a DataOps-friendly data quality monitoring tool with customizable data quality checks and data quality dashboards. DQO comes with around 100 predefined data quality checks which help you monitor the quality of your data. Table and column-level checks which allow writing your own SQL queries. Daily and monthly date partition testing. Data segmentation by up to 9 different data streams. Built-in scheduling. Calculation of data quality KPIs which can be displayed on multiple built-in...
    Downloads: 0 This Week