Alternatives to Apache Axiom

Compare Apache Axiom alternatives for your business or organization using the curated list below. SourceForge ranks the best alternatives to Apache Axiom in 2026. Compare features, ratings, user reviews, pricing, and more from Apache Axiom competitors and alternatives in order to make an informed decision for your business.

  • 1
    Apache Santuario

    The Apache Software Foundation

    Apache XML Security for Java: This library includes the standard JSR-105 (Java XML Digital Signature) API, a mature DOM-based implementation of both XML Signature and XML Encryption, as well as a more recent StAX-based (streaming) XML Signature and XML Encryption implementation. Recent additions include the ability to set a security provider when using org.apache.xml.security.signature.XMLSignature and support for customizing how an InputStream is parsed into a DOM Document.
  • 2
    Apache Xerces

    The Apache Software Foundation

    Apache Xerces is a collaborative software development project dedicated to providing robust, full-featured, commercial-quality, and freely available XML parsers and closely related technologies on a wide variety of platforms supporting several languages. This project is managed in cooperation with various individuals worldwide (both independent and company-affiliated experts), who use the Internet to communicate, plan, and develop XML software and related documentation. Apache Xerces exists to promote the use of XML. We view XML as a compelling paradigm that structures data as information, thereby facilitating the exchange, transformation, and presentation of knowledge. The ability to transform raw data into usable information has great potential to improve the functionality and use of information systems. We intend to build freely available XML parsers and closely related technologies in order to engender such improvements.
  • 3
    Apache Anakia

    The Apache Software Foundation

    Anakia is potentially easier to learn than XSL, but it maintains a similar level of functionality. Learning cryptic <xsl:> tags is unnecessary; you only need to know how to use the provided Context objects, JDOM, and Velocity's simple directives. Anakia seems to perform much faster than Xalan's XSL processor at creating pages. (23 pages are generated in 7-8 seconds on a PIII 500 MHz running Win98 and JDK 1.3 with client Hotspot. A similar system using Ant's <style> task took 14-15 seconds, nearly a 2x speed improvement.) Anakia is intended to replace Stylebook, which was originally used to generate simple, static web sites in which all pages had the same look and feel. It is great for documentation/project web sites, such as the sites on www.apache.org and jakarta.apache.org. As it is more targeted to a specific purpose, it does not provide some of XSL's "extra" functionality.
  • 4
    Apache Xalan

    The Apache Software Foundation

    The Apache Xalan Project develops and maintains libraries and programs that transform XML documents using XSLT standard stylesheets. Our subprojects use the Java and C++ programming languages to implement the XSLT libraries. Xalan-Java 2.7.2 was released in April 2014. You can download the current Xalan-Java 2.7.2 release for your development. The current work in progress can be found in the Subversion repository. The current release fixes a security issue that was registered against version 2.7.1. The old Xalan-J 2.7.1 distributions are still available on the Apache Archives. This is a mature project. There has been some discussion about supporting XPath-2. We could use your support in this major rework of the library. You can follow the efforts and post your own contributions on the Java users and developers mailing lists.
  • 5
    Astra Streaming
    Responsive applications keep users engaged and developers inspired. Rise to meet these ever-increasing expectations with the DataStax Astra Streaming service platform. DataStax Astra Streaming is a cloud-native messaging and event streaming platform powered by Apache Pulsar, the next-generation event streaming platform which provides a unified solution for streaming, queuing, pub/sub, and stream processing. Astra Streaming allows you to build streaming applications on top of an elastically scalable, multi-cloud messaging and event streaming platform. Astra Streaming is a natural complement to Astra DB. Using Astra Streaming, existing Astra DB users can easily build real-time data pipelines into and out of their Astra DB instances. With Astra Streaming, you can avoid vendor lock-in and deploy on any of the major public clouds (AWS, GCP, Azure) while staying compatible with open-source Apache Pulsar.
  • 6
    Apache Spark

    Apache Software Foundation

    Apache Spark™ is a unified analytics engine for large-scale data processing. Apache Spark achieves high performance for both batch and streaming data, using a state-of-the-art DAG scheduler, a query optimizer, and a physical execution engine. Spark offers over 80 high-level operators that make it easy to build parallel apps. And you can use it interactively from the Scala, Python, R, and SQL shells. Spark powers a stack of libraries including SQL and DataFrames, MLlib for machine learning, GraphX, and Spark Streaming. You can combine these libraries seamlessly in the same application. Spark runs on Hadoop, Apache Mesos, Kubernetes, standalone, or in the cloud. It can access diverse data sources. You can run Spark using its standalone cluster mode, on EC2, on Hadoop YARN, on Mesos, or on Kubernetes. Access data in HDFS, Alluxio, Apache Cassandra, Apache HBase, Apache Hive, and hundreds of other data sources.
  • 7
    Apache Storm

    Apache Software Foundation

    Apache Storm is a free and open source distributed realtime computation system. Apache Storm makes it easy to reliably process unbounded streams of data, doing for realtime processing what Hadoop did for batch processing. Apache Storm is simple, can be used with any programming language, and is a lot of fun to use! Apache Storm has many use cases: realtime analytics, online machine learning, continuous computation, distributed RPC, ETL, and more. Apache Storm is fast: a benchmark clocked it at over a million tuples processed per second per node. It is scalable, fault-tolerant, guarantees your data will be processed, and is easy to set up and operate. Apache Storm integrates with the queueing and database technologies you already use. An Apache Storm topology consumes streams of data and processes those streams in arbitrarily complex ways, repartitioning the streams between each stage of the computation however needed. Read more in the tutorial.
  • 8
    Amazon MSK
    Amazon Managed Streaming for Apache Kafka (Amazon MSK) is a fully managed service that makes it easy for you to build and run applications that use Apache Kafka to process streaming data. Apache Kafka is an open-source platform for building real-time streaming data pipelines and applications. With Amazon MSK, you can use native Apache Kafka APIs to populate data lakes, stream changes to and from databases, and power machine learning and analytics applications. Apache Kafka clusters are challenging to set up, scale, and manage in production. When you run Apache Kafka on your own, you need to provision servers, configure Apache Kafka manually, replace servers when they fail, orchestrate server patches and upgrades, architect the cluster for high availability, ensure data is durably stored and secured, set up monitoring and alarms, and carefully plan scaling events to support load changes.
    Starting Price: $0.0543 per hour
  • 9
    Apache Gump

    Apache Software Foundation

    The Apache Gump continuous integration tool was the first one developed at the Apache Software Foundation. It is written in Python and fully supports Apache Ant, Apache Maven (1.x to 3.x) and other build tools. Gump is unique in that it builds and compiles software against the latest development versions of those projects. This allows Gump to detect potentially incompatible changes to that software just a few hours after those changes are checked into the version control system. Notifications are sent to the project team as soon as such a change is detected, referencing more detailed reports available online. You can set up Gump on your own machine and run it on your own projects; however, it is currently most famous for building many of Apache's projects and their dependencies. For this purpose, the Gump project maintains its own dedicated server.
  • 10
    Apache Sentry

    Apache Software Foundation

    Apache Sentry™ is a system for enforcing fine-grained, role-based authorization to data and metadata stored on a Hadoop cluster. Apache Sentry graduated from the Incubator in March 2016 and is now a Top-Level Apache project. Sentry provides the ability to control and enforce precise levels of privileges on data for authenticated users and applications on a Hadoop cluster. Sentry currently works out of the box with Apache Hive, Hive Metastore/HCatalog, Apache Solr, Impala and HDFS (limited to Hive table data). Sentry is designed to be a pluggable authorization engine for Hadoop components. It allows you to define authorization rules to validate a user or application’s access requests for Hadoop resources. Sentry is highly modular and can support authorization for a wide variety of data models in Hadoop.
  • 11
    Amazon EMR
    Amazon EMR is the industry-leading cloud big data platform for processing vast amounts of data using open-source tools such as Apache Spark, Apache Hive, Apache HBase, Apache Flink, Apache Hudi, and Presto. With EMR you can run petabyte-scale analysis at less than half of the cost of traditional on-premises solutions and over 3x faster than standard Apache Spark. For short-running jobs, you can spin up and spin down clusters and pay per second for the instances used. For long-running workloads, you can create highly available clusters that automatically scale to meet demand. If you have existing on-premises deployments of open-source tools such as Apache Spark and Apache Hive, you can also run EMR clusters on AWS Outposts. Analyze data using open-source ML frameworks such as Apache Spark MLlib, TensorFlow, and Apache MXNet. Connect to Amazon SageMaker Studio for large-scale model training, analysis, and reporting.
  • 12
    Apache ServiceMix

    Apache Software Foundation

    Apache ServiceMix is a flexible, open-source integration container that unifies the features and functionality of Apache ActiveMQ, Camel, CXF, and Karaf into a powerful runtime platform you can use to build your own integration solutions. It provides a complete, enterprise-ready ESB exclusively powered by OSGi. Reliable messaging with Apache ActiveMQ. Messaging, routing and Enterprise Integration Patterns with Apache Camel. WS and RESTful web services with Apache CXF. OSGi-based server runtime powered by Apache Karaf. BPM engine via Activiti. Full JPA support via Apache OpenJPA. XA transaction management via JTA, provided by Apache Aries. Legacy support for the JBI standard (deprecated after the ServiceMix 3.x series) through the Apache ServiceMix NMR that includes a rich Event, Messaging and Audit API. Applications for ServiceMix can be built using OSGi Blueprint, OSGi Declarative Services, and Spring DM (legacy).
  • 13
    SelectDB

    SelectDB

    SelectDB is a modern data warehouse based on Apache Doris that supports fast query analysis on large-scale real-time data. In one migration from ClickHouse to Apache Doris, Kuaishou moved from a separated lake-and-warehouse architecture to a unified lakehouse. Its OLAP system handles nearly 1 billion query requests every day, providing data services for multiple scenarios. Because the original separated architecture suffered from storage redundancy, resource contention, complicated governance, and queries that were difficult to tune, the team introduced an Apache Doris lakehouse, combined with Doris's materialized-view rewriting and automated services, to achieve high-performance queries and flexible data governance. Write real-time data in seconds and synchronize streaming data from databases and data streams. The storage engine supports real-time updates, real-time appends, and real-time pre-aggregation.
    Starting Price: $0.22 per hour
  • 14
    Red Hat OpenShift Streams
    Red Hat® OpenShift® Streams for Apache Kafka is a managed cloud service that provides a streamlined developer experience for building, deploying, and scaling new cloud-native applications or modernizing existing systems. Red Hat OpenShift Streams for Apache Kafka makes it easy to create, discover, and connect to real-time data streams no matter where they are deployed. Streams are a key component for delivering event-driven and data analytics applications. The combination of seamless operations across distributed microservices, large data transfer volumes, and managed operations allows teams to focus on team strengths, speed up time to value, and lower operational costs. OpenShift Streams for Apache Kafka includes a Kafka ecosystem and is part of a family of cloud services—and the Red Hat OpenShift product family—which helps you build a wide range of data-driven solutions.
  • 15
    WarpStream

    WarpStream

    WarpStream is an Apache Kafka-compatible data streaming platform built directly on top of object storage. It has no inter-AZ networking costs and no disks to manage, is infinitely scalable, and runs entirely within your VPC. WarpStream is deployed as a stateless and auto-scaling agent binary in your VPC with no local disks to manage. Agents stream data directly to and from object storage with no buffering on local disks and no data tiering. Create new “virtual clusters” in our control plane instantly. Support different environments, teams, or projects without managing any dedicated infrastructure. WarpStream is protocol compatible with Apache Kafka, so you can keep using all your favorite tools and software. No need to rewrite your application or use a proprietary SDK. Just change the URL in your favorite Kafka client library and start streaming. You never again have to choose between reliability and your budget.
    Starting Price: $2,987 per month
  • 16
    Apache TomEE
    Apache TomEE, pronounced “Tommy”, is an all-Apache Jakarta EE 9.1 certified application server that extends Apache Tomcat and is assembled from a vanilla Apache Tomcat zip file. We start with Apache Tomcat, add our jars, and zip up the rest. The result is Tomcat plus EE features: TomEE. Stable and ready for production, Apache TomEE 8.0 implements Java EE 8/Jakarta EE 8, supports the javax namespace, and runs on Java 8 or higher. A newer release line is mostly Jakarta EE 9.1 web profile compliant, supports the new jakarta namespace, and runs on Java 11 or higher. Apache TomEE comes in four different flavors: web profile, MicroProfile, Plus, and Plume. Apache TomEE web profile delivers servlets, JSP, JSF, JTA, JPA, CDI, bean validation and EJB Lite. Apache TomEE MicroProfile adds support for MicroProfile. Apache TomEE Plus and Plume add support for JMS, JAX-WS, and more.
  • 17
    JMeter

    Apache Software Foundation

    The Apache JMeter™ application is open source software, a 100% pure Java application designed to load test functional behavior and measure performance. It was originally designed for testing web applications but has since expanded to other test functions. Apache JMeter may be used to test performance on both static and dynamic resources, including dynamic web applications. It can be used to simulate a heavy load on a server, group of servers, network or object to test its strength or to analyze overall performance under different load types.
  • 18
    Apache Lucene

    Apache Software Foundation

    The Apache Lucene™ project develops open-source search software. The project releases a core search library, named Lucene™ Core, as well as PyLucene, a Python binding for Lucene Core. Lucene Core is a Java library providing powerful indexing and search features, as well as spellchecking, hit highlighting and advanced analysis/tokenization capabilities. The Apache Software Foundation provides support for the Apache community of open-source software projects. Apache Lucene is distributed under a commercially friendly Apache Software license. Apache Lucene set the standard for search and indexing performance. Lucene is the search core of both Apache Solr™ and Elasticsearch™. Our core algorithms along with the Solr search server power applications the world over, ranging from mobile devices to sites like Twitter, Apple and Wikipedia. The goal of Apache Lucene is to provide world class search capabilities.
  • 19
    Apache Synapse

    Apache Software Foundation

    Apache Synapse is a lightweight and high-performance Enterprise Service Bus (ESB). Powered by a fast and asynchronous mediation engine, Apache Synapse provides exceptional support for XML, Web Services and REST. In addition to XML and SOAP, Apache Synapse supports several other content interchange formats, such as plain text, binary, Hessian and JSON. The wide range of transport adapters available for Synapse enables it to communicate over many application and transport layer protocols. As of now, Apache Synapse supports HTTP/S, Mail (POP3, IMAP, SMTP), JMS, TCP, UDP, VFS, SMS, XMPP and FIX. Its high-performing PassThrough HTTP transport supports all mediation scenarios, offering ultra-fast, low-latency mediation of HTTP requests and supporting a very large number of concurrent inbound (client -> ESB) and outbound (ESB -> server) connections. Message content is handled intelligently, with content awareness built into the engine and a shared buffer for handling data.
  • 20
    Conduktor

    Conduktor

    We created Conduktor, the all-in-one friendly interface to work with the Apache Kafka ecosystem. With Conduktor DevTools, the all-in-one Apache Kafka desktop client, you can develop and manage Apache Kafka with confidence and save time for your entire team. Apache Kafka is hard to learn and to use. Made by Kafka lovers, Conduktor's best-in-class user experience is loved by developers. Conduktor offers more than just an interface over Apache Kafka: it gives you and your teams control of your whole data pipeline, thanks to our integration with most technologies around Apache Kafka, and provides the most complete tool on top of Apache Kafka.
  • 21
    Apache APISIX

    Apache APISIX

    Apache APISIX provides rich traffic management features such as load balancing, dynamic upstream, canary release, circuit breaking, authentication, and observability. Apache APISIX is an open-source API gateway that helps you manage microservices, delivering a high-performance, secure, and scalable platform for all your APIs and microservices. Apache APISIX is the first open-source API Gateway that includes a built-in low-code Dashboard, which offers a powerful and flexible UI for developers to use. The Apache APISIX Dashboard is designed to make it as easy as possible for users to operate Apache APISIX through a frontend interface. It is open source and ever evolving, so feel free to contribute. The dashboard adapts to user demand, providing an option to create custom modules through code that matches your requirements, alongside the existing no-code toolchain.
  • 22
    Apache ServiceComb
    Apache ServiceComb is an open-source, full-stack microservice solution that works out of the box, delivers high performance, is compatible with popular ecosystems, and supports multiple languages. Service contracts are guaranteed based on OpenAPI. One-click scaffolding speeds up the building of microservice applications. Ecosystem extensions support multiple development languages such as Java, Golang, PHP, and Node.js. ServiceComb consists of multiple components that can be combined flexibly to suit different scenarios. The project's guide helps you get started quickly and is the best place to begin for first-time users. The programming and communication models are decoupled so that a programming model can be combined with any communication model as needed. Application developers only need to focus on APIs during development and can flexibly switch communication models during deployment.
  • 23
    HugeGraph

    HugeGraph

    HugeGraph is a fast and highly scalable graph database. Billions of vertices and edges can be easily stored in and queried from HugeGraph thanks to its excellent OLTP ability. Because it complies with the Apache TinkerPop 3 framework, various complicated graph queries can be accomplished through Gremlin (a powerful graph traversal language). Its features include schema metadata management (VertexLabel, EdgeLabel, PropertyKey, and IndexLabel); multi-type indexes supporting exact queries, range queries, and complex combined-condition queries; a plug-in backend store driver framework that currently supports RocksDB, Cassandra, ScyllaDB, HBase, and MySQL and makes it easy to add other backend store drivers; and integration with Hadoop/Spark. HugeGraph relies on the TinkerPop framework and draws on the storage structure of Titan and the schema definition of DataStax.
  • 24
    Apache Beam

    Apache Software Foundation

    The easiest way to do batch and streaming data processing. Write once, run anywhere data processing for mission-critical production workloads. Beam reads your data from a diverse set of supported sources, no matter if it’s on-prem or in the cloud. Beam executes your business logic for both batch and streaming use cases. Beam writes the results of your data processing logic to the most popular data sinks in the industry. A simplified, single programming model for both batch and streaming use cases for every member of your data and application teams. Apache Beam is extensible, with projects such as TensorFlow Extended and Apache Hop built on top of Apache Beam. Execute pipelines on multiple execution environments (runners), providing flexibility and avoiding lock-in. Open, community-based development and support to help evolve your application and meet the needs of your specific use cases.
  • 25
    Amazon Managed Service for Apache Flink
    Thousands of customers use Amazon Managed Service for Apache Flink to run stream processing applications. With Amazon Managed Service for Apache Flink, you can transform and analyze streaming data in real-time using Apache Flink and integrate applications with other AWS services. There are no servers and clusters to manage, and there is no computing and storage infrastructure to set up. You pay only for the resources you use. Build and run Apache Flink applications, without setting up infrastructure and managing resources and clusters. Process gigabytes of data per second with subsecond latencies and respond to events in real-time. Deploy highly available and durable applications with Multi-AZ deployments and APIs for application lifecycle management. Develop applications that transform and deliver data to Amazon Simple Storage Service (Amazon S3), Amazon OpenSearch Service, and more.
    Starting Price: $0.11 per hour
  • 26
    Apache Hive

    Apache Software Foundation

    The Apache Hive data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storage using SQL. Structure can be projected onto data already in storage. A command line tool and JDBC driver are provided to connect users to Hive. Apache Hive is an open source project run by volunteers at the Apache Software Foundation. Previously it was a subproject of Apache® Hadoop®, but has now graduated to become a top-level project of its own. We encourage you to learn about the project and contribute your expertise. Traditional SQL queries must be implemented in the MapReduce Java API to execute SQL applications and queries over distributed data. Hive provides the necessary SQL abstraction to integrate SQL-like queries (HiveQL) into the underlying Java without the need to implement queries in the low-level Java API.
  • 27
    E-MapReduce
    EMR is an all-in-one enterprise-ready big data platform that provides cluster, job, and data management services based on open-source ecosystems, such as Hadoop, Spark, Kafka, Flink, and Storm. Alibaba Cloud Elastic MapReduce (EMR) is a big data processing solution that runs on the Alibaba Cloud platform. EMR is built on Alibaba Cloud ECS instances and is based on open-source Apache Hadoop and Apache Spark. EMR allows you to use the Hadoop and Spark ecosystem components, such as Apache Hive, Apache Kafka, Flink, Druid, and TensorFlow, to analyze and process data. You can use EMR to process data stored on different Alibaba Cloud data storage services, such as Object Storage Service (OSS), Log Service (SLS), and Relational Database Service (RDS). You can quickly create clusters without the need to configure hardware and software. All maintenance operations are completed on its web interface.
  • 28
    HtmlUnit

    HtmlUnit

    HtmlUnit is a "GUI-Less browser for Java programs" that models HTML documents and provides an API to interact with web pages, such as invoking pages, filling out forms, and clicking links, similar to a standard web browser. It offers fairly good JavaScript support, which is constantly improving and is capable of handling complex AJAX libraries, simulating browsers like Chrome, Firefox, or Edge depending on the configuration used. Typically used for testing purposes or retrieving information from websites, HtmlUnit is not a generic unit testing framework but is intended to simulate a browser within another testing framework such as JUnit or TestNG. It is utilized as the underlying "browser" by various open source tools like WebDriver, Arquillian Drone, and Serenity BDD, and is employed by many projects for automated web testing, including Apache Shiro, Apache Struts, and Quarkus.
  • 29
    Amazon MWAA
    Amazon Managed Workflows for Apache Airflow (MWAA) is a managed orchestration service for Apache Airflow that makes it easier to set up and operate end-to-end data pipelines in the cloud at scale. Apache Airflow is an open-source tool used to programmatically author, schedule, and monitor sequences of processes and tasks referred to as “workflows.” With Managed Workflows, you can use Airflow and Python to create workflows without having to manage the underlying infrastructure for scalability, availability, and security. Managed Workflows automatically scales its workflow execution capacity to meet your needs, and is integrated with AWS security services to help provide you with fast and secure access to data.
    Starting Price: $0.49 per hour
  • 30
    ApacheBooster

    NdimensionZ

    ApacheBooster is designed specifically to improve the performance of cPanel-based web servers. As the name suggests, ApacheBooster boosts the Apache web server, which, according to the census, is the most used web server in the world. ApacheBooster combines Nginx and Varnish to make it more efficient. Nginx is a high-performance web server that speeds up request handling. Its best feature is speed, particularly in retrieving static files, and it also saves memory by using less of it to process concurrent requests. It handles traffic very efficiently: because it uses less memory, it can serve more requests and clients than Apache. Nginx is an open-source reverse proxy server that balances load intelligently, as well as a web server and a web cache (also known as an HTTP cache).
  • 31
    Apache Accumulo

    Apache Software Foundation

    With Apache Accumulo, users can store and manage large data sets across a cluster. Accumulo uses Apache Hadoop's HDFS to store its data and Apache ZooKeeper for consensus. While many users interact directly with Accumulo, several open source projects use Accumulo as their underlying store. To learn more about Accumulo, take the Accumulo tour, read the user manual and run the Accumulo example code. Feel free to contact us if you have any questions. Accumulo has a programming mechanism (called Iterators) that can modify key/value pairs at various points in the data management process. Every Accumulo key/value pair has its own security label, which limits query results based on user authorizations. Accumulo runs on a cluster using one or more HDFS instances. Nodes can be added or removed as the amount of data stored in Accumulo changes.
  • 32
    MLlib

    Apache Software Foundation

    Apache Spark's MLlib is a scalable machine learning library that integrates seamlessly with Spark's APIs, supporting Java, Scala, Python, and R. It offers a comprehensive suite of algorithms and utilities, including classification, regression, clustering, collaborative filtering, and tools for constructing machine learning pipelines. MLlib's high-quality algorithms leverage Spark's iterative computation capabilities, delivering performance up to 100 times faster than traditional MapReduce implementations. It is designed to operate across diverse environments, running on Hadoop, Apache Mesos, Kubernetes, standalone clusters, or in the cloud, and accessing various data sources such as HDFS, HBase, and local files. This flexibility makes MLlib a robust solution for scalable and efficient machine learning tasks within the Apache Spark ecosystem.
  • 33
    ODFToEPub

    Pincette

    With ODFToEPub, everyone can write an e-book with full control over how it will look. All you need is a word processor that can produce a document in a format suitable for Apache OpenOffice or LibreOffice. That is, of course, Apache OpenOffice and LibreOffice, but also Microsoft Word, iWork, WordPerfect, Zoho, Google Docs, etc. In Apache OpenOffice and LibreOffice, you choose the export function to convert an ODT file to ePub. With this tool, self-publishers have immediate feedback about what their e-book will look like. Publishers can give their authors a standard template and build the tool into their systems in order to streamline their ePub production process. Companies can cut down printing by publishing their internal documents as e-books. ODFToEPub is an Apache OpenOffice and LibreOffice extension as well as a stand-alone program. After receiving the license.xml file by e-mail, you should save it on your computer and install it.
    Starting Price: $52.00/one-time/user
  • 34
    Apache James

    The Apache Software Foundation

    James stands for Java Apache Mail Enterprise Server. It has a modular architecture based on a rich set of modern and efficient components, which together provide complete, stable, secure, and extendable mail servers running on the JVM. Create your own email processing solution by assembling the components you need, thanks to the Inversion of Control mail platform offered, and go further by customizing filtering and routing rules using the James Mailet Container. The Apache James project wires together the different libraries composing James to provide running services, ready to download on the Apache mirrors.
  • 35
    Apache Geronimo
    Apache Geronimo is an open-source set of projects focused on providing Java EE/Jakarta EE libraries and MicroProfile implementations. We actively deliver reusable Java EE components, which are widely used and still actively maintained. Apache Geronimo provides libraries for the implementations of the Java EE and Jakarta EE specifications. The implementations are also focused on providing OSGi bundle metadata. The goal of the XBean project is to create a plugin-based server, analogous to the way Eclipse is a plugin-based IDE. XBean will be able to discover, download, and install server plugins from an Internet-based repository. In addition, we include support for multiple IoC systems, support for running with no IoC system, JMX without JMX code, lifecycle and class loader management, and rock-solid Spring integration. Apache Geronimo hosts several MicroProfile implementations. Apache Geronimo Arthur is an effort to build a thin layer on top of Oracle GraalVM.
  • 36
    PDFBox

    PDFBox

    Apache Software Foundation

    The Apache PDFBox® library is an open-source Java tool for working with PDF documents. This project allows the creation of new PDF documents, manipulation of existing documents and the ability to extract content from documents. Apache PDFBox also includes several command-line utilities. Apache PDFBox is published under the Apache License v2.0. Extract Unicode text from PDF files. Split a single PDF into many files or merge multiple PDF files. Extract data from PDF forms or fill a PDF form. Validate PDF files against the PDF/A-1b standard. Print a PDF file using the standard Java printing API. Create a PDF from scratch, with embedded fonts and images. Save PDFs as image files, such as PNG or JPEG and digitally sign PDF files. See also the export control information related to the encryption features included in Apache PDFBox.
  • 37
    MXNet

    MXNet

    The Apache Software Foundation

    A hybrid front-end seamlessly transitions between Gluon eager imperative mode and symbolic mode to provide both flexibility and speed. Scalable distributed training and performance optimization in research and production is enabled by the dual parameter server and Horovod support. Deep integration into Python and support for Scala, Julia, Clojure, Java, C++, R and Perl. A thriving ecosystem of tools and libraries extends MXNet and enables use-cases in computer vision, NLP, time series and more. Apache MXNet is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the Apache Incubator. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision-making process have stabilized in a manner consistent with other successful ASF projects. Join the MXNet scientific community to contribute, learn, and get answers to your questions.
  • 38
    Apache PredictionIO
    Apache PredictionIO® is an open-source machine learning server built on top of a state-of-the-art open-source stack for developers and data scientists to create predictive engines for any machine learning task. It lets you quickly build and deploy an engine as a web service in production with customizable templates. Respond to dynamic queries in real time once deployed as a web service, evaluate and tune multiple engine variants systematically, and unify data from multiple platforms in batch or in real time for comprehensive predictive analytics. Speed up machine learning modeling with systematic processes and pre-built evaluation measures. It supports machine learning and data processing libraries such as Spark MLlib and OpenNLP. Implement your own machine learning models and seamlessly incorporate them into your engine. Simplify data infrastructure management. Apache PredictionIO® can be installed as a full machine learning stack, bundled with Apache Spark, MLlib, HBase, Akka HTTP, etc.
  • 39
    Apache Subversion

    Apache Subversion

    Apache Software Foundation

    Welcome to subversion, the online home of the Apache® Subversion® software project. Subversion is an open-source version control system. Founded in 2000 by CollabNet, Inc., the Subversion project and software have seen incredible success over the past decade. Subversion has enjoyed and continues to enjoy widespread adoption in both the open-source arena and the corporate world. Subversion is developed as a project of the Apache Software Foundation, and as such is part of a rich community of developers and users. We're always in need of individuals with a wide range of skills, and we invite you to participate in the development of Apache Subversion. Subversion exists to be universally recognized and adopted as an open-source, centralized version control system characterized by its reliability as a safe haven for valuable data; the simplicity of its model and usage; and its ability to support the needs of a wide variety of users and projects.
  • 40
    Apache Pulsar

    Apache Pulsar

    Apache Software Foundation

    Apache Pulsar is a cloud-native, distributed messaging and streaming platform originally created at Yahoo! and now a top-level Apache Software Foundation project. Easy to deploy, lightweight compute process, developer-friendly APIs, no need to run your own stream processing engine. Run in production at Yahoo! scale for over 5 years, with millions of messages per second across millions of topics. Built from the ground up as a multi-tenant system. Supports isolation, authentication, authorization, and quotas. Configurable replication between data centers across multiple geographic regions. Persistent message storage based on Apache BookKeeper. IO-level isolation between write and read operations. REST admin API for provisioning, administration, tools, and monitoring.
  • 41
    Apache Tomcat
    The Apache Tomcat® software is an open source implementation of the Jakarta Servlet, Jakarta Server Pages, Jakarta Expression Language, Jakarta WebSocket, Jakarta Annotations and Jakarta Authentication specifications. These specifications are part of the Jakarta EE platform. Apache Tomcat software powers numerous large-scale, mission-critical web applications across a diverse range of industries and organizations. Some of these users and their stories are listed on the PoweredBy wiki page. The Apache Tomcat Project is proud to announce the release of version 10.0.10 of Apache Tomcat. This release implements specifications that are part of the Jakarta EE 9 platform.
  • 42
    Apache Giraph

    Apache Giraph

    Apache Software Foundation

    Apache Giraph is an iterative graph processing system built for high scalability. For example, it is currently used at Facebook to analyze the social graph formed by users and their connections. Giraph originated as the open-source counterpart to Pregel, the graph processing architecture developed at Google and described in a 2010 paper. Both systems are inspired by the Bulk Synchronous Parallel model of distributed computation introduced by Leslie Valiant. Giraph adds several features beyond the basic Pregel model, including master computation, sharded aggregators, edge-oriented input, out-of-core computation, and more. With a steady development cycle and a growing community of users worldwide, Giraph is a natural choice for unleashing the potential of structured datasets at a massive scale. Apache Giraph is an iterative graph processing framework, built on top of Apache Hadoop.
  • 43
    Google Cloud Managed Service for Apache Spark
    Managed Service for Apache Spark is a Google Cloud solution that simplifies running Apache Spark workloads with either serverless execution or fully managed clusters. It allows users to process large-scale data without needing to manage infrastructure, reducing operational complexity. The platform features Lightning Engine, which accelerates Spark performance by up to 4.9 times compared to open-source Spark. It supports data engineering, data science, and machine learning workflows at scale. Integration with Gemini enables AI-powered development, including automated code generation and troubleshooting. The service works seamlessly with open data formats like Apache Iceberg and integrates with tools like BigQuery and Knowledge Catalog. It offers flexible deployment options to suit different workloads and use cases. Overall, it provides a faster, smarter, and more efficient way to run Spark workloads in the cloud.
  • 44
    Apache Mahout

    Apache Mahout

    Apache Software Foundation

    Apache Mahout is a powerful, scalable, and versatile machine learning library designed for distributed data processing. It offers a comprehensive set of algorithms for various tasks, including classification, clustering, recommendation, and pattern mining. Built on top of the Apache Hadoop ecosystem, Mahout leverages MapReduce and Spark to enable data processing on large-scale datasets. Apache Mahout(TM) is a distributed linear algebra framework and mathematically expressive Scala DSL designed to let mathematicians, statisticians, and data scientists quickly implement their own algorithms. Apache Spark is the recommended out-of-the-box distributed back-end or can be extended to other distributed backends. Matrix computations are a fundamental part of many scientific and engineering applications, including machine learning, computer vision, and data analysis. Apache Mahout is designed to handle large-scale data processing by leveraging the power of Hadoop and Spark.
  • 45
    Apache Derby
    Apache Derby, an Apache DB subproject, is an open source relational database implemented entirely in Java and available under the Apache License, Version 2.0. Derby has a small footprint - about 3.5 megabytes for the base engine and embedded JDBC driver. Derby provides an embedded JDBC driver that lets you embed Derby in any Java-based solution. Derby also supports the more familiar client/server mode with the Derby Network Client JDBC driver and Derby Network Server.
  • 46
    Apache Ivy

    Apache Ivy

    Apache Software Foundation

    Apache Ivy™ is a popular dependency manager focusing on flexibility and simplicity. Find out more about its unique enterprise features, what people say about it, and how it can improve your build system! Ivy is a tool for managing (recording, tracking, resolving, and reporting) project dependencies. Ivy is essentially process agnostic and is not tied to any methodology or structure. Instead, it provides the necessary flexibility and reconfigurability to be adapted to a broad range of dependency management and build processes. While available as a standalone tool, Ivy works particularly well with Apache Ant providing a number of powerful Ant tasks ranging from dependency resolution to dependency reporting and publication. Ivy has a lot of powerful features, the most popular and useful being its flexibility, integration with Ant, and strong transitive dependencies management engine. Ivy is open source and released under a very permissive Apache License.
  • 47
    OpenMeetings

    OpenMeetings

    Apache Software Foundation

    OpenMeetings provides video conferencing, instant messaging, whiteboard, collaborative document editing, and other groupware tools. It uses the API functions of the Kurento media server for remoting and streaming. OpenMeetings is a project of The Apache Software Foundation. The old project website at GoogleCode will no longer receive updates; the website at Apache is the only place that does.
  • 48
    Apache DataFusion

    Apache DataFusion

    Apache Software Foundation

    Apache DataFusion is an extensible, high-performance query engine written in Rust that utilizes Apache Arrow as its in-memory format. Designed for developers building data-centric systems such as databases, data frames, machine learning, and streaming applications, DataFusion offers SQL and DataFrame APIs, a vectorized, multi-threaded, streaming execution engine, and support for partitioned data sources. It natively supports formats like CSV, Parquet, JSON, and Avro, and allows for seamless integration with object stores including AWS S3, Azure Blob Storage, and Google Cloud Storage. The engine features a comprehensive query planner, a state-of-the-art optimizer with capabilities like expression coercion and simplification, projection and filter pushdown, sort and distribution-aware optimizations, and automatic join reordering. DataFusion is highly customizable, enabling the addition of user-defined scalar, aggregate, and window functions, custom data sources, query languages, etc.
  • 49
    Spark Streaming

    Spark Streaming

    Apache Software Foundation

    Spark Streaming brings Apache Spark's language-integrated API to stream processing, letting you write streaming jobs the same way you write batch jobs. It supports Java, Scala and Python. Spark Streaming recovers both lost work and operator state (e.g. sliding windows) out of the box, without any extra code on your part. By running on Spark, Spark Streaming lets you reuse the same code for batch processing, join streams against historical data, or run ad-hoc queries on stream state. Build powerful interactive applications, not just analytics. Spark Streaming is developed as part of Apache Spark. It thus gets tested and updated with each Spark release. You can run Spark Streaming on Spark's standalone cluster mode or other supported cluster resource managers. It also includes a local run mode for development. In production, Spark Streaming uses ZooKeeper and HDFS for high availability.
  • 50
    Azure Databricks
    Unlock insights from all your data and build artificial intelligence (AI) solutions with Azure Databricks. Set up your Apache Spark™ environment in minutes, autoscale, and collaborate on shared projects in an interactive workspace. Azure Databricks supports Python, Scala, R, Java, and SQL, as well as data science frameworks and libraries including TensorFlow, PyTorch, and scikit-learn. Azure Databricks provides the latest versions of Apache Spark and allows you to seamlessly integrate with open source libraries. Spin up clusters and build quickly in a fully managed Apache Spark environment with the global scale and availability of Azure. Clusters are set up, configured, and fine-tuned to ensure reliability and performance without the need for monitoring. Take advantage of autoscaling and auto-termination to improve total cost of ownership (TCO).