A unified analytics engine for large-scale data processing
Spark-TTS Inference Code
State of the Art Natural Language Processing
Apache Spark to Apache Cassandra connector
Docker image used to run data processing workloads
A free, open-source, and cross-platform big data analytics framework
Apache Kyuubi is a distributed and multi-tenant gateway
Jupyter magics and kernels for working with remote Spark clusters
A unified interface for distributed computing
Simple and distributed Machine Learning
R interface for Apache Spark
A drop-in Apache Spark replacement written in Rust
Deequ is a library built on top of Apache Spark
Command-line tool from the Alire project and supporting library
A Spark library for Amazon SageMaker
A Scala kernel for Jupyter
Cluster computing framework for processing large-scale geospatial data
Distributed DataFrame for Python designed for the cloud
An end-to-end, real-time, cloud-native Lakehouse framework
Scalable and Flexible Gradient Boosting
A Cloud Native Batch System (Project under CNCF)
Python Stream Processing
Apache Iceberg
Monitor the stability of a Pandas or Spark dataframe
Apache IoTDB