Apache IoTDB
R interface for Apache Spark
ETL framework to index data for AI, such as RAG
A data management tool that enables working with other SQL tools
Open Source Data Orchestration for the Cloud
pprof is a tool for visualization and analysis of profiling data
Pandas on AWS, easy integration with Athena, Glue, Redshift, etc.
Data visualization and analysis tool
AI-data warehouse to enrich, transform and analyze unstructured data
High-Performance Serverless event and data processing platform
Centralize, transform and stash your data
JuiceFS is a distributed POSIX file system built on top of Redis and S3
Vector database for scalable similarity search and AI applications
A multi-cloud framework for big data analytics
Docker image used to run data processing workloads
Streamline your ML workflow
Java dataframe and visualization library
Python module that helps you build complex pipelines of batch jobs
Metadata and data identification tool and Python library
Build concurrent, distributed, and resilient message-driven apps
A toolkit to run Ray applications on Kubernetes
Data Quality Operations Center
Koito is a modern, themeable scrobbler
LinDB is a scalable, high-performance, high-availability database
The open protocol for real-time sync to client applications