Showing 14 open source projects for "cluster"

  • 1
    Gatling

    Modern Load Testing as Code

    Gatling is a high-performance load testing tool built on the JVM that emphasizes realism, scalability, and developer ergonomics. Test scenarios are scripted in a concise Scala-based DSL, allowing you to model user journeys with think times, feeders (dynamic data), checks, and assertions all in code. Its asynchronous, non-blocking engine (backed by Netty) can drive very high concurrency from a single injector, reducing the need for large injector farms. Gatling supports HTTP out of the box as...
    Downloads: 8 This Week
  • 2
    Metarank

    A low code Machine Learning service that personalizes articles

    Metarank is a service that can personalize any type of content: product listings, articles, recommendations and search results in 3 easy steps with a few lines of code. It’s often considered "too risky" to spend 6+ months on an in-house moonshot project that reinvents the wheel, without an experienced team or existing open-source tools. Metarank makes personalization easy not only for Amazon but for everyone else. Ingest historical item listings, clicks and item metadata so Metarank...
    Downloads: 4 This Week
  • 3
    Akka

    Build concurrent, distributed, and resilient message-driven apps

    ...Small memory footprint; ~2.5 million actors per GB of heap. Distributed systems without single points of failure. Load balancing and adaptive routing across nodes. Event Sourcing and CQRS with Cluster Sharding. Distributed Data for eventual consistency using CRDTs. Asynchronous non-blocking stream processing with backpressure.
    Downloads: 1 This Week
  • 4
    SageMaker Spark

    A Spark library for Amazon SageMaker

    SageMaker Spark is an open-source Spark library for Amazon SageMaker. With SageMaker Spark you construct Spark ML Pipelines using Amazon SageMaker stages. These pipelines interleave native Spark ML stages and stages that interact with SageMaker training and model hosting. With SageMaker Spark, you can train on Amazon SageMaker from Spark DataFrames using Amazon-provided ML algorithms like K-Means clustering or XGBoost, and make predictions on DataFrames against SageMaker endpoints hosting...
    Downloads: 1 This Week
  • 5
    Apache Spark

    A unified analytics engine for large-scale data processing

    Apache Spark is a unified engine for large-scale data processing, offering APIs for batch jobs, streaming, machine learning, and graph computation. It builds on resilient distributed datasets (RDDs) and the newer DataFrame/Dataset abstractions to provide fault-tolerant, in-memory computation across clusters. Spark’s execution engine handles scheduling, shuffles, caching, and data locality so users can focus on transformations rather than infrastructure plumbing. With Spark Streaming...
    Downloads: 1 This Week
  • 6
    Synapse Machine Learning

    Simple and distributed Machine Learning

    ...With the HTTP on Spark project, users can embed any web service into their SparkML models. For production-grade deployment, the Spark Serving project enables high-throughput, sub-millisecond-latency web services, backed by your Spark cluster.
    Downloads: 0 This Week
  • 7
    FiloDB

    Distributed Prometheus time series database

    FiloDB is an open-source distributed, real-time, in-memory, massively scalable, multi-schema time series / event / operational database with Prometheus query support and some Spark support as well. The normal configuration for real-time ingestion is deployment as stand-alone processes in a cluster, ingesting directly from Apache Kafka. The processes form a cluster using peer-to-peer Akka Cluster technology. Designed to ingest many millions of entities, sharded across multiple processes, with distributed querying built in. Support for indexing and fast querying over flexible tags for each time series/partition, just like Prometheus. ...
    Downloads: 5 This Week
  • 8
    SnappyData

    Memory optimized analytics database, based on Apache Spark

    ...By fusing an in-memory hybrid database inside Apache Spark, it provides analytic query processing, mutability/transactions, access to virtually all big data sources, and stream processing, all in one unified cluster. One common use case for SnappyData is to provide analytics at interactive speeds over large volumes of data with minimal or no pre-processing of the dataset. For instance, there is often no need to pre-aggregate/reduce or generate cubes over your large data sets for ad-hoc visual analytics. This is made possible by smartly managing data in memory, dynamically generating code using vectorization optimizations, and maximizing the potential of modern multi-core CPUs. ...
    Downloads: 0 This Week
  • 9
    CMAK

    A tool for managing Apache Kafka clusters

    CMAK (previously known as Kafka Manager) is a tool for managing Apache Kafka clusters. Easy inspection of cluster state (topics, consumers, offsets, brokers, replica distribution, partition distribution). Generate partition assignments with the option to select which brokers to use. Run partition reassignment (based on generated assignments). Create a topic with optional topic configs (0.8.1.1 has different configs than 0.8.2+). Delete a topic (only supported on 0.8.2+; remember to set delete.topic.enable=true in the broker config). ...
    Downloads: 2 This Week
  • 10
    Spark JobServer

    REST job server for Apache Spark

    ...The architecture isolates Spark contexts (optionally in separate JVMs), isolates job dependencies, and persists job / jar metadata via pluggable DAOs. It supports deployment across cluster managers (YARN, Mesos, etc.) and aims to simplify Spark-as-a-service scenarios.
    Downloads: 0 This Week
  • 11
    Marathon

    Deploy and manage containers (including Docker) on top of Apache Mesos

    ...Marathon is a production-grade container orchestration platform for Mesosphere’s Datacenter Operating System (DC/OS) and Apache Mesos. Marathon has first-class support for both Mesos containers (using cgroups) and Docker. Marathon runs as an active/passive cluster with leader election for 100% uptime. Marathon can bind persistent storage volumes to your application. You can run databases like MySQL and Postgres, and have storage accounted for by Mesos. Supply an HTTP endpoint to receive notifications, for example to integrate with an external load balancer. Metrics can be queried at /metrics in JSON format, pushed to systems like Graphite, StatsD and DataDog, or scraped using Prometheus.
    Downloads: 1 This Week
  • 12
    Summingbird

    Streaming MapReduce with Scalding and Storm

    Summingbird is a hybrid streaming and batch computation framework developed by Twitter. Its aim is to let developers express data aggregation pipelines in a unified way, where the same logic can run either in real time (stream) or in batch mode, and the results can be merged or reconciled. In effect, Summingbird abstracts over multiple execution engines (such as Storm and Scalding) to provide one high-level program that composes transformations and aggregations, and then executes them in...
    Downloads: 0 This Week
  • 13

    Deem

    Analyze time-course data with significance tests, clustering, modeling

    Use statistical methods to analyze time-course data (in particular, but not limited to, gene expression microarray and RNA-seq data). Apply significance tests to keep only significant genes or time series. Cluster time series into similar groups. Generate network models, including linear or non-linear models. Variable selection and optimization routines included. Written in Scala and R. The application is a cross-platform desktop app with a simple GUI and is currently fully functional. The app was, and continues to be, developed at the University of Rochester (http://cbim.urmc.rochester.edu) under the GPL 3.0 license. ...
    Downloads: 0 This Week
  • 14
    Gizzard

    Framework for creating eventually-consistent distributed datastores

    ...In Gizzard, data is stored in underlying storage shards (which could be databases or other stores) and Gizzard handles the process of routing requests correctly as the cluster topology changes. Gizzard's architecture is designed for operational flexibility: you can change the shard layout over time, reassign replicas, migrate data between nodes, and have requests redirected during transitions. It also supports secondary indexing and provides hooks for custom logic in migrations and consistency. Because Gizzard handles much of the complexity of shard routing and cluster transitions, it was used to support large-scale, evolving storage backends in production.
    Downloads: 0 This Week