Showing 102 open source projects for "spark gap linux"

  • 1
    Apache Spark

    A unified analytics engine for large-scale data processing

    Apache Spark is a unified engine for large-scale data processing, offering APIs for batch jobs, streaming, machine learning, and graph computation. It builds on resilient distributed datasets (RDDs) and the newer DataFrame/Dataset abstractions to provide fault-tolerant, in-memory computation across clusters. Spark’s execution engine handles scheduling, shuffles, caching, and data locality so users can focus on transformations rather than infrastructure plumbing. With Spark Streaming...
    Downloads: 3 This Week
  • 2
    .NET for Apache Spark

    A free, open-source, and cross-platform big data analytics framework

    ...This means you can use .NET for Apache Spark anywhere you write .NET code, allowing you to reuse all the knowledge, skills, code, and libraries you already have as a .NET developer. .NET for Apache Spark runs on Windows, Linux, and macOS using .NET Core, or on Windows using .NET Framework. It also runs on all major cloud providers, including Azure HDInsight Spark, Amazon EMR Spark, and AWS & Azure Databricks.
    Downloads: 0 This Week
  • 3
    SageMaker Spark Container

    Docker image used to run data processing workloads

    Apache Spark™ is a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools including Spark SQL for SQL and DataFrames, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for stream processing. The SageMaker Spark Container is a Docker image used to run batch data...
    Downloads: 0 This Week
  • 4
    SageMaker Spark

    A Spark library for Amazon SageMaker

    SageMaker Spark is an open-source Spark library for Amazon SageMaker. With SageMaker Spark you construct Spark ML Pipelines using Amazon SageMaker stages. These pipelines interleave native Spark ML stages and stages that interact with SageMaker training and model hosting. With SageMaker Spark, you can train on Amazon SageMaker from Spark DataFrames using Amazon-provided ML algorithms like K-Means clustering or XGBoost, and make predictions on DataFrames against SageMaker endpoints hosting...
    Downloads: 0 This Week
  • 5
    Synapse Machine Learning

    Simple and distributed Machine Learning

    SynapseML (previously MMLSpark) is an open source library to simplify the creation of scalable machine learning pipelines. SynapseML builds on Apache Spark and SparkML to enable new kinds of machine learning, analytics, and model deployment workflows. SynapseML adds many deep learning and data science tools to the Spark ecosystem, including seamless integration of Spark Machine Learning pipelines with the Open Neural Network Exchange (ONNX), LightGBM, The Cognitive Services, Vowpal Wabbit,...
    Downloads: 0 This Week
  • 6
    Deequ

    Deequ is a library built on top of Apache Spark

    Deequ is a library built atop Apache Spark that enables defining “unit tests for data” — that is, formal constraints or checks on datasets to ensure data quality along dimensions such as completeness, uniqueness, value ranges, correlations, etc. It can scale to large datasets (billions of rows) by translating those data checks into Spark jobs. Deequ supports advanced features like a metrics repository for storing computed statistics over time, anomaly detection of data quality metrics, and...
    Downloads: 1 This Week
  • 7
    Serverless Java container

    A Java wrapper to run Spring, Spring Boot, Jersey, and other apps

    The AWS Serverless Java Container library is a framework that allows developers to run existing or new Java web applications—built with frameworks such as Spring, Jersey, Spark, Struts—inside AWS Lambda with minimal modifications. It bridges the gap between traditional servlet or web-framework models and serverless functions by mapping HTTP events from API Gateway into requests your framework understands and routing responses back appropriately. This means you can keep much of your familiar Java-based architecture (controllers, filters, dependency injection) and deploy it in a serverless environment without rewriting everything from scratch. ...
    Downloads: 0 This Week
  • 8
    Alire

    Command-line tool from the Alire project and supporting library

    Alire is a source-based package manager for the Ada and SPARK programming languages. It facilitates the building and sharing of projects within the Ada community, allowing developers to easily manage dependencies and publish their own libraries or programs. Alire aims to streamline the development process for Ada and SPARK by providing a standardized approach to package management.
    Downloads: 1 This Week
  • 9
    Apache Sedona

    Cluster computing framework for processing large-scale geospatial data

    Apache Sedona™ is a cluster computing system for processing large-scale spatial data. Sedona extends existing cluster computing systems, such as Apache Spark and Apache Flink, with a set of out-of-the-box distributed Spatial Datasets and Spatial SQL that efficiently load, process, and analyze large-scale spatial data across machines. According to our benchmark and third-party research papers, Sedona runs 2X - 10X faster than other Spark-based geospatial data systems on computation-intensive...
    Downloads: 0 This Week
  • 10
    XGBoost

    Scalable and Flexible Gradient Boosting

    XGBoost is an optimized distributed gradient boosting library, designed to be scalable, flexible, portable and highly efficient. It supports regression, classification, ranking and user defined objectives, and runs on all major operating systems and cloud platforms. XGBoost works by implementing machine learning algorithms under the Gradient Boosting framework. It also offers parallel tree boosting (GBDT, GBRT or GBM) that can quickly and accurately solve many data science problems....
    Downloads: 6 This Week
  • 11
    Volcano

    A Cloud Native Batch System (Project under CNCF)

    Volcano is a batch system built on Kubernetes. It provides a suite of mechanisms that are commonly required by many classes of batch & elastic workload including machine learning/deep learning, bioinformatics/genomics, and other "big data" applications. These types of applications typically run on generalized domain frameworks like TensorFlow, Spark, Ray, PyTorch, MPI, etc, which Volcano integrates with. Volcano builds upon a decade and a half of experience running a wide variety of...
    Downloads: 64 This Week
  • 12
    Laravel Lang

    List of 126 languages for Laravel Framework, Laravel Jetstream, etc.

    List of 126 languages for Laravel Framework, Laravel Jetstream, Laravel Fortify, Laravel Breeze, Laravel Cashier, Laravel Nova, Laravel Spark and Laravel UI. It is recommended to use this particular package as it will allow you to very quickly update all the necessary dependencies that ensure application localization.
    Downloads: 1 This Week
  • 13
    Soot

    Soot - A Java optimization framework

    Soot is a Java optimization framework. It provides four intermediate representations for analyzing and transforming Java bytecode. Baf: a streamlined representation of bytecode which is simple to manipulate. Jimple: a typed 3-address intermediate representation suitable for optimization. Shimple: an SSA variation of Jimple. Grimp: an aggregated version of Jimple suitable for decompilation and code inspection.
    Downloads: 1 This Week
  • 14
    go-chart

    go chart is a basic charting library in go

    Package chart is a very simple golang native charting library that supports time-series and continuous line charts. Master should now be on the v3.x codebase, which overhauls the API significantly. Per usual, see examples for more information. Actual chart configurations and examples can be found in the ./examples/ directory. They are simple CLI programs that write to output.png (they are also updated with go generate). Everything on the chart.Chart object has defaults that can be overridden....
    Downloads: 12 This Week
  • 15
    Three.js Skills for Claude Code

    Collection of Three.js skill files

    Three.js Skills for Claude Code repository is a curated collection of modular skills and educational code aimed at helping developers learn and apply Three.js, the popular JavaScript library for 3D graphics on the web. It groups foundational lessons, examples, and utilities that make it easier to set up 3D scenes, work with cameras, lighting, materials, shaders, and animation loops, and handle user interactions in a browser context. The project functions as a toolbox of practical snippets...
    Downloads: 14 This Week
  • 16
    JavaFamily

    Java Interview + Java Study Guide

    JavaFamily is a large educational repository that aggregates knowledge, tutorials, and resources related to Java development and backend engineering. It covers a wide range of topics including core Java, Spring framework, microservices, distributed systems, and performance optimization. The project is designed to help developers build a strong foundation while also exploring advanced concepts used in enterprise environments. It includes explanations, code samples, and curated resources that...
    Downloads: 5 This Week
  • 17
    Pomerium

    Pomerium is an identity and context-aware access proxy

    Secure, context-aware access that just works. Access internal resources securely. Implement zero trust. Achieve compliance. All without the headache of a VPN. For teams that prefer a hosted solution while keeping data governance. For organizations that need advanced scaling, access control, and governance capabilities. IT and developers need a scalable access control solution to keep users productive, happy, and secure. Pomerium uses identity and context to ensure secure access to internal...
    Downloads: 29 This Week
  • 18
    Apache Beam

    Unified programming model for Batch and Streaming

    Apache Beam is an open source, unified programming model to define both batch and streaming data-parallel processing pipelines, as well as certain language-specific SDKs for constructing pipelines and Runners. These pipelines are executed on one of Beam’s supported distributed processing back-ends, which include Apache Apex, Apache Flink, Apache Spark, and Google Cloud Dataflow. Beam is especially useful for Embarrassingly Parallel data processing tasks, and caters to the different needs...
    Downloads: 0 This Week
  • 19
    Numba

    NumPy aware dynamic Python compiler using LLVM

    Numba is an open source JIT compiler that translates a subset of Python and NumPy code into fast machine code. Numba translates Python functions to optimized machine code at runtime using the industry-standard LLVM compiler library. Numba-compiled numerical algorithms in Python can approach the speeds of C or FORTRAN. You don't need to replace the Python interpreter, run a separate compilation step, or even have a C/C++ compiler installed. Just apply one of the Numba decorators to your...
    Downloads: 5 This Week
  • 20
    Gherkin

    A parser and compiler for the Gherkin language

    Gherkin is a domain-specific language used in behavior-driven development (BDD) to describe software behaviors in a human-readable format. It allows stakeholders to write test cases in plain language, bridging the gap between technical and non-technical team members.
    Downloads: 0 This Week
  • 21
    Nushell

    A new type of shell

    Nushell (often shortened to “Nu”) is a modern, cross-platform shell written in Rust that treats all data as structured tables rather than plain text. It supports pipelines on rich typed data, has built-in commands for JSON, CSV, SQL, and Excel, and offers scripting, autocompletion, scoped variables, and strong error handling, bridging the gap between shell scripting and programming.
    Downloads: 2 This Week
  • 22
    Grida

    Open Source Canvas Framework

    Grida is an open-source platform that transforms design assets (like Figma files) into production-ready code. It helps developers bridge the gap between design and development by automating code generation for UI components and layout systems. Grida supports multiple frontend frameworks and aims to streamline the handoff process for teams.
    Downloads: 0 This Week
  • 23
    Triton

    Development repository for the Triton language and compiler

    Triton is a programming language and compiler framework specifically designed for writing highly efficient custom deep learning operations, particularly for GPUs. It aims to bridge the gap between low-level GPU programming, such as CUDA, and higher-level abstractions by providing a more productive and flexible environment for developers. Triton enables users to write optimized kernels for machine learning workloads while maintaining readability and control over performance-critical aspects...
    Downloads: 3 This Week
  • 24
    SQL Formatter

    A whitespace formatter for different query languages

    SQL Formatter is a JavaScript library for pretty-printing SQL queries. It started as a port of a PHP Library, but has since considerably diverged. It supports various SQL dialects: GCP BigQuery, IBM DB2, Apache Hive, MariaDB, MySQL, Couchbase N1QL, Oracle PL/SQL, PostgreSQL, Amazon Redshift, SingleStoreDB, Snowflake, Spark, SQL Server Transact-SQL, Trino/Presto. See language option docs for more details. The CLI tool will be installed under sql-formatter and may be invoked via npx...
    Downloads: 0 This Week
  • 25
    Front-End Design Checklist

    The Design Checklist for Creative Web Designers

    Front-End-Design-Checklist bridges the gap between design and implementation by capturing the essential details that make handoffs smooth and outcomes consistent. It encourages designers and developers to align on typography scales, color tokens, spacing systems, and grid behavior before coding begins. The resource includes checks for responsive breakpoints, interaction states, accessibility considerations, and asset preparation, reducing rework later in the build. It promotes shared...
    Downloads: 1 This Week