50 projects for "data processing" with 2 filters applied:

  • 1
    Kapacitor

    Open source framework for processing, monitoring, and alerting

    Open source framework for processing, monitoring, and alerting on time series data. Kapacitor is a real-time data processing engine for monitoring and alerting, specifically designed to work with time-series data from InfluxDB.
    Downloads: 8 This Week
  • 2
    Apache Spark

    A unified analytics engine for large-scale data processing

    Apache Spark is a unified engine for large-scale data processing, offering APIs for batch jobs, streaming, machine learning, and graph computation. It builds on resilient distributed datasets (RDDs) and the newer DataFrame/Dataset abstractions to provide fault-tolerant, in-memory computation across clusters. Spark’s execution engine handles scheduling, shuffles, caching, and data locality so users can focus on transformations rather than infrastructure plumbing. ...
    Downloads: 3 This Week
  • 3
    Broadway

    Concurrent and multi-stage data ingestion and data processing

    Broadway is a data processing library for Elixir designed to handle high-throughput, concurrent workloads with ease. It provides an abstraction for defining pipelines that consume data from sources like RabbitMQ, Kafka, Amazon SQS, or custom producers. Each pipeline is fault-tolerant and backpressure-aware, ensuring stable throughput even under load.
    Downloads: 0 This Week
  • 4
    Apache Sedona

    Cluster computing framework for processing large-scale geospatial data

    Apache Sedona™ is a cluster computing system for processing large-scale spatial data. Sedona extends existing cluster computing systems, such as Apache Spark and Apache Flink, with a set of out-of-the-box distributed Spatial Datasets and Spatial SQL that efficiently load, process, and analyze large-scale spatial data across machines. According to our benchmark and third-party research papers, Sedona runs 2X - 10X faster than other Spark-based geospatial data systems on computation-intensive query workloads. ...
    Downloads: 1 This Week
  • 5
    Spring Batch

    Spring Batch is a framework for writing batch applications using Java

    A lightweight, comprehensive batch framework designed to enable the development of robust batch applications vital for the daily operations of enterprise systems. Spring Batch provides reusable functions that are essential in processing large volumes of records, including logging/tracing, transaction management, job processing statistics, job restart, skip, and resource management. It also provides more advanced technical services and features that will enable extremely high-volume and high...
    Downloads: 10 This Week
  • 6
    Apache Flink

    Stream processing framework with powerful stream

    Apache Flink is a distributed engine for stateful computations over data streams and batches, designed for low-latency processing at scale. Its core runtime executes dataflow graphs with fine-grained backpressure and checkpointing, allowing applications to recover consistently from failures. Flink’s event-time model and watermarks enable accurate out-of-order processing, windowing, and complex time semantics that typical real-time systems struggle with.
    Downloads: 1 This Week
  • 7
    Bacalhau

    Community-driven, simple, yet powerful framework

    Bacalhau is a decentralized compute platform for running jobs on data stored across distributed networks, like IPFS or Filecoin, without moving the data to centralized cloud environments. It allows developers to run containerized workloads close to where the data lives, reducing latency, cost, and privacy risks. Bacalhau supports various runtime environments and is designed to make decentralized data processing as accessible as traditional cloud computing. ...
    Downloads: 6 This Week
  • 8
    Smallpond

    A lightweight data processing framework built on DuckDB and 3FS

    smallpond is a lightweight distributed data processing framework built by DeepSeek, designed to scale DuckDB workloads over clusters using their 3FS (Fire-Flyer File System) backend. The idea is to preserve DuckDB’s fast analytics engine but lift it from single-node to multi-node settings, giving you the ability to operate on large datasets (e.g. petabyte scale) without moving to a heavyweight system like Spark.
    Downloads: 0 This Week
  • 9
    Argo Workflows

    Workflow engine for Kubernetes

    ...Model multi-step workflows as a sequence of tasks or capture the dependencies between tasks using a directed acyclic graph (DAG). Easily run compute intensive jobs for machine learning or data processing in a fraction of the time using Argo Workflows on Kubernetes. Run CI/CD pipelines natively on Kubernetes without configuring complex software development products. Argo Workflows is the most popular workflow execution engine for Kubernetes. It can run 1000s of workflows a day, each with 1000s of concurrent tasks. Our users say it is lighter-weight, faster, more powerful, and easier to use. ...
    Downloads: 6 This Week
  • 10
    GeoStats.jl

    An extensible framework for geospatial data science

    GeoStats.jl is a Julia framework for geospatial data science and geostatistical modeling. It’s fully implemented in Julia and designed to provide an extensible, high-performance stack that handles spatial domains, interpolation, simulation, learning, and visualization. The package is modular: it breaks out geometry, spatial domains, transforms, variograms, covariance models, and modeling into subpackages (e.g., GeoStatsBase, GeoStatsModels, GeoStatsTransforms). Users can represent...
    Downloads: 9 This Week
  • 11
    Trame

    Weave various components and technologies into a Web App

    ...It enables the integration of various components and technologies, such as VTK and ParaView, into web applications written entirely in Python. With best-in-class platforms at its core, trame provides complete control of 3D visualizations and data processing. Developers benefit from a write-once environment from trame. trame is an open source project licensed under Apache License Version 2.0 which allows users to create open source or commercial applications without any licensing worries. By relying simply on Python and HTML, trame focuses on one's data and associated analysis and visualizations while hiding the complications of web development.
    Downloads: 7 This Week
  • 12
    RuleGo

    Component orchestration rule engine framework for Go

    ...It’s lightweight, embeddable, orchestration-ready, and built for flexible composition of business logic into reusable components. No external middleware dependencies, efficient data processing and linkage on low-cost devices, suitable for IoT edge computing. Embedded and Standalone Deployment modes. Supports embedding RuleGo into existing applications. It can also be deployed independently as middleware, providing rule engine and orchestration services.
    Downloads: 7 This Week
  • 13
    dxos

    TypeScript implementation of the DXOS protocols, SDK, toolchain

    DXOS is a decentralized operating system framework that empowers developers to build local-first, collaborative applications without relying on central servers. By providing a comprehensive SDK and toolchain, DXOS facilitates the creation of apps that prioritize user privacy, offline functionality, and seamless peer-to-peer synchronization. Its flagship application, Composer, exemplifies the platform's capabilities by enabling users to organize and sync knowledge across devices, with support...
    Downloads: 7 This Week
  • 14
    KubeEdge

    Kubernetes Native Edge Computing Framework (project under CNCF)

    ...It also supports MQTT which enables edge devices to access through edge nodes. With KubeEdge it is easy to get and deploy existing complicated machine learning, image recognition, event processing, and other high-level applications to the Edge. With business logic running at the Edge, much larger volumes of data can be secured & processed locally where the data is produced. With data processed at the Edge, the responsiveness is increased dramatically and data privacy is protected.
    Downloads: 0 This Week
  • 15
    Visual Blocks

    Visual Blocks for ML is a Google visual programming framework

    Visual Blocks is a node-based, in-browser environment for building AI and data-processing workflows with drag-and-drop components. It lets you connect sources, transforms, models, and visualizers into a live graph, so changes propagate instantly and results are observable without writing glue code. Under the hood it leans on web-friendly runtimes (e.g., WebGPU/WebGL/WebNN or TensorFlow.js backends) to execute pipelines locally, which is great for demos, teaching, and privacy-sensitive prototypes. ...
    Downloads: 1 This Week
  • 16
    WebHarvest - web data extraction tool
    Web data extraction (web data mining, web scraping) tool. It leverages well-proven XML and text processing technologies to easily extract useful data from arbitrary web pages.
    Downloads: 3 This Week
  • 17
    towhee

    Framework that is dedicated to making neural data processing

    ...Towhee provides out-of-the-box integration with your favorite libraries, tools, and frameworks, making development quick and easy. Towhee includes a pythonic method-chaining API for describing custom data processing pipelines. We also support schemas, making processing unstructured data as easy as handling tabular data.
    Downloads: 1 This Week
  • 18
    The Related Values Processing Framework helps integrate Process Control Data Historian Systems.
    Downloads: 0 This Week
  • 19
    MetalPetal

    A GPU accelerated image and video processing framework built on Metal

    MetalPetal is an image processing framework based on Metal designed to provide real-time processing for still images and video with easy-to-use programming interfaces. This chapter covers the key concepts of MetalPetal, and will help you to get a better understanding of its design, implementation, performance implications, and best practices. A MTIImage object is a representation of an image to be processed or produced.
    Downloads: 1 This Week
  • 20
    LuaRadio

    A lightweight, embeddable software-defined radio framework

    LuaRadio is a lightweight, embeddable flow graph signal processing framework for software-defined radio. It provides a suite of source, sink, and processing blocks, with a simple API for defining flow graphs, running flow graphs, creating blocks, and creating data types. LuaRadio is built on LuaJIT, has a small binary footprint of under 750 KB (including LuaJIT), has no external hard dependencies, and is MIT-licensed.
    Downloads: 6 This Week
  • 21
    Forms

    An easy way to create, parse and validate forms in node.js

    ...It integrates with popular Node.js web frameworks, enabling seamless handling of form submissions and data processing. The design emphasizes reusability and maintainability, allowing developers to define forms once and use them across different parts of an application.
    Downloads: 0 This Week
  • 22
    JuliaFEM.jl

    The JuliaFEM software library is a framework

    The JuliaFEM software library is a framework that allows for the distributed processing of large Finite Element Models across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage.
    Downloads: 7 This Week
  • 23
    Twint

    An advanced Twitter scraping & OSINT tool written in Python

    Twint is an advanced open-source Twitter scraping and OSINT tool written in Python that extracts tweets, user data, followers, likes, and more—without relying on Twitter’s API—making it highly useful for researchers, analysts, and hobbyists who want to bypass rate limits and access public Twitter data.
    Downloads: 0 This Week
  • 24
    S-MVP

    Optimized version of MVP, using annotation generics to simplify code

    ...Complete the writing of repetitive modules: ASpect + GradlePlugin for horizontal AOP programming, Javassist for dynamic bytecode injection, Tinker for hot fixes, Retrofit for network operations, and RxJava for data processing. In MVP, the Presenter completely separates the Model and the View, and the main program logic is implemented in the Presenter. The Presenter is not tied to a specific View but interacts with it through a defined interface (when testing in isolation, we only need to pass parameters according to the interface), so the Presenter can stay unchanged, and be reused, when the View changes. ...
    Downloads: 0 This Week
  • 25
    Tables

    Bulma themed, VueJS powered Datatable with server-side loading

    Data Table package with server-side processing, unlimited exporting and VueJS components. Quickly build any complex table based on a JSON template. This package can work independently of the Enso ecosystem. The front-end assets that utilize this API are present in the tables package.
    Downloads: 1 This Week