Showing 86 open source projects for "processing"

View related business solutions
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • Lightspeed golf course management software Icon
    Lightspeed golf course management software

    Lightspeed Golf is all-in-one golf course management software to help courses simplify operations, drive revenue and deliver amazing golf experiences.

    From tee sheet management, point of sale and payment processing to marketing, automation, reporting and more—Lightspeed is built for the pro shop, restaurant, back office, beverage cart and beyond.
    Learn More
  • 1
    Bytewax

    Bytewax

    Python Stream Processing

    ...Bytewax is a Python framework and Rust distributed processing engine that uses a dataflow computational model to provide parallelizable stream processing and event processing capabilities similar to Flink, Spark, and Kafka Streams. You can use Bytewax for a variety of workloads from moving data à la Kafka Connect style all the way to advanced online machine learning workloads. Bytewax is not limited to streaming applications but excels anywhere that data can be distributed at the input and output.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 2
    Pathway

    Pathway

    Python ETL framework for stream processing, real-time analytics, LLM

    ...Pathway is especially well-suited for scenarios like financial analytics, IoT, fraud detection, and logistics, where high-velocity and continuously changing data is the norm. Unlike traditional batch processing frameworks, Pathway continuously updates the results of your data logic as new events arrive, functioning more like a database that reacts in real-time. It supports Python, integrates with modern data tools, and offers a deterministic dataflow model to ensure reproducibility and correctness.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 3
    SageMaker Spark Container

    SageMaker Spark Container

    Docker image used to run data processing workloads

    Apache Spark™ is a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools including Spark SQL for SQL and DataFrames, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for stream processing.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    Awesome Fraud Detection Research Papers

    Awesome Fraud Detection Research Papers

    A curated list of data mining papers about fraud detection

    A curated list of data mining papers about fraud detection from several conferences.
    Downloads: 0 This Week
    Last Update:
    See Project
  • AI-First Supply Chain Management Icon
    AI-First Supply Chain Management

    Supply chain managers, executives, and businesses seeking AI-powered solutions to optimize planning, operations, and decision-making across the supply

    Logility is a market-leading provider of AI-first supply chain management solutions engineered to help organizations build sustainable digital supply chains that improve people’s lives and the world we live in. The company’s approach is designed to reimagine supply chain planning by shifting away from traditional “what happened” processes to an AI-driven strategy that combines the power of humans and machines to predict and be ready for what’s coming. Logility’s fully integrated, end-to-end platform helps clients know faster, turn uncertainty into opportunity, and transform the supply chain from a cost center to an engine for growth.
    Learn More
  • 5
    Diffgram

    Diffgram

    Training data (data labeling, annotation, workflow) for all data types

    ...Annotation is required because raw media is considered to be unstructured and not usable without it. That’s why training data is required for many modern machine learning use cases including computer vision, natural language processing and speech recognition.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 6
    Fondant

    Fondant

    Production-ready data processing made easy and shareable

    ...It offers an end-to-end system for ingesting raw data, applying transformations, filtering, and formatting outputs—all while remaining scalable and traceable. Fondant is designed with reproducibility in mind and supports containerized steps using Docker, making it easy to share and reuse data processing components. It’s built for use in research and production, empowering data scientists to streamline dataset curation and preprocessing workflows efficiently.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Lithops

    Lithops

    A multi-cloud framework for big data analytics

    ...It abstracts cloud providers like IBM Cloud, AWS, Azure, and Google Cloud into a unified interface and turns your Python functions into scalable, event-driven workloads. Lithops is ideal for data processing, ML inference, and embarrassingly parallel workloads, giving you the power of FaaS (Function-as-a-Service) without vendor lock-in. It also supports hybrid cloud setups, object storage access, and simple integration with Jupyter notebooks.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    Pyper

    Pyper

    Concurrent Python made simple

    Pyper is a Python-native orchestration and scheduling framework designed for modern data workflows, machine learning pipelines, and any task that benefits from a lightweight DAG-based execution engine. Unlike heavier platforms like Airflow, Pyper aims to remain lean, modular, and developer-friendly, embracing Pythonic conventions and minimizing boilerplate. It focuses on local development ergonomics and seamless transition to production environments, making it ideal for small teams and...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    loonflow

    loonflow

    A workflow engine base on django python

    ...Scenario services), if there is a certain development capability, it is recommended to use only the back-end engine function, and the front-end customized development according to the scenario can be dispersed in various internal background management systems (such as personnel, operation and maintenance, monitoring, cmdb, etc.). Since version 1.1.x, loonflow comes with a front-end interface for creating and processing work orders, which can be used directly. The official version is shown in the release . It is recommended to use the latest version.
    Downloads: 0 This Week
    Last Update:
    See Project
  • DAT Freight and Analytics - DAT Icon
    DAT Freight and Analytics - DAT

    DAT Freight and Analytics operates DAT One truckload freight marketplace

    DAT Freight & Analytics operates DAT One, North America’s largest truckload freight marketplace; DAT iQ, the industry’s leading freight data analytics service; and Trucker Tools, the leader in load visibility. Shippers, transportation brokers, carriers, news organizations, and industry analysts rely on DAT for market trends and data insights, informed by nearly 700,000 daily load posts and a database exceeding $1 trillion in freight market transactions. Founded in 1978, DAT is a business unit of Roper Technologies (Nasdaq: ROP), a constituent of the Nasdaq 100, S&P 500, and Fortune 1000. Headquartered in Beaverton, Ore., DAT continues to set the standard for innovation in the trucking and logistics industry.
    Learn More
  • 10
    Bayesian Optimization

    Bayesian Optimization

    Python implementation of global optimization with gaussian processes

    This is a constrained global optimization package built upon bayesian inference and gaussian process, that attempts to find the maximum value of an unknown function in as few iterations as possible. This technique is particularly suited for optimization of high cost functions, situations where the balance between exploration and exploitation is important. More detailed information, other advanced features, and tips on usage/implementation can be found in the examples folder. Follow the basic...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 11
    DataChain

    DataChain

    AI-data warehouse to enrich, transform and analyze unstructured data

    ...Datachain is especially helpful if batch operations can be optimized – for instance, when synchronous API calls can be parallelized or where an LLM API offers batch processing.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 12
    tika-python

    tika-python

    Python binding to the Apache Tika™ REST services

    A Python port of the Apache Tika library that makes Tika available using the Tika REST Server. This makes Apache Tika available as a Python library, installable via Setuptools, Pip and easy to install. To use this library, you need to have Java 7+ installed on your system as tika-python starts up the Tika REST server in the background. To get this working in a disconnected environment, download a tika server file (both tika-server.jar and tika-server.jar.md5, which can be found here) and set...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    Astropy

    Astropy

    Repository for the Astropy core package

    ...It is at the core of the Astropy Project, which aims to enable the community to develop a robust ecosystem of affiliated packages covering a broad range of needs for astronomical research, data processing, and data analysis.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 14
    Neuroglancer

    Neuroglancer

    WebGL-based viewer for volumetric data

    ...Neuroglancer operates entirely client-side, fetching data over HTTP in a variety of supported formats including Neuroglancer precomputed, N5, Zarr, and NIfTI, among others. The viewer is built with a multi-threaded architecture, separating rendering and data processing to ensure smooth performance even with massive datasets. Extensively used in neuroscience research, Neuroglancer supports integration with tools.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 15
    VisPy

    VisPy

    Main repository for Vispy

    Vispy is an open-source, high-performance interactive visualization library in Python, designed for creating scientific visualizations and interactive plots. It leverages the power of modern Graphics Processing Units (GPUs) through OpenGL to render large datasets efficiently. Vispy supports a wide range of visualization types, including 2D plots, 3D visualizations, volume rendering, and more, making it suitable for scientific research, data analysis, and educational purposes.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    airda

    airda

    airda(Air Data Agent

    airda(Air Data Agent) is a multi-smart body for data analysis, capable of understanding data development and data analysis needs, understanding data, generating data-oriented queries, data visualization, machine learning and other tasks of SQL and Python codes.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    AutoGluon

    AutoGluon

    AutoGluon: AutoML for Image, Text, and Tabular Data

    ...Automatically utilize state-of-the-art techniques (where appropriate) without expert knowledge. Leverage automatic hyperparameter tuning, model selection/ensembling, architecture search, and data processing. Easily improve/tune your bespoke models and data pipelines, or customize AutoGluon for your use-case. AutoGluon is modularized into sub-modules specialized for tabular, text, or image data. You can reduce the number of dependencies required by solely installing a specific sub-module via: python3 -m pip install <submodule>.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    GemGIS

    GemGIS

    Spatial data processing for geomodeling

    GemGIS is a Python-based, open-source geographic information processing library. It is capable of preprocessing spatial data such as vector data (shape files, geojson files, geopackages,…), raster data (tif, png,…), data obtained from online services (WCS, WMS, WFS) or XML/KML files (soon). Preprocessed data can be stored in a dedicated Data Class to be passed to the geomodeling package GemPy in order to accelerate the model-building process.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 19
    Scribus

    Scribus

    Powerful desktop publishing software

    Scribus is an Open Source program that brings professional page layout to Linux, BSD UNIX, Solaris, OpenIndiana, GNU/Hurd, Mac OS X, OS/2 Warp 4, eComStation, and Windows desktops with a combination of press-ready output and new approaches to page design. Underneath a modern and user-friendly interface, Scribus supports professional publishing features, such as color separations, CMYK and spot colors, ICC color management, and versatile PDF creation.
    Leader badge
    Downloads: 22,337 This Week
    Last Update:
    See Project
  • 20
    Mara Pipelines

    Mara Pipelines

    A lightweight opinionated ETL framework, halfway between plain scripts

    This package contains a lightweight data transformation framework with a focus on transparency and complexity reduction. Data integration pipelines as code: pipelines, tasks and commands are created using declarative Python code. PostgreSQL as a data processing engine. Extensive web ui. The web browser as the main tool for inspecting, running and debugging pipelines. GNU make semantics. Nodes depend on the completion of upstream nodes. No data dependencies or data flows. No in-app data processing: command line tools as the main tool for interacting with databases and data. Single machine pipeline execution based on Python's multiprocessing. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    PyMca
    Stand-alone application and Python tools for interactive and/or batch processing analysis of X-Ray Fluorescence Spectra. Graphical user interface (GUI) and batch processing capabilities provided.
    Leader badge
    Downloads: 164 This Week
    Last Update:
    See Project
  • 22
    Gwyddion

    Gwyddion

    Scanning probe microscopy data visualisation and analysis

    A data visualization and processing tool for scanning probe microscopy (SPM, i.e. AFM, STM, MFM, SNOM/NSOM, ...) and profilometry data, useful also for general image and 2D data analysis.
    Leader badge
    Downloads: 1,886 This Week
    Last Update:
    See Project
  • 23
    Datapipe

    Datapipe

    Real-time, incremental ETL library for ML with record-level depend

    Datapipe is a real-time, incremental ETL library for Python with record-level dependency tracking. Datapipe is designed to streamline the creation of data processing pipelines. It excels in scenarios where data is continuously changing, requiring pipelines to adapt and process only the modified data efficiently. This library tracks dependencies for each record in the pipeline, ensuring minimal and efficient data processing.
    Downloads: 33 This Week
    Last Update:
    See Project
  • 24
    SMILI

    SMILI

    Scientific Visualisation Made Easy

    ...The main sMILX application features for viewing n-D images, vector images, DICOMs, anonymizing, shape analysis and models/surfaces with easy drag and drop functions. It also features a number of standard processing algorithms for smoothing, thresholding, masking etc. images and models, both with graphical user interfaces and/or via the command-line. See our YouTube channel for tutorial videos via the homepage. The applications are all built out of a uniform user-interface framework that provides a very high level (Qt) interface to powerful image processing and scientific visualisation algorithms from the Insight Toolkit (ITK) and Visualisation Toolkit (VTK). ...
    Leader badge
    Downloads: 4 This Week
    Last Update:
    See Project
  • 25
    HEALPix

    HEALPix

    Data Analysis, Simulations and Visualization on the Sphere

    Software for pixelization, hierarchical indexation, synthesis, analysis, and visualization of data on the sphere. Please acknowledge HEALPix by quoting the web page http://healpix.sourceforge.net (or https://healpix.sourceforge.io) and publication: K.M. Gorski et al., 2005, Ap.J., 622, p.759 Full software documentation available at https://healpix.sourceforge.io/documentation.php Wiki Pages: https://sourceforge.net/p/healpix/wiki/Home Exchanging Data with HEALPix (in FITS files):...
    Leader badge
    Downloads: 168 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • Next