Showing 74 open source projects for "data processing"

  • 1
    Breakpad

    Mirror of Google Breakpad project

    Breakpad is an open source crash reporting system developed by Google that provides both client and server components to capture, analyze, and report software crashes across platforms. It is designed to help developers diagnose and fix crashes efficiently by generating detailed crash dumps, stack traces, and diagnostic data whenever an application fails unexpectedly. The framework includes libraries for embedding crash-handling functionality directly into applications and tools for processing and symbolizing crash dumps on the server side. Breakpad supports multiple operating systems, including Linux, macOS, and Windows, and integrates easily into existing build systems. ...
    Downloads: 4 This Week
  • 2
    Swiss File Knife

    One hundred command line tools in a small and portable binary.

    Create zip files, extract zip files, replace text in files, search in files using expressions, stream text editor, instant command line ftp and http server, send folder via network, copy folder excluding sub folders and files, find duplicate files, run a command on all files of a folder, split and join large files, make md5 checksum lists of files, remove tab characters, convert CR/LF, list newest or biggest files of a folder, compare folders, treesize, show first or last lines of a file,...
    Downloads: 494 This Week
  • 3
    zpaqfranz

    Zpaq compatible archiver for Win, Linux, Free/OpenBSD, Solaris & MacOS

    ...Get forever storage of your files, managing critical backups with bulletproof archival solutions and enterprise-grade reliability. Far more efficient than Time Machine or ZFS snapshots; perfect for VM backups and permanent archiving, effortlessly handling TBs and millions of files. Optimized for cloud/NAS/USB with ultra-low bandwidth, military-grade encryption, and 1GB/s+ speeds on modern hardware. GUI (Win/Linux/Mac): https://sourceforge.net/projects/catpaq

    Why choose catpaq/zpaqfranz?
    ✓ Complete: single/multi-file storage architecture
    ✓ Modern: SHA-2, SHA-3, BLAKE3, XXH3 and more
    ✓ Paranoid: anti-ransomware data verification with integrity checks
    ✓ Runs everywhere: TrueNAS, ARM-powered, even ESXi
    ✓ Lightning-fast: multi-core processing + hardware acceleration
    ✓ Deduplicated disk imaging
    ✓ Battle-tested: 15+ years of active development since 2009

    https://github.com/fcorbelli/zpaqfranz
    https://www.francocorbelli.it/zpaqfranz
    100% FOSS • forever free
    Downloads: 85 This Week
  • 4
    AI File Sorter

    Local AI file organization with categorization and rename suggestions

    AI File Sorter is a cross-platform desktop application that uses AI (local LLMs run on your computer) to organize files and suggest meaningful file names based on real content, not just filenames or extensions. The app can analyze images locally and propose descriptive rename suggestions (for example, IMG_2048.jpg → clouds_over_lake.jpg). It can also analyze document text to improve categorization and renaming. Supported formats include PDF, DOCX, XLSX, PPTX, ODT, ODS, ODP, and common...
    Downloads: 252 This Week
  • 5
    SageMaker Experiments Python SDK

    Experiment tracking and metric logging for Amazon SageMaker notebooks

    ...Each step in the workflow is described by a Trial Component. There is no relationship, such as ordering, between Trial Components. Trial Component: a description of a single step in a machine learning workflow, for example data cleaning, feature extraction, model training, or model evaluation.
    Downloads: 0 This Week
  • 6

    dispy

    Distributed and Parallel Computing with/for Python.

    dispy is a generic and comprehensive, yet easy-to-use framework for creating and using compute clusters to execute computations in parallel across multiple processors in a single machine (SMP), or among many machines in a cluster, grid or cloud. dispy is well suited to the data-parallel (SIMD) paradigm, where a computation (a Python function or standalone program) is evaluated independently with different (large) datasets. dispy supports public / private / hybrid cloud computing and fog / edge computing.
    Downloads: 49 This Week
  • 7
    wasmboy

    Game Boy / Game Boy Color Emulator Library

    wasmboy is a Game Boy and Game Boy Color emulator built using WebAssembly and JavaScript, designed to run efficiently in both browsers and Node environments. It leverages modern web technologies such as HTML5 canvas and the Web Audio API to deliver graphics and sound directly within a web interface. The project emphasizes portability and integration, allowing it to be embedded into other applications as a reusable dependency. It supports a wide range of emulator features including save...
    Downloads: 0 This Week
  • 8
    A collection of small utilities for: data extraction (text or binary files), data buffering, message queue control, column addition, date/time manipulation, and data recovery testing.
    Downloads: 1 This Week
  • 9
    Albedo

    A recommender system for discovering GitHub repos

    ...It treats repositories and developers as a graph of interactions and applies large-scale matrix factorization to model affinities, with Apache Spark providing the distributed data processing. The project focuses on implicit feedback—stars, watches, and other engagement metrics—so it can build useful recommendations without explicit ratings. A reproducible setup and Makefile-driven workflow streamline tasks like spinning up services, loading datasets, training models, and generating candidate lists. Because it’s built around Spark’s scalable primitives, Albedo can experiment on substantial snapshots of GitHub metadata rather than toy corpora. ...
    Downloads: 0 This Week
  • 10
    Orange

    OpenResty/Nginx Gateway for API monitoring and management

    ...In addition, various variables in the request can be extracted for subsequent processing in two ways.
    Downloads: 0 This Week
  • 11

    mod_psldap

    Apache LDAP Directory Manager

    mod_psldap is an Apache module for leveraging LDAP services. Built on the OpenLDAP library and the Apache APIs, it provides web-based A&A, web-based updates to the LDAP store, server-side XSLT processing, and session management across servers.
    Downloads: 0 This Week
  • 12
    confd

    Manage local application configuration files using templates from etcd

    confd is a lightweight configuration management tool focused on keeping local configuration files up to date by processing template resources with data stored in etcd, consul, dynamodb, redis, vault, zookeeper, aws ssm parameter store or env vars. confd is also focused on reloading applications to pick up new config file changes. Go 1.10 is required to build confd, which uses the new vendor directory. You should have a working etcd or consul server up and running and the ability to add new keys. ...
    Downloads: 2 This Week
  • 13
    Clu-Linux-Live

    Various Processing and Data Rescue Tools over Wired or Wireless Network

    This Linux Live CD provides various processing Command Line Utilities (Clu) and Data Rescue Tools which can be used on a wired or wireless network. On startup it prompts the user to change the password, mounts all locally available filesystems, starts the wireless network (if a wifi interface is present), starts network services (samba/ssh/sftp), and presents the user with a console for running various utilities (text, image, audio, video, downloading, etc.) on the mounted filesystems. ...
    Downloads: 4 This Week
  • 14

    LBSP

    Real-Time Processing Library for OSHW Biomedical Sensors

    Applications involving data acquisition from sensors need samples at a preset frequency, filtering of noise, and/or analysis of certain frequency components. We propose a novel software architecture based on open-source hardware platforms which allows programmers to create data streams from input channels and easily implement filters and frequency analysis objects. The performances of the different classes given in the size of memory allocated and execution time (number of...
    Downloads: 1 This Week
  • 15
    VTD-XML is the next generation XML parser/indexer/editor/slicer/assembler/xpath-engine that goes beyond DOM, SAX and PULL in performance, memory usage, and ease of use.
    Downloads: 22 This Week
  • 16

    LogDruid

    Generate charts and reports using data gathered in log files

    An application to gather, aggregate, chart and report information originating from any log files. It uses regular expressions that are constructed graphically and can be tested in the application against samples. Once configured for a specific type of log file set, gathering the data and displaying the chart for a new file set takes just one click. It contains a sample template that handles a few log types: Java GC log, OpenDS access log, and Apache access log.
    Downloads: 0 This Week
  • 17
    Sesame

    Java RDF Framework

    This project is no longer actively maintained. It is succeeded by the Eclipse RDF4J project, which can be found at GitHub and at http://www.rdf4j.org/. Sesame is a de-facto standard framework for processing RDF data. This includes parsing, scalable storage, reasoning and full SPARQL 1.1 query/update support. Sesame offers a fully modular toolkit and an easy-to-use Java API that can be connected to all leading RDF storage solutions.
    Downloads: 25 This Week
  • 18
    AnomalyDetection

    Anomaly Detection with R

    AnomalyDetection is an R package developed by Twitter for detecting anomalies in seasonal univariate time series. It implements the Seasonal Hybrid Extreme Studentized Deviate (S‑H‑ESD) test, which reliably identifies both global and local outliers in data with trends and seasonality—commonly applied to system metrics, engagement data, and business KPIs.
    Downloads: 0 This Week
  • 19

    NFDUMP - Netflow processing tools

    netflow collecting and processing tools

    *** This project moved to GitHub *** https://github.com/phaag/nfdump However, you may want to download older versions from here. nfdump is a set of tools to collect and process netflow data. It's fast and has a powerful, pcap-like filter syntax. It supports netflow versions v1, v5, v7, v9 and IPFIX as well as a limited set of sflow. It includes support for CISCO ASA (NSEL) and CISCO NAT (NEL) devices, which export event logging records as v9 flows. nfdump is fully IPv6 compatible.
    Downloads: 34 This Week
  • 20
    ...This application has been designed to perform simple or timestamped copies of directories and files as well as unidirectional synchronization of file trees. Fisy is a command-line tool, so you can start copy tasks at any time or plug them into automated processing. Since version 2.0, Fisy contains cipher features for encrypting data for storage in a remote service such as the cloud.
    Downloads: 0 This Week
  • 21
    FlowVR
    FlowVR is an open source middleware tailored for high performance in situ data processing and analytics running on large parallel machines
    Downloads: 0 This Week
  • 22
    clip5 text clipboard manager

    Simple interface which can be used as a clipboard manager

    It's a Java application which can be used as a clipboard manager. Just click on any text field or text area to copy the displayed text into the system clipboard, where it will be ready to be pasted into another application such as an e-mail client or a word processor. Data are saved in text files. Edit the .txt files to best suit your needs.
    Downloads: 0 This Week
  • 23

    LabMonkey Embedded Automation Platform

    Create networks of embedded devices for data logging/automation tasks.

    LabMonkey is a collection of designs for embedded devices which can be networked together to provide a range of automation, data logging and signal processing functions. A key design objective is to use as little dedicated hardware as possible for communication between nodes in the network, and to be able to adapt the network topology in real-time so as to minimize the occurrence of collisions between packets. To achieve this, a protocol has been designed specifically for the task, and implemented in assembler for AVR microcontrollers. ...
    Downloads: 0 This Week
  • 24
    distributedPHP client

    A simple script for distributed computing through PHP:

    distributedPHP client is a simple PHP script that can simultaneously activate/send data to as many web scripts as you want. You must open and configure the distributedPHP .php file prior to running it. distributedPHP client supports activating scripts without data, sending the same data to all scripts, sending unique data to each script, or sending user input to each script. Examples of use include: distributed math computation, encryption breaking, SETI@home/folding@home (well, if they made the projects in PHP), distributed bruteforce attacks, ddos attacks, distributed processing, etc. ...
    Downloads: 0 This Week
  • 25

    Visual Disk Diet

    Helps you visualize the space used on your drives with a colorful chart.

    Visual Disk Diet helps you visualize the space used on your drives with a colorful radial tree chart. Now you can quickly see which folders take up the most space and get rid of them (coming soon). Coded with Java/Processing. Feel free to use, modify or suggest ideas! It's pretty much a copy of Disk Space Fan, but open source, so it can be adapted to extended purposes (such as browsing the file system...). It's very fast on data drives (your data partition, USB drive...). It's NOT optimized for very large and complex drives such as a big C: drive for now, so it can take up to 10 minutes to scan one (and there is no feedback during the scan, so it can be frustrating).
    Downloads: 0 This Week