Page 2 | data processing free download

Showing 75 open source projects for "data processing"


1

MongoDB PHP Library

The Official MongoDB PHP library

...Built on top of the underlying MongoDB PHP extension, the library handles serialization, connection pooling, and error handling in a way that feels natural in idiomatic PHP. It supports rich query expressions, bulk writes, change streams, transactions, and GridFS, making it suitable for everything from simple content apps to complex data processing services. The project also includes helpers for working with BSON types such as ObjectId, UTC datetime, and decimals, which helps bridge the gap between native PHP types and MongoDB’s storage model.

Downloads: 8 This Week

Last Update: 2026-02-11
See Project
2

ArangoDB-Community/pyArango

Python Driver for ArangoDB with built-in validation

PyArango is a Python driver for ArangoDB, a multi-model NoSQL database. It provides a Pythonic way to interact with ArangoDB, allowing developers to manage collections, execute AQL queries, and integrate ArangoDB's document, graph, and key-value storage models into Python applications.

Downloads: 7 This Week

Last Update: 2025-02-22
See Project
3

xsd2pgschema

Relational database replication tool based on XML Schema

xsd2pgschema is a Java application suite that converts XML Schema 1.1 (hierarchical data model) to PostgreSQL DDL (relational data model) and supports XML data migration into PostgreSQL based on the XML Schema without loss of information content. It also supports full-text indexing via either Apache Lucene or Sphinx Search utilizing the relational data model. File conversion from XML to CSV, TSV, or JSON is possible as well as mapping XML Schema to JSON Schema. Obtained PostgreSQL...

1 Review

Downloads: 3 This Week

Last Update: 2024-09-19
See Project
4

CursusDB

CursusDB is an open-source distributed in-memory database

CursusDB is a time-series database built for high-performance analytics and data processing, optimized for handling large volumes of sequential data efficiently.

Downloads: 8 This Week

Last Update: 2025-02-19
See Project
5

RedisGraph

A graph database as a Redis module

A high-performance graph database module for Redis that enables fast graph processing and analytics using a query engine based on Cypher.

Downloads: 6 This Week

Last Update: 2025-02-17
See Project
6

GETL

ETL engine based on Groovy

Note: the repository has migrated to https://github.com/ascrus/getl . You can download the jar file from this site or from Maven. GETL is a Groovy-based package that automates loading and transforming data. Its name is an acronym for «Groovy ETL». GETL is a set of libraries of pre-built classes and objects that can be used to extract, transform, and load data in programs written in Groovy or Java, as well as from any software that supports the work with...

1 Review

Downloads: 2 This Week

Last Update: 2023-12-22
See Project
7

Cetus

Cetus is a high performance middleware that provides routing

...Cetus comes in two versions: read-write splitting and sharding (table sharding is a special form of database sharding). A lock-free multi-process design improves operating efficiency. Supports transparent backend connection pooling. Supports SQL read-write splitting. Supports database sharding. Supports distributed transaction processing. Supports batch insert operations. Supports conditional DISTINCT operations. Enhanced SQL route parsing and injection.

Downloads: 0 This Week

Last Update: 2023-04-25
See Project
8

FeatureBase

A crazy fast analytical database, built on bitmaps

FeatureBase is an Open Source, in-memory, MLAP engine providing SQL support, real-time updates, and analytical processing for your growing data. A binary tree index improves the performance & efficiency of analytical queries by reducing I/O operations. Simple or complex, FeatureBase knocks it out in milliseconds. On-the-fly updates and deletes. Operate instantly on your freshest data without the need for preaggregation. Built on bitmaps, FeatureBase offers up to 5-10X reduction in storage footprint and 90% reduction in hardware footprint. ...

Downloads: 0 This Week

Last Update: 2023-04-04
See Project
9

Sqlite Index Blaster

Create huge Sqlite indexes at breakneck speeds

SQLite Blaster is an advanced SQLite extension that enhances database performance by enabling multi-threading, data compression, and memory optimizations. It is designed for applications that require fast local storage with improved query efficiency.

Downloads: 0 This Week

Last Update: 2025-02-25
See Project
10

SnappyData

Memory optimized analytics database, based on Apache Spark

...SnappyData delivers high throughput, low latency, and high concurrency for a unified analytics workload. By fusing an in-memory hybrid database inside Apache Spark, it provides analytic query processing, mutability/transactions, access to virtually all big data sources, and stream processing, all in one unified cluster. One common use case for SnappyData is to provide analytics at interactive speeds over large volumes of data with minimal or no pre-processing of the dataset. For instance, there is often no need to pre-aggregate/reduce or generate cubes over your large data sets for ad-hoc visual analytics. ...

Downloads: 5 This Week

Last Update: 2024-10-15
See Project
11

eXist-db

eXist-db is a feature rich Open Source native XML database

eXist-db is a native XML database featuring efficient, index-based XQuery processing, extensions for keyword search, XUpdate support, XSLT support, XForms support, REST, and tight integration with existing XML development tools. Moved to GitHub - https://www.github.com/exist-db/exist

Downloads: 26 This Week

Last Update: 2026-03-06
See Project
12

Datatables.AspNet

Microsoft AspNet bindings and automatic parsing for jQuery DataTables

Formerly known as DataTables.Mvc, this project started with small objectives around 2014, aiming to provide intermediate and experienced developers a tool to avoid the boring process of handling DataTables parameters. More than a year later after a full rewrite, we are now proud to support Asp.net MVC, WebApi, and Asp.Net Core (full .NET Core support). Unit-testing is a priority to avoid breaking your app and every stable release should provide better and wider test cases. Datatables.AspNet...

Downloads: 1 This Week

Last Update: 2024-11-08
See Project
13

Demo Scene

Scripts and samples to support Confluent Demos, Talks, and Blogs

Demo Scene is a collection of resources and examples provided by Confluent Inc. to demonstrate the capabilities of Apache Kafka and its ecosystem. It includes various demos showcasing real-time data streaming, processing, and integration patterns.

Downloads: 0 This Week

Last Update: 2025-01-14
See Project
14

Heroic

The Heroic Time Series Database

Heroic is a scalable time-series database developed by Spotify, designed for real-time analytics and monitoring of large-scale systems.

Downloads: 0 This Week

Last Update: 2024-11-25
See Project
15

ksqlDB

The database purpose-built for stream processing applications

Build applications that respond immediately to events. Craft materialized views over streams. Receive real-time push updates, or pull current state on demand. Seamlessly leverage your existing Apache Kafka® infrastructure to deploy stream-processing workloads and bring powerful new capabilities to your applications. Use a familiar, lightweight syntax to pack a powerful punch. Capture, process, and serve queries using only SQL. No other languages or services are required. ksqlDB enables you...

Downloads: 0 This Week

Last Update: 2021-12-21
See Project
16

PipelineDB

High-performance time-series aggregation for PostgreSQL

PipelineDB is a PostgreSQL extension for continuous aggregation and stream processing. It allows users to define continuous queries that automatically process incoming data streams, storing results in materialized views. Designed for real-time analytics, PipelineDB extends PostgreSQL with stream-oriented features while maintaining compatibility with standard SQL and tooling.

Downloads: 12 This Week

Last Update: 2025-06-05
See Project
17

Cosmos DB Spark

Apache Spark Connector for Azure Cosmos DB

...The connector allows you to easily read from and write to Azure Cosmos DB via Apache Spark DataFrames in Python and Scala. It also allows you to easily create a lambda architecture for batch-processing, stream-processing, and a serving layer while being globally replicated and minimizing the latency involved in working with big data.

Downloads: 0 This Week

Last Update: 2023-12-21
See Project
18

DataSink

Take a JDBC ResultSet and stream it in one of the supported formats

DataSink takes a JDBC ResultSet and streams it in a format of your choice. You can also zip the stream and send it over the network, if you want. DataSink currently implements the following table formats: DBF (the xBase file format), XHTML, and genericode. You can use it as an Ant task or directly from Java.

Downloads: 0 This Week

Last Update: 2018-05-03
See Project
19

Mondrian

Mondrian is an OLAP (online analytical processing) engine written in Java. It reads from JDBC data sources, aggregates data in a memory cache, and implements the MDX language and the olap4j and XML/A APIs.

20 Reviews

Downloads: 48 This Week

Last Update: 2017-05-17
See Project
20

AvanceDB

An in-memory database based on the CouchDB REST API

AvanceDB is a high-performance, in-memory database designed to accelerate SQL-based applications. It uses advanced caching techniques to reduce database latency and improve query execution speed, making it ideal for real-time analytics and transactional workloads.

Downloads: 0 This Week

Last Update: 2025-02-25
See Project
21

SQLMate

Rapidly generate a DAO for SQLite

Complete source code, a usage example, and a code-generated test case are included in the .jar file. (See main.java for the usage / code generation example.)

Downloads: 3 This Week

Last Update: 2016-11-24
See Project
22

geog-server-embedded

GeoG Embedded Server

GeoG Embedded Server with GeoG's Own Database.

Downloads: 0 This Week

Last Update: 2016-10-07
See Project
23

Apache PredictionIO

Machine learning server for developers and ML engineers

Apache PredictionIO® is an open source Machine Learning Server built on top of a state-of-the-art open source stack for developers and data scientists to create predictive engines for any machine learning task. Quickly build and deploy an engine as a web service on production with customizable templates; respond to dynamic queries in real-time once deployed as a web service; evaluate and tune multiple engine variants systematically; unify data from multiple platforms in batch or in real-time for comprehensive predictive analytics; speed up machine learning modeling with systematic processes and pre-built evaluation measures; support machine learning and data processing libraries such as Spark MLLib and OpenNLP; implement your own machine learning models and seamlessly incorporate them into your engine; simplify data infrastructure management.

Downloads: 0 This Week

Last Update: 2021-05-10
See Project
24

MARC/Perl

Perl libraries for processing MARC records

MARC/Perl (formerly known as MARC.pm) is a project to develop Perl libraries to process MARC (MAchine Readable Cataloging) data.

1 Review

Downloads: 0 This Week

Last Update: 2016-05-10
See Project
25

XMLPipeDB

XMLPipeDB is a suite of tools for building relational databases from XML sources with minimal manual processing of the data. While the applicability is general, our motivation was to facilitate the management of biological data from different sources.

Downloads: 2 This Week

Last Update: 2015-06-16
See Project