Showing 75 open source projects for "data processing"

  • 1

    MongoDB PHP Library

    The Official MongoDB PHP library

    ...Built on top of the underlying MongoDB PHP extension, the library handles serialization, connection pooling, and error handling in a way that feels natural in idiomatic PHP. It supports rich query expressions, bulk writes, change streams, transactions, and GridFS, making it suitable for everything from simple content apps to complex data processing services. The project also includes helpers for working with BSON types such as ObjectId, UTC datetime, and decimals, which helps bridge the gap between native PHP types and MongoDB’s storage model.
    Downloads: 8 This Week
  • 2

    ArangoDB-Community/pyArango

    Python Driver for ArangoDB with built-in validation

    PyArango is a Python driver for ArangoDB, a multi-model NoSQL database. It provides a Pythonic way to interact with ArangoDB, allowing developers to manage collections, execute AQL queries, and integrate ArangoDB's document, graph, and key-value storage models into Python applications.
    Downloads: 7 This Week
  • 3

    xsd2pgschema

    Relational database replication tool based on XML Schema

    xsd2pgschema is a Java application suite that converts XML Schema 1.1 (hierarchical data model) to PostgreSQL DDL (relational data model) and supports XML data migration into PostgreSQL based on the XML Schema without loss of information content. It also supports full-text indexing via either Apache Lucene or Sphinx Search utilizing the relational data model. File conversion from XML to CSV, TSV, or JSON is possible as well as mapping XML Schema to JSON Schema. Obtained PostgreSQL...
    Downloads: 3 This Week
  • 4

    CursusDB

    CursusDB is an open-source distributed in-memory database

    CursusDB is a time-series database built for high-performance analytics and data processing, optimized for handling large volumes of sequential data efficiently.
    Downloads: 8 This Week
  • 5

    RedisGraph

    A graph database as a Redis module

    A high-performance graph database module for Redis that enables fast graph processing and analytics using a query engine based on Cypher.
    Downloads: 6 This Week
  • 6

    GETL

    ETL engine based on Groovy

    Note: the repository has migrated to https://github.com/ascrus/getl. You can download the jar file from this site or from Maven. GETL is a Groovy-based package that automates loading and transforming data. Its name is an acronym for «Groovy ETL». GETL is a set of libraries of pre-built classes and objects that can be used to extract, transform, and load data in programs written in Groovy or Java, as well as from any software that supports the work with...
    Downloads: 2 This Week
  • 7

    Cetus

    Cetus is a high performance middleware that provides routing

    ...Cetus comes in two versions: read/write splitting and sharding (table sharding is a special form of database sharding). A lock-free multi-process design improves operating efficiency. It supports transparent backend connection pooling, SQL read/write splitting, data sharding, distributed transaction processing, batch insert operations, and conditional DISTINCT operations, and it offers enhanced SQL route parsing and injection.
    Downloads: 0 This Week
  • 8

    FeatureBase

    A crazy fast analytical database, built on bitmaps

    FeatureBase is an Open Source, in-memory, MLAP engine providing SQL support, real-time updates, and analytical processing for your growing data. A binary tree index improves the performance & efficiency of analytical queries by reducing I/O operation. Simple or complex, FeatureBase knocks it out in milliseconds. On-the-fly updates and deletes. Operate instantly on your freshest data without the need for preaggregation. Built on bitmaps, FeatureBase offers up to 5-10X reduction in storage footprint and 90% reduction in hardware footprint. ...
    Downloads: 0 This Week
  • 9

    Sqlite Index Blaster

    Create huge Sqlite indexes at breakneck speeds

    SQLite Blaster is an advanced SQLite extension that enhances database performance by enabling multi-threading, data compression, and memory optimizations. It is designed for applications that require fast local storage with improved query efficiency.
    Downloads: 0 This Week
  • 10

    SnappyData

    Memory optimized analytics database, based on Apache Spark

    ...SnappyData delivers high throughput, low latency, and high concurrency for a unified analytics workload. By fusing an in-memory hybrid database inside Apache Spark, it provides analytic query processing, mutability/transactions, access to virtually all big data sources, and stream processing, all in one unified cluster. One common use case for SnappyData is to provide analytics at interactive speeds over large volumes of data with minimal or no pre-processing of the dataset. For instance, there is often no need to pre-aggregate/reduce or generate cubes over your large data sets for ad-hoc visual analytics. ...
    Downloads: 5 This Week
  • 11

    eXist-db

    eXist-db is a feature rich Open Source native XML database

    eXist-db is a native XML database featuring efficient, index-based XQuery processing, extensions for keyword search, XUpdate support, XSLT support, XForms support, REST, and tight integration with existing XML development tools. Moved to GitHub - https://www.github.com/exist-db/exist
    Downloads: 26 This Week
  • 12

    Datatables.AspNet

    Microsoft AspNet bindings and automatic parsing for jQuery DataTables

    Formerly known as DataTables.Mvc, this project started with small objectives around 2014, aiming to provide intermediate and experienced developers a tool to avoid the boring process of handling DataTables parameters. More than a year later after a full rewrite, we are now proud to support Asp.net MVC, WebApi, and Asp.Net Core (full .NET Core support). Unit-testing is a priority to avoid breaking your app and every stable release should provide better and wider test cases. Datatables.AspNet...
    Downloads: 1 This Week
  • 13

    Demo Scene

    Scripts and samples to support Confluent Demos, Talks, and Blogs

    Demo Scene is a collection of resources and examples provided by Confluent Inc. to demonstrate the capabilities of Apache Kafka and its ecosystem. It includes various demos showcasing real-time data streaming, processing, and integration patterns.
    Downloads: 0 This Week
  • 14

    Heroic

    The Heroic Time Series Database

    Heroic is a scalable time-series database developed by Spotify, designed for real-time analytics and monitoring of large-scale systems.
    Downloads: 0 This Week
  • 15

    ksqlDB

    The database purpose-built for stream processing applications

    Build applications that respond immediately to events. Craft materialized views over streams. Receive real-time push updates, or pull current state on demand. Seamlessly leverage your existing Apache Kafka® infrastructure to deploy stream-processing workloads and bring powerful new capabilities to your applications. Use a familiar, lightweight syntax to pack a powerful punch. Capture, process, and serve queries using only SQL. No other languages or services are required. ksqlDB enables you...
    Downloads: 0 This Week
  • 16

    PipelineDB

    High-performance time-series aggregation for PostgreSQL

    PipelineDB is a PostgreSQL extension for continuous aggregation and stream processing. It allows users to define continuous queries that automatically process incoming data streams, storing results in materialized views. Designed for real-time analytics, PipelineDB extends PostgreSQL with stream-oriented features while maintaining compatibility with standard SQL and tooling.
    Downloads: 12 This Week
  • 17

    Cosmos DB Spark

    Apache Spark Connector for Azure Cosmos DB

    ...The connector allows you to easily read from and write to Azure Cosmos DB via Apache Spark DataFrames in Python and Scala. It also allows you to easily create a lambda architecture for batch-processing, stream-processing, and a serving layer while being globally replicated and minimizing the latency involved in working with big data.
    Downloads: 0 This Week
  • 18

    DataSink

    Take a JDBC ResultSet and stream it in one of the supported formats

    DataSink takes a JDBC ResultSet and streams it in a format of your choice. You can also zip the stream and send it over the network, if you want. DataSink currently implements the following table formats: DBF (the xBase file format), XHTML, and genericode. You can use it as an Ant task or directly from Java.
    Downloads: 0 This Week
  • 19
    Mondrian is an OLAP (online analytical processing) engine written in Java. It reads from JDBC data sources, aggregates data in a memory cache, and implements the MDX language and the olap4j and XML/A APIs.
    Downloads: 48 This Week
  • 20

    AvanceDB

    An in-memory database based on the CouchDB REST API

    AvanceDB is a high-performance, in-memory database designed to accelerate SQL-based applications. It uses advanced caching techniques to reduce database latency and improve query execution speed, making it ideal for real-time analytics and transactional workloads.
    Downloads: 0 This Week
  • 21

    SQLMate

    Rapidly generate a DAO for SQLite

    Complete source code, a usage example, and a code-generated test case are included in the .jar file. (See main.java for the usage / code generation example.)
    Downloads: 3 This Week
  • 22

    geog-server-embedded

    GeoG Embedded Server

    GeoG Embedded Server with GeoG's Own Database.
    Downloads: 0 This Week
  • 23

    Apache PredictionIO

    Machine learning server for developers and ML engineers

    Apache PredictionIO® is an open source Machine Learning Server built on top of a state-of-the-art open source stack for developers and data scientists to create predictive engines for any machine learning task. Quickly build and deploy an engine as a web service on production with customizable templates; respond to dynamic queries in real-time once deployed as a web service; evaluate and tune multiple engine variants systematically; unify data from multiple platforms in batch or in real-time for comprehensive predictive analytics; speed up machine learning modeling with systematic processes and pre-built evaluation measures; support machine learning and data processing libraries such as Spark MLLib and OpenNLP; implement your own machine learning models and seamlessly incorporate them into your engine; simplify data infrastructure management.
    Downloads: 0 This Week
  • 24

    MARC/Perl

    Perl libraries for processing MARC records

    MARC/Perl (formerly known as MARC.pm) is a project to develop Perl libraries to process MARC (MAchine Readable Cataloging) data.
    Downloads: 0 This Week
  • 25
    XMLPipeDB is a suite of tools for building relational databases from XML sources with minimal manual processing of the data. While the applicability is general, our motivation was to facilitate the management of biological data from different sources.
    Downloads: 2 This Week