Showing 24 open source projects for "processing"

  • 1
    Apache Spark

    A unified analytics engine for large-scale data processing

    Apache Spark is a unified engine for large-scale data processing, offering APIs for batch jobs, streaming, machine learning, and graph computation. It builds on resilient distributed datasets (RDDs) and the newer DataFrame/Dataset abstractions to provide fault-tolerant, in-memory computation across clusters. Spark’s execution engine handles scheduling, shuffles, caching, and data locality so users can focus on transformations rather than infrastructure plumbing.
    Downloads: 10 This Week
  • 2
    Spark NLP

    State of the Art Natural Language Processing

    Experience the power of large language models like never before, unleashing the full potential of Natural Language Processing (NLP) with Spark NLP, the open source library that delivers scalable LLMs. The full code base is open under the Apache 2.0 license, including pre-trained models and pipelines. The only NLP library built natively on Apache Spark. The most widely used NLP library in the enterprise. Spark ML provides a set of machine learning applications that can be built using two main components, estimators and transformers. ...
    Downloads: 4 This Week
  • 3
    Waves Platform Node

    Host connected to the Waves blockchain network

    ...Waves is an open source blockchain platform that offers a full blockchain ecosystem for building decentralised applications. Nodes are its critical components, performing several important functions such as processing and validating transactions, and generating and storing blocks. Nodes store full blockchain data, pass this data to other nodes, and check the validity of newly added blocks. Validation ensures that the blocks are all in the correct format, all hashes are computed correctly, that the new block contains the hash of the previous one, and that every transaction is validated and signed by the right parties.
    Downloads: 11 This Week
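
    The validation checks described above (correct hashes, and each new block carrying the hash of the previous one) can be sketched roughly as follows. This is a toy illustration, not the Waves node's actual block format or algorithm: the dictionary layout, the use of SHA-256 over JSON, and the names `block_hash`, `make_block`, and `is_valid_chain` are all assumptions, and signature checking is left out.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash a block's contents, excluding its own stored hash (assumed layout)."""
    payload = {k: v for k, v in block.items() if k != "hash"}
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def make_block(prev_hash: str, transactions: list) -> dict:
    """Build a block that records the previous block's hash and its own hash."""
    block = {"prev_hash": prev_hash, "transactions": transactions}
    block["hash"] = block_hash(block)
    return block


def is_valid_chain(chain: list) -> bool:
    """Check that every hash is correct and every block links to its predecessor."""
    for i, block in enumerate(chain):
        # The stored hash must match the recomputed hash of the contents.
        if block["hash"] != block_hash(block):
            return False
        # Every block after the first must contain the previous block's hash.
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False
    return True


genesis = make_block("0" * 64, ["alice->bob:5"])
second = make_block(genesis["hash"], ["bob->carol:2"])
chain = [genesis, second]
print(is_valid_chain(chain))
```

    Changing a transaction in an earlier block without recomputing its hash makes `is_valid_chain` reject the chain, which is the property the linkage check is meant to provide.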
  • 4
    Akka

    Build concurrent, distributed, and resilient message-driven apps

    ...Load balancing and adaptive routing across nodes. Event Sourcing and CQRS with Cluster Sharding. Distributed Data for eventual consistency using CRDTs. Asynchronous non-blocking stream processing with backpressure.
    Downloads: 5 This Week
  • 5
    Docspell

    Assist in organizing your piles of documents

    Docspell is a personal document organizer, sometimes called a "Document Management System" (DMS). You'll need a scanner to convert your papers into files. Docspell can then assist in organizing the resulting mess. It can unify your files from scanners, emails, and other sources. It is targeted at home use, i.e. families and households, and also at smaller groups or companies. You can associate tags, set correspondents, and add lots of other predefined and custom metadata. If your documents are...
    Downloads: 2 This Week
  • 6
    Scio

    A Scala API for Apache Beam and Google Cloud Dataflow

    Scio is a Scala API developed by Spotify that builds on Apache Beam to enable expressive batch and streaming data pipelines, optimized for running on Google Cloud Dataflow. Inspired by Spark and Scalding, it provides scalable, type‑safe, and production-grade data processing, with built-in support for BigQuery, Pub/Sub, Cassandra, Elasticsearch, Redis, TensorFlow IO, and more.
    Downloads: 7 This Week
  • 7
    FS2

    Compositional, streaming I/O library for Scala

    FS2 (“Functional Streams for Scala”) is a purely functional, effectful abstraction for stream processing on the JVM. Built on Cats Effect, it enables compositional resource-safe streaming workflows with robust error handling, back-pressure, pull/push semantics, and support for concurrent and interruptible pipelines.
    Downloads: 3 This Week
  • 8
    sbt

    sbt, the interactive build tool

    ...One example of a Scala-specific feature is the ability to cross-build your project against multiple Scala versions. build.sbt is a Scala-based DSL for expressing a parallel-processing task graph. Typos in build.sbt are caught as compilation errors. With the Zinc incremental compiler and file watch (~), the edit-compile-test loop is fast and incremental. Adding support for new tasks and platforms (like Scala.js) is as easy as writing build.sbt. Join 100+ community-maintained plugins to share and reuse sbt tasks. ...
    Downloads: 9 This Week
  • 9
    Http4s

    A minimal, idiomatic Scala interface for HTTP

    ...The pure functional side of Scala is favored to promote composability and easy reasoning about your code. I/O is managed through cats-effect. http4s is built on FS2, a streaming library that provides for processing and emitting large payloads in constant space and implementing websockets. http4s cross-builds for Scala.js and Scala Native. Share code and deploy to browsers, Node.js, native executable binaries, and the JVM.
    Downloads: 6 This Week
  • 10
    XiangShan

    Open-source high-performance RISC-V processor

    XiangShan is an open-source, high-performance RISC-V processor project that implements out-of-order superscalar cores using Chisel for hardware construction. The design targets modern performance goals—deep pipelines, speculative execution, multi-issue decode/execute, and sophisticated branch prediction—while remaining synthesizable for ASIC flows and portable to FPGAs for research. A modular microarchitecture separates frontend, backend, and memory subsystems with coherent caches and...
    Downloads: 0 This Week
  • 11
    SnappyData

    Memory optimized analytics database, based on Apache Spark

    ...SnappyData delivers high throughput, low latency, and high concurrency for a unified analytics workload. By fusing an in-memory hybrid database inside Apache Spark, it provides analytic query processing, mutability/transactions, access to virtually all big data sources and stream processing all in one unified cluster. One common use case for SnappyData is to provide analytics at interactive speeds over large volumes of data with minimal or no pre-processing of the dataset. For instance, there is no need to often pre-aggregate/reduce or generate cubes over your large data sets for ad-hoc visual analytics. ...
    Downloads: 2 This Week
  • 12
    CoolplaySpark

    Spark Cool Play: Spark source code analysis, Spark class library, etc.

    ...The project contains annotated examples, explanations, and exercises that guide learners through Spark’s architecture, execution model, and source code internals. It is particularly valuable for developers who want to strengthen their understanding of Spark by not only using it as a data processing engine but also exploring how its internals function. Through code analysis and commentary, CoolplaySpark helps readers connect theoretical concepts with practical implementation details. By combining book study with this repository, learners can develop both conceptual clarity and hands-on expertise in Spark’s core components.
    Downloads: 2 This Week
  • 13
    SZT-bigdata

    SZT‑bigdata is an open source project

    SZT‑bigdata is an open-source project that analyzes real Shenzhen metro (subway) card usage data using big‑data frameworks such as Spark, Hadoop, Hive, Kafka, Flink, ClickHouse, HBase, and Elasticsearch. It aims to explore transit passenger flow patterns and system optimization using a variety of Scala-based technologies.
    Downloads: 2 This Week
  • 14
    apache spark data pipeline osDQ

    osDQ subproject dedicated to creating Apache Spark based data pipelines using JSON

    This is an offshoot of the open source data quality (osDQ) project (https://sourceforge.net/projects/dataquality/). This subproject creates Apache Spark based data pipelines in which JSON-based metadata (a file) drives data processing, data pipeline, data quality, data preparation, and data modeling features for big data. It uses the Java API of Apache Spark and can also run in local mode. Get a JSON example at https://github.com/arrahtech/osdq-spark How to run: unzip the zip file. Windows: java -cp .\lib\*;osdq-spark-0.0.1.jar org.arrah.framework.spark.run.TransformRunner -c ....
    Downloads: 0 This Week
  • 15
    Cosmos DB Spark

    Apache Spark Connector for Azure Cosmos DB

    ...The connector allows you to easily read from and write to Azure Cosmos DB via Apache Spark DataFrames in Python and Scala. It also allows you to easily create a lambda architecture for batch processing, stream processing, and a serving layer while being globally replicated and minimizing the latency involved in working with big data.
    Downloads: 0 This Week
  • 16
    node2vec

    Learn continuous vector embeddings for nodes in a graph using biased R

    ...The algorithm is designed to learn continuous vector representations of nodes in a graph by simulating biased random walks and applying skip-gram models from natural language processing. These embeddings capture community structure as well as structural equivalence, enabling machine learning on graphs for tasks such as classification, clustering, and link prediction. The repository contains reference code accompanying the research paper node2vec: Scalable Feature Learning for Networks (KDD 2016). It allows researchers and practitioners to apply node2vec to various graph datasets and evaluate embedding quality on downstream tasks. ...
    Downloads: 1 This Week
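
    The biased random walk mentioned above can be sketched as follows for an unweighted graph. This is a minimal illustration, not the reference code from the repository: the function name `biased_walk`, the adjacency-dict graph format, and the sample graph are assumptions. The parameters `p` (return) and `q` (in-out) follow the roles given in the node2vec paper; the skip-gram training step that would consume these walks is not shown.

```python
import random


def biased_walk(graph, start, length, p=1.0, q=1.0, rng=None):
    """Second-order random walk on an unweighted graph (dict: node -> set of neighbors)."""
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        current = walk[-1]
        neighbors = sorted(graph[current])
        if not neighbors:
            break  # dead end: stop the walk early
        if len(walk) == 1:
            # The first step has no previous node, so pick uniformly.
            walk.append(rng.choice(neighbors))
            continue
        previous = walk[-2]
        weights = []
        for candidate in neighbors:
            if candidate == previous:
                weights.append(1.0 / p)  # step back to the previous node
            elif candidate in graph[previous]:
                weights.append(1.0)      # stay at distance 1 from the previous node
            else:
                weights.append(1.0 / q)  # move further away from the previous node
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


# Hypothetical toy graph, used only for illustration.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b"},
}
walk = biased_walk(graph, "a", length=6, p=0.5, q=2.0, rng=random.Random(0))
print(walk)
```

    A small `p` makes the walk more likely to backtrack, while a large `q` keeps it close to the previous node. Tuning the two lets the walks emphasise either local neighbourhoods or more distant structure.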
  • 17
    Summingbird

    Streaming MapReduce with Scalding and Storm

    Summingbird is a streaming + batch hybrid computation framework developed by Twitter. Its aim is to let developers express data aggregation pipelines in a unified way, where the same logic can run either in real time (stream) or in batch mode, and the results can be merged or reconciled. In effect, Summingbird abstracts over multiple execution engines (such as Storm, Scalding, etc.) to provide one high-level program that composes transformations and aggregations, and then executes them in...
    Downloads: 0 This Week
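
    The idea of writing aggregation logic once and merging batch and real-time results can be illustrated with a small sketch. This is not Summingbird's API (Summingbird is a Scala library); it only assumes that the aggregate is combined with an associative operation, here `Counter` addition, so partial results can be merged in any grouping. The name `count_words` and the sample data are hypothetical.

```python
from collections import Counter


def count_words(lines):
    """Aggregation logic written once and reused for both batch and streaming input."""
    total = Counter()
    for line in lines:
        total += Counter(line.split())
    return total


# Hypothetical inputs: historical data handled in batch, recent events as a stream.
batch_lines = ["spark storm", "storm scalding"]
stream_lines = ["storm", "spark"]

# Batch mode: process all historical lines in one go.
batch_result = count_words(batch_lines)

# Streaming mode: fold each event into a running aggregate as it arrives.
stream_result = Counter()
for event in stream_lines:
    stream_result += count_words([event])

# Because addition is associative, the two partial results merge cleanly.
merged = batch_result + stream_result
print(merged)
```

    The merged result equals what a single pass over all the data would produce, which is why the same logic can be split across a batch job and a streaming job.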
  • 18
    Apache PredictionIO

    Machine learning server for developers and ML engineers

    ...Quickly build and deploy an engine as a web service on production with customizable templates; respond to dynamic queries in real-time once deployed as a web service; evaluate and tune multiple engine variants systematically; unify data from multiple platforms in batch or in real-time for comprehensive predictive analytics; speed up machine learning modeling with systematic processes and pre-built evaluation measures; support machine learning and data processing libraries such as Spark MLLib and OpenNLP; implement your own machine learning models and seamlessly incorporate them into your engine; simplify data infrastructure management.
    Downloads: 0 This Week
  • 19
    Conky GUI
    Conky GUI eases the customization of Conky configuration files.
    Downloads: 1 This Week
  • 20
    Inhaler

    speed reading tool

    Inhaler is a speed reading tool programmed in Scala using Swing. It features variable reading speed and font size. It is licensed under the GPL.
    Downloads: 0 This Week
  • 21
    Math expression parser and evaluator written in Scala. Usable from Java (Sun JRE 1.6). Provides float, integral, boolean, and vector data types, plus some string processing support. Variables may be defined internally or imported and exported through a binding.
    Downloads: 0 This Week
  • 22

    Simuquant

    A quantum circuit simulator written in Scala

    Simuquant is made to construct and simulate universal quantum circuits. Its GUI enables intuitive access and direct graphical feedback, making it useful in classroom situations and the like. Simulation processing is done in parallel, utilizing the parallel collections introduced in Scala 2.9. This application was created as part of a bachelor's thesis at the Hamburg University of Applied Sciences. The thesis is available at: http://opus.haw-hamburg.de/volltexte/2012/1843/pdf/ba_dahl.pdf Future plans: operator grouping (enables repeated blocks) and an editable library.
    Downloads: 0 This Week
  • 23
    Carafe is an implementation of Conditional Random Fields and related algorithms targeted at text processing applications. The latest version, jCarafe, is implemented in Scala and runs on the JVM.
    Downloads: 0 This Week
  • 24
    RowScope
    An asynchronous file viewer for large files (hundreds of MBytes to GBytes)
    Downloads: 1 This Week