80 projects for "distributed computing" with 2 filters applied:

  • 1
    Parallax

    Parallax is a distributed model serving framework

    Parallax is a decentralized inference framework designed to run large language models across distributed computing resources. Instead of relying on centralized GPU clusters in data centers, the system allows multiple heterogeneous machines to collaborate in serving AI inference workloads. Parallax divides model layers across different nodes and dynamically coordinates them to form a complete inference pipeline. A two-stage scheduling architecture determines how model layers are allocated to available hardware and how requests are routed across nodes during execution. ...
    Downloads: 3 This Week
  • 2
    Kubeflow Trainer

    Distributed AI Model Training and LLM Fine-Tuning on Kubernetes

    ...The platform supports a wide range of machine learning frameworks, including PyTorch, JAX, Hugging Face, DeepSpeed, and XGBoost, making it highly flexible for different AI use cases. One of its key innovations is the integration of MPI-based distributed computing within Kubernetes, allowing efficient communication between nodes for high-performance training. It also includes advanced scheduling capabilities through integrations with tools like Kueue and Volcano, enabling topology-aware resource allocation and multi-cluster job orchestration.
    Downloads: 4 This Week
  • 3
    Xtuner

    A Next-Generation Training Engine Built for Ultra-Large MoE Models

    Xtuner is a large-scale training engine designed for efficient training and fine-tuning of modern large language models, particularly mixture-of-experts architectures. The framework focuses on enabling scalable training for extremely large models while maintaining efficiency across distributed computing environments. Unlike traditional 3D parallel training strategies, XTuner introduces optimized parallelism techniques that simplify scaling and reduce system complexity when training massive models. The engine supports training models with hundreds of billions of parameters and enables long-context training with sequence lengths reaching tens of thousands of tokens. ...
    Downloads: 2 This Week
  • 4
    Matrix

    Multi-Agent daTa geneRation Infra and eXperimentation framework

    Matrix is a distributed, large-scale engine for multi-agent synthetic data generation and experiments: it provides the infrastructure to run thousands of “agentic” workflows concurrently (e.g. multiple LLMs interacting, reasoning, generating content, data-processing pipelines) by leveraging distributed computing (like Ray + cluster management). The idea is to treat data generation as a “data-to-data” transformation: each input item defines a task, and the runtime orchestrates asynchronous, peer-to-peer agent workflows, avoiding global synchronization bottlenecks. ...
    Downloads: 0 This Week
  • 5
    mlforecast

    Scalable machine learning for time series forecasting

    ...It supports multi-series forecasting, meaning you can train one model that forecasts many time series at once (common in retail, demand forecasting, etc.), rather than one model per series. The library is built to scale: behind the scenes, it can leverage distributed computing frameworks (Spark, Dask, Ray) when datasets or the number of series grow large.
    Downloads: 5 This Week
  • 6
    HolmesGPT

    CNCF Sandbox Project

    HolmesGPT is an open-source AI agent designed to help DevOps and site reliability engineering teams diagnose and resolve production incidents. The system aggregates signals from observability tools such as logs, metrics, alerts, and distributed traces, then analyzes them using large language models to identify potential root causes. Rather than requiring engineers to manually correlate large volumes of monitoring data, HolmesGPT automatically synthesizes evidence and presents explanations in natural language. The project is developed by Robusta and has been accepted as a Cloud Native Computing Foundation Sandbox project, highlighting its relevance to the cloud-native ecosystem. ...
    Downloads: 8 This Week
  • 7
    EvoTorch

    Advanced evolutionary computation library built on top of PyTorch

    EvoTorch is an evolutionary optimization framework built on top of PyTorch, developed by NNAISENSE. It is designed for large-scale optimization problems, particularly those that require evolutionary algorithms rather than gradient-based methods.
    Downloads: 0 This Week
  • 8
    Apache Hamilton

    Helps data scientists define testable self-documenting dataflows

    Apache Hamilton is an open-source Python framework designed to simplify the creation and management of dataflows used in analytics, machine learning pipelines, and data engineering workflows. The framework enables developers to define data transformations as simple Python functions, where each function represents a node in a dataflow graph and its parameters define dependencies on other nodes. Hamilton automatically analyzes these functions and constructs a directed acyclic graph...
    Downloads: 4 This Week
  • 9
    Chitu

    High-performance inference framework for large language models

    ...The framework focuses on improving efficiency, flexibility, and scalability for organizations that need to run LLM inference workloads across different hardware platforms. It supports heterogeneous computing environments, including CPUs, GPUs, and various specialized AI accelerators, allowing models to run across a wide range of infrastructure configurations. Chitu is designed to scale from small single-machine deployments to large distributed clusters that handle high volumes of concurrent inference requests. The system also includes performance optimizations for large models, including support for quantized formats and efficient computation operators that reduce memory usage and latency. ...
    Downloads: 7 This Week
  • 10
    TAME LLM

    Traditional Mandarin LLMs for Taiwan

    TAME LLM is an open-source initiative focused on building and releasing large language models optimized for Traditional Mandarin and the linguistic context of Taiwan. The project includes models such as Llama-3-Taiwan-70B, which are fine-tuned versions of large transformer architectures trained on extensive corpora containing both Traditional Mandarin and English text. These models are designed to support applications such as conversational AI, knowledge retrieval, and domain-specific...
    Downloads: 0 This Week
  • 11

    FRODO 2

    Open-Source Framework for Distributed Constraint Optimization (DCOP)

    FRODO is a Java platform to solve Distributed Constraint Satisfaction Problems (DisCSPs) and Optimization Problems (DCOPs). It provides implementations for a variety of algorithms, including DPOP (and its variants), ADOPT, SynchBB, DSA...
    Downloads: 2 This Week
  • 12
    Bandicoot

    fast C++ library for GPU linear algebra & scientific computing

    * Fast GPU linear algebra library (matrix maths) for the C++ language, aiming towards a good balance between speed and ease of use
    * Provides high-level syntax and functionality deliberately similar to Matlab
    * Provides an API that is aiming to be compatible with Armadillo for easy transition between CPU and GPU linear algebra code
    * Useful for algorithm development directly in C++, or quick conversion of research code into production environments
    * Distributed under the permissive...
    Downloads: 8 This Week
  • 13
    LLM Applications

    A comprehensive guide to building RAG-based LLM applications

    ...It provides step-by-step guidance for constructing systems that ingest documents, split them into chunks, generate embeddings, index them in vector databases, and retrieve relevant context during inference. The repository also shows how these components can be scaled and deployed using distributed computing frameworks such as Ray. In addition to development workflows, the project includes notebooks, datasets, and evaluation tools that help developers experiment with different retrieval strategies and model configurations.
    Downloads: 0 This Week
  • 14
    Mars Framework

    Mars is a tensor-based unified framework for large-scale data

    Mars is a distributed computing framework designed to scale scientific computing and data science workloads across large clusters while preserving the familiar programming interfaces of common Python libraries. The project provides a tensor-based execution model that extends the capabilities of tools such as NumPy, pandas, and scikit-learn so that large datasets can be processed in parallel without rewriting code for distributed environments.
    Downloads: 7 This Week
  • 15

    Agentopia

    Java5 mobile agents in peer2peer containers without stubs/skeletons.

    Agentopia is a programming framework (API) for Java 5 mobile agents in peer-to-peer networks. Main features: routing around firewalls, anonymity, and very simple authoring of new agents. No RMI, no CORBA, just plain Java bytecode loading.
    Downloads: 0 This Week
  • 16
    spark-ml-source-analysis

    Spark ml algorithm principle analysis and specific source code

    ...Each section discusses both the mathematical principles behind the algorithms and how Spark implements them in a distributed computing environment. By studying these implementations, readers gain insight into how large-scale machine learning pipelines operate across distributed data systems.
    Downloads: 0 This Week
  • 17
    Jadex is a Belief Desire Intention (BDI) reasoning engine that allows for programming intelligent software agents in XML and Java. The reasoning engine is very flexible.
    Downloads: 0 This Week
  • 18
    Prolog+CG is a Java implementation of Prolog with extensions implementing a subset of the Conceptual Graph (CG) theory of John Sowa. CGs are first-class datatypes on a par with terms. Object oriented extensions are also included.
    Downloads: 0 This Week
  • 19
    P3: The Portable Unix Programming System

    Multi-process homeostatic software agent library

    PUPS/P3 facilitates development of multi-process, multi-host computations by providing tools to emulate colonies of homeostatic organisms. It permits persistent computation, homeostatic resource protection, and asynchronous interprocess communication.
    Downloads: 0 This Week
  • 20
    DMTK

    Microsoft Distributed Machine Learning Toolkit

    The Microsoft Distributed Machine Learning Toolkit (DMTK) is an open-source framework created to support scalable machine learning across distributed computing environments. Developed by Microsoft Research, the toolkit provides infrastructure and algorithms designed to train large models efficiently on clusters of machines rather than a single system.
    Downloads: 0 This Week
  • 21
    H2O-3

    H2O is an Open Source, Distributed, Fast & Scalable Machine Learning

    H2O-3 is an open-source machine learning platform designed to build scalable and distributed machine learning models across large datasets. The system operates as an in-memory computing platform that allows data scientists to train models quickly using distributed resources. It supports many machine learning algorithms including generalized linear models, gradient boosting machines, deep learning networks, and ensemble techniques.
    Downloads: 0 This Week
  • 22
    CArtAgO is a framework for programming and executing virtual environments in multi-agent programs.
    Downloads: 0 This Week
  • 23
    MARF is a general cross-platform framework with a collection of algorithms for audio (voice, speech, and sound) and natural language text analysis and recognition along with sample applications (identification, NLP, etc.) of its use, implemented in Java.
    Downloads: 13 This Week
  • 24
    HERAKLES is a reasoning broker framework for OWL (Web Ontology Language) reasoning systems.
    Downloads: 0 This Week
  • 25
    YARP - Yet Another Robot Platform
    YARP, Yet Another Robot Platform. Always dreamt of controlling a humanoid robot? ...well, we do that. A C++ library for IPC, vision and control. This project was migrated to GitHub: https://github.com/robotology/yarp
    Downloads: 15 This Week