Search Results for "distributed shared memory" - Page 3

173 projects for "distributed shared memory" with 1 filter applied:

  • 1
    Punica

    Serving multiple LoRA finetuned LLM as one

    Punica is a system designed to efficiently serve multiple LoRA-fine-tuned large language models within a shared GPU environment. LoRA is a parameter-efficient fine-tuning method that allows developers to adapt large pretrained models to specific tasks by adding lightweight adapter layers rather than retraining the entire model. Punica introduces a serving architecture that allows multiple LoRA adapters to share the same base model during inference, significantly reducing memory consumption and computational overhead. ...
    Downloads: 0 This Week
  • 2
    NSQ

    A realtime distributed messaging platform

    NSQ is a realtime distributed messaging platform that is designed to scale and can handle billions of messages daily. It promotes distributed and decentralized topologies, giving it high availability and fault tolerance along with guaranteed reliable message delivery. NSQ scales horizontally and is easy to configure and deploy. It is agnostic to data format, so messages can be in JSON, MsgPack, Protocol Buffers, or anything else. Official Go and Python libraries are available,...
    Downloads: 1 This Week
  • 3
    Neural Tangents

    Fast and Easy Infinite Neural Networks in Python

    ...The library closely mirrors JAX’s stax API while extending it to return a kernel_fn alongside init_fn and apply_fn, enabling drop-in workflows for kernel computation. Kernel evaluation is highly optimized for speed and memory, and computations can be automatically distributed across accelerators with near-linear scaling.
    Downloads: 0 This Week
  • 4

    ccgsl

    Use the GNU Scientific Library as if it were written in C++.

    The ccgsl provides simple C++ wrappers for the GNU Scientific Library. It uses Java-like shared-pointer classes in place of structs to avoid direct memory allocation/freeing and to work better with the STL. It lets you construct functions for optimisation, root-finding and the like from C++ member functions, making it easier to integrate with existing C++ code. It also provides C++ exceptions.
    Downloads: 1 This Week
  • 5
    Metaseq

    Repo for external large-scale work

    Metaseq is a flexible, high-performance framework for training and serving large-scale sequence models, such as language models, translation systems, and instruction-tuned LLMs. Built on top of PyTorch, it provides distributed training, model sharding, mixed-precision computation, and memory-efficient checkpointing to support models with hundreds of billions of parameters. The framework was used internally at Meta to train models like OPT (Open Pre-trained Transformer) and serves as a reference implementation for scaling transformer architectures efficiently across GPUs and nodes. ...
    Downloads: 0 This Week
  • 6

    GTB: Graphics Toolbox

    C++ libraries and apps for computer graphics and data visualization

    The Graphics Toolbox (GTB) is a collection of C++ libraries and apps for computer graphics and data visualization. Wagner Correa initially created GTB as part of his Ph.D. research at Princeton University in collaboration with Professor Claudio Silva and Dr. James Klosowski. Several other researchers later contributed to GTB (see the AUTHORS file).
    Downloads: 0 This Week
  • 7
    Mars Framework

    Mars is a tensor-based unified framework for large-scale data

    Mars is a distributed computing framework designed to scale scientific computing and data science workloads across large clusters while preserving the familiar programming interfaces of common Python libraries. The project provides a tensor-based execution model that extends the capabilities of tools such as NumPy, pandas, and scikit-learn so that large datasets can be processed in parallel without rewriting code for distributed environments.
    Downloads: 0 This Week
  • 8

    SimpleXlsxWriter

    C++ library for creating XLSX files for MS Excel 2007 and above.

    This library writes XLSX files for Microsoft Excel 2007 and above. Its main feature is that it uses C++ standard file streams. On the one hand, this keeps memory and CPU consumption almost unnoticeable during processing (which can be very useful when saving large data arrays); on the other hand, it makes it infeasible to edit data that has already been written. Hence, when using this library, the structure of the report must be known well enough in advance. The library is written in C++ using STL functionality and is based on the ZIP library (included), which has a free license: http://www.codeproject.com/Articles/7530/Zip-Utils-clean-elegant-simple-C-Win32 This library is distributed under the terms of the zlib license: http://www.zlib.net/zlib_license.html
    Downloads: 8 This Week
  • 9
    FairScale

    PyTorch extensions for high performance and large scale training

    ...It introduced Fully Sharded Data Parallel (FSDP) style techniques that shard model parameters, gradients, and optimizer states across ranks to fit bigger models into the same memory budget. The library also provides pipeline parallelism, activation checkpointing, mixed precision, optimizer state sharding (OSS), and auto-wrapping policies that reduce boilerplate in complex distributed setups. Its components are modular, so teams can adopt just the sharding optimizer or the pipeline engine without rewriting their training loop. ...
    Downloads: 0 This Week
  • 10
    Rocket Chip

    Rocket Chip Generator

    ...A diplomacy framework (LazyModules) lets designers wire components with negotiated parameters, enabling reuse and rapid exploration of different cache sizes, port counts, and memory hierarchies. The generator supports custom accelerators through the RoCC interface, allowing domain-specific compute units to be plugged into the pipeline with shared cache and memory semantics. Tooling integrates with FIRRTL, Verilator, and commercial EDA flows, and the ecosystem around Rocket Chip (e.g., Chipyard) adds harnesses, peripherals, and verification infrastructure.
    Downloads: 0 This Week
  • 11
    shmcat

    A tool to dump shared memory segments, files and text

    This is a simple tool that dumps shared memory segments (System V and POSIX), files and text. It might be useful when you have to debug programs that use shared memory.
    Downloads: 0 This Week
  • 12
    wholeaked

    Tool that embeds identifiers in files to trace leak sources

    wholeaked is an open source file distribution and tracking tool designed to help identify the source of leaked files. It works by generating unique versions of a file for each recipient and embedding identifying signatures or metadata into each distributed copy. If the file later appears in an unauthorized location, the embedded identifier can be analyzed to determine which recipient originally received that specific version. This approach allows organizations, researchers, or individuals to trace the origin of leaks in sensitive documents, binaries, or other shared resources. wholeaked automates the process of creating personalized file variants and associating them with a list of recipients. ...
    Downloads: 0 This Week
  • 13
    JAMon API

    Monitor Java applications - SQL, HTTP, Methods, Exceptions and more.

    JAMon API is a free, simple, high-performance, thread-safe Java API that allows developers to easily monitor the performance and scalability of production applications. JAMon tracks hits, execution times (total, avg, min, max, std dev), and more. * JAMon Users Manual: For more on JAMon, including installing, configuring, and using it, see http://jamonapi.sourceforge.net/. * Support: If you have any questions about usage please post a question on the forum at ...
    Downloads: 21 This Week
  • 14
    twemproxy

    A fast, light-weight proxy for memcached and redis

    twemproxy (pronounced "two-em-proxy"), aka nutcracker, is a fast and lightweight proxy for the memcached and redis protocols. It was built primarily to reduce the number of connections to the caching servers on the backend. This, together with protocol pipelining and sharding, enables you to horizontally scale your distributed caching architecture. Fast and lightweight. Maintains persistent server connections. Keeps connection count on the backend caching servers low. Enables pipelining of requests...
    Downloads: 0 This Week
  • 15
    BPF Performance Tools

    Official repository for the BPF Performance Tools book

    BPF Performance Tools Book is the companion repository for Brendan Gregg’s book on Linux performance analysis using eBPF and BCC tracing technologies. The project contains scripts, examples, and reference material that demonstrate how to inspect kernel behavior, application performance, CPU usage, networking activity, file systems, and system bottlenecks in real time. It serves as both an educational resource and a practical toolkit for Linux engineers, SREs, and performance analysts working...
    Downloads: 0 This Week
  • 16
    PyTorch-BigGraph

    Generate embeddings from large-scale graph-structured data

    PyTorch-BigGraph (PBG) is a system for learning embeddings on massive graphs—think billions of nodes and edges—using partitioning and distributed training to keep memory and compute tractable. It shards entities into partitions and buckets edges so that each training pass only touches a small slice of parameters, which drastically reduces peak RAM and enables horizontal scaling across machines. PBG supports multi-relation graphs (knowledge graphs) with relation-specific scoring functions, negative sampling strategies, and typed entities, making it suitable for link prediction and retrieval. ...
    Downloads: 0 This Week
  • 17
    ...Download pages, repositories, tickets and wiki have been moved to https://github.com/oss-tsukuba/
    Download: gfarm: https://github.com/oss-tsukuba/gfarm/releases; gfarm2fs: https://github.com/oss-tsukuba/gfarm2fs/releases; and more: https://github.com/oss-tsukuba/gfarm/wiki#download
    Repositories: gfarm: https://github.com/oss-tsukuba/gfarm; gfarm2fs: https://github.com/oss-tsukuba/gfarm2fs; and more: https://github.com/orgs/oss-tsukuba/repositories
    Tickets: gfarm: https://github.com/oss-tsukuba/gfarm/issues; gfarm2fs: https://github.com/oss-tsukuba/gfarm2fs/issues
    Wiki: https://github.com/oss-tsukuba/gfarm/wiki
    The Gfarm file system is a network shared file system that supports scalable I/O performance in a distributed environment. It can federate local disks of network-connected PCs and compute nodes in several clusters.
    Downloads: 0 This Week
  • 18
    ELF (Extensive Lightweight Framework)

    An End-To-End, Lightweight and Flexible Platform for Game Research

    ELF (Extensive, Lightweight, and Flexible) is a high-performance platform for reinforcement learning research that unifies simulation, data collection, and distributed training. A C++ core provides fast environments and concurrent actors, while Python bindings expose simple APIs for agents, replay, and optimization loops. It supports both single-agent and multi-agent settings, with batched stepping and shared-memory queues that keep GPUs saturated during training. ELF introduced widely used reference systems, most notably ELF OpenGo, demonstrating at-scale self-play with strong analysis tooling and public checkpoints. ...
    Downloads: 0 This Week
  • 19
    OpenSeq2Seq

    Toolkit for efficient experimentation with Speech Recognition

    ...Mixed-precision support (float16) is optimized for NVIDIA Volta and Turing GPUs, allowing significant speedups and memory savings without sacrificing model quality. The project comes with configuration-driven training scripts, documentation, and examples that demonstrate how to set up pipelines for tasks.
    Downloads: 0 This Week
  • 20
    JCSprout

    Basic, concurrent algorithm

    JCSprout is a curated learning path for Java engineers that mixes concise notes, diagrams, and runnable examples to cover core computer science and JVM topics. It walks readers through data structures and algorithms, networking fundamentals, Java concurrency, JVM memory model and GC, and common interview problem patterns. The repository emphasizes understanding over memorization, linking conceptual summaries with small code artifacts that can be compiled and profiled. It also highlights best...
    Downloads: 0 This Week
  • 21

    popt4jlib

    Parallel Optimization Library for Java

    ...A fast parallel implementation of the network simplex method and some full-fledged parallel/distributed MIP solvers will be added in the next version. In general, emphasis is placed on improving the efficiency of the algorithms in shared-memory models via Java threads, since multi-core machines are so widespread today.
    Downloads: 0 This Week
  • 22
    Lists all the processes, executables, and shared libraries that are using up virtual memory. It's helpful to see how the shared memory is used and which 'old' libs are loaded.
    Downloads: 804 This Week
  • 23
    P3: The Portable Unix Programming System

    Multi-process homeostatic software agent library

    PUPS/P3 facilitates development of multi-process, multi-host computations by providing tools to emulate colonies of homeostatic organisms. It permits persistent computation, homeostatic resource protection, and asynchronous interprocess communication.
    Downloads: 0 This Week
  • 24
    TORO kernel
    TORO demonstrates an innovative operating system design that runs both the kernel and the user application server at the same ring level. The threads of the user application server are distributed evenly across all CPUs and run independently in parallel. The memory model chosen is NUMA without pagination. During initialization, memory is divided proportionally among the processors installed in the system. When a thread needs memory, the memory allocator returns a free block chosen according to the CPU on which the thread is running. ...
    Downloads: 1 This Week
  • 25
    Saros - Distributed Party Programming
    Saros brings multi-writer synchronous distributed editing to the Eclipse IDE, e.g. for joint code reviews, explaining code remotely, or distributed pair programming -- all also for more than 2 participants; we call this Distributed Party Programming. It includes refined awareness functionality, text chat, and a simple distributed whiteboard/sketching facility. Eclipse Update Site: https://www.saros-project.org/update-site/eclipse
    Downloads: 0 This Week