Search Results for "distributed shared memory" - Page 2

Showing 65 open source projects for "distributed shared memory"


Filtered by programming language: Python

1

NeuralForecast

Scalable and user-friendly neural forecasting algorithms.

NeuralForecast offers a large collection of neural forecasting models focusing on their performance, usability, and robustness. The models range from classic networks like RNNs to the latest transformers: MLP, LSTM, GRU, RNN, TCN, TimesNet, BiTCN, DeepAR, NBEATS, NBEATSx, NHITS, TiDE, DeepNPTS, TSMixer, TSMixerx, MLPMultivariate, DLinear, NLinear, TFT, Informer, AutoFormer, FedFormer, PatchTST, iTransformer, StemGNN, and TimeLLM. There is a shared belief in Neural forecasting methods'...

Downloads: 0 This Week

Last Update: 2026-05-06
2

LingBot-World

Advancing Open-source World Models

LingBot-World is an open-source, high-fidelity world simulator designed to advance the state of world models through video generation. Built on top of Wan2.2, it enables realistic, dynamic environment simulation across diverse styles, including real-world, scientific, and stylized domains. LingBot-World supports long-term temporal consistency, maintaining coherent scenes and interactions over minute-level horizons. With real-time interactivity and sub-second latency at 16 FPS, it is...

Downloads: 4 This Week

Last Update: 5 days ago
3

MobileLLM

MobileLLM: Optimizing Sub-billion Parameter Language Models

MobileLLM is a lightweight large language model (LLM) framework developed by Facebook Research, optimized for on-device deployment where computational and memory efficiency are critical. Introduced in the ICML 2024 paper “MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases”, it focuses on delivering strong reasoning and generalization capabilities in models under one billion parameters. The framework integrates several architectural innovations—SwiGLU...

Downloads: 1 This Week

Last Update: 3 days ago
4

Swarms

Enterprise multi-agent orchestration framework for scalable AI apps

...Swarms also includes mechanisms for agent lifecycle management, memory handling, and dynamic composition, making it adaptable to evolving workloads. Additionally, it focuses on developer productivity through APIs, CLI tools, and templates that simplify building and deploying agent-based applications.

Downloads: 0 This Week

Last Update: 2026-03-17
5

Chitu

High-performance inference framework for large language models

...It supports heterogeneous computing environments, including CPUs, GPUs, and various specialized AI accelerators, allowing models to run across a wide range of infrastructure configurations. Chitu is designed to scale from small single-machine deployments to large distributed clusters that handle high volumes of concurrent inference requests. The system also includes performance optimizations for large models, including support for quantized formats and efficient computation operators that reduce memory usage and latency. Its architecture aims to support enterprise adoption by ensuring stable long-term operation under production workloads.

Downloads: 0 This Week

Last Update: 6 days ago
6

DeepSpeed

Deep learning optimization library: makes distributed training easy

DeepSpeed is an easy-to-use deep learning optimization software suite that enables unprecedented scale and speed for deep learning training and inference. With DeepSpeed you can:
1. Train or run inference on dense or sparse models with billions or trillions of parameters
2. Achieve excellent system throughput and efficiently scale to thousands of GPUs
3. Train or run inference on resource-constrained GPU systems
4. Achieve unprecedented low latency and high throughput for inference
5. Achieve extreme...

Downloads: 0 This Week

Last Update: 2026-05-06
7

fftw++

FFTW++ is a C++ header class for the FFTW Fast Fourier Transform library that automates memory allocation, alignment, planning, wisdom, and communication on both serial and parallel (OpenMP/MPI) architectures. In 2D and 3D, hybrid dealiasing of convolutions substantially reduces memory usage and computation time. Wrappers for C, Python, and Fortran are included.

1 Review

Downloads: 0 This Week

Last Update: 2025-07-24
8

FastChat

Open platform for training, serving, and evaluating language models

FastChat is an open platform for training, serving, and evaluating large language model-based chatbots. If you do not have enough memory, you can enable 8-bit compression by adding --load-8bit to the FastChat commands. This can reduce memory usage by around half with slightly degraded model quality. It is compatible with the CPU, GPU, and Metal backends. Vicuna-13B with 8-bit compression can run on a single NVIDIA 3090/4080/T4/V100(16GB) GPU. In addition to that, you can add --cpu-offloading to...

Downloads: 1 This Week

Last Update: 2024-02-11
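
The "around half" figure in the FastChat description can be checked with back-of-the-envelope arithmetic. The sketch below is not FastChat code: it assumes the uncompressed baseline stores weights in 16 bits, counts only the weights (no activations or cache), and uses a nominal 13 billion parameters for a "13B" model. The helper name weight_memory_gib is hypothetical.

```python
def weight_memory_gib(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights only, in GiB (assumption: no overhead)."""
    return num_params * bits_per_weight / 8 / 2**30


# Nominal parameter count for a "13B" model (assumption, not an exact figure).
params_13b = 13e9

fp16_gib = weight_memory_gib(params_13b, 16)  # assumed 16-bit baseline
int8_gib = weight_memory_gib(params_13b, 8)   # 8-bit compressed weights

print(f"16-bit weights: {fp16_gib:.1f} GiB")
print(f" 8-bit weights: {int8_gib:.1f} GiB")
```

Under these assumptions the weights shrink from roughly 24 GiB to roughly 12 GiB, which is consistent with the description's statement that the 8-bit model fits on a 16GB GPU.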
9

Punica

Serving multiple LoRA fine-tuned LLMs as one

Punica is a system designed to efficiently serve multiple LoRA-fine-tuned large language models within a shared GPU environment. LoRA is a parameter-efficient fine-tuning method that allows developers to adapt large pretrained models to specific tasks by adding lightweight adapter layers rather than retraining the entire model. Punica introduces a serving architecture that allows multiple LoRA adapters to share the same base model during inference, significantly reducing memory consumption and computational overhead. ...

Downloads: 0 This Week

Last Update: 2026-03-09
10

Neural Tangents

Fast and Easy Infinite Neural Networks in Python

...The library closely mirrors JAX’s stax API while extending it to return a kernel_fn alongside init_fn and apply_fn, enabling drop-in workflows for kernel computation. Kernel evaluation is highly optimized for speed and memory, and computations can be automatically distributed across accelerators with near-linear scaling.

Downloads: 0 This Week

Last Update: 2025-10-10
11

NuPIC

Numenta platform for intelligent computing

The Numenta Platform for Intelligent Computing (NuPIC) is a machine intelligence platform that implements the HTM learning algorithms. HTM is a detailed computational theory of the neocortex. At the core of HTM are time-based continuous learning algorithms that store and recall spatial and temporal patterns. NuPIC is suited to a variety of problems, particularly anomaly detection and prediction of streaming data sources. For more information, see numenta.org or the NuPIC Forum. If you want...

Downloads: 0 This Week

Last Update: 2023-08-31
12

Metaseq

Repo for external large-scale work

Metaseq is a flexible, high-performance framework for training and serving large-scale sequence models, such as language models, translation systems, and instruction-tuned LLMs. Built on top of PyTorch, it provides distributed training, model sharding, mixed-precision computation, and memory-efficient checkpointing to support models with hundreds of billions of parameters. The framework was used internally at Meta to train models like OPT (Open Pre-trained Transformer) and serves as a reference implementation for scaling transformer architectures efficiently across GPUs and nodes. ...

Downloads: 0 This Week

Last Update: 2025-10-06
13

ParlAI

A framework for training and evaluating AI models

...The library integrates tightly with PyTorch and supports both generative and retrieval-augmented models, along with utilities for multitask training and model selection. A large set of built-in tasks and dataset loaders (with consistent preprocessing and metrics) makes it easy to compare methods under shared conditions. Tools for distributed training, mixed precision, and model zoos help scale experiments from laptops to multi-GPU clusters.

Downloads: 0 This Week

Last Update: 2025-10-06
14

gpustat

A simple command-line utility for querying and monitoring GPU status

...It serves as a simplified alternative to the more verbose nvidia-smi tool by presenting key GPU metrics in a compact, developer-friendly format. The utility retrieves data through NVIDIA’s NVML bindings and displays information such as temperature, utilization, memory usage, and running processes directly in the terminal. Because it is easy to install via pip and requires minimal configuration, gpustat is widely used in machine learning environments, research clusters, and shared GPU servers. The tool also supports watch mode for continuous monitoring and JSON output for integration into automation pipelines. ...

Downloads: 0 This Week

Last Update: 2026-03-02
15

Mars Framework

Mars is a tensor-based unified framework for large-scale data

Mars is a distributed computing framework designed to scale scientific computing and data science workloads across large clusters while preserving the familiar programming interfaces of common Python libraries. The project provides a tensor-based execution model that extends the capabilities of tools such as NumPy, pandas, and scikit-learn so that large datasets can be processed in parallel without rewriting code for distributed environments.

Downloads: 0 This Week

Last Update: 2026-03-11
16

FairScale

PyTorch extensions for high performance and large scale training

...It introduced Fully Sharded Data Parallel (FSDP) style techniques that shard model parameters, gradients, and optimizer states across ranks to fit bigger models into the same memory budget. The library also provides pipeline parallelism, activation checkpointing, mixed precision, optimizer state sharding (OSS), and auto-wrapping policies that reduce boilerplate in complex distributed setups. Its components are modular, so teams can adopt just the sharding optimizer or the pipeline engine without rewriting their training loop. ...

Downloads: 0 This Week

Last Update: 2025-10-07
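
To illustrate what sharding optimizer state across ranks means in the FairScale description, here is a minimal sketch that does not use FairScale. It assumes a simple greedy policy (largest parameter first, always onto the least-loaded rank); the function shard_by_size and the parameter names and sizes are hypothetical.

```python
from typing import Dict, List


def shard_by_size(param_sizes: Dict[str, int], world_size: int) -> List[List[str]]:
    """Assign each parameter's optimizer state to one rank, balancing total size."""
    shards: List[List[str]] = [[] for _ in range(world_size)]
    loads = [0] * world_size
    # Visit the largest parameters first so the greedy choice balances better.
    for name, size in sorted(param_sizes.items(), key=lambda kv: kv[1], reverse=True):
        rank = loads.index(min(loads))  # least-loaded rank so far
        shards[rank].append(name)
        loads[rank] += size
    return shards


# Hypothetical parameter groups with element counts.
param_sizes = {"embed": 1000, "attn": 400, "mlp": 600, "head": 200}
shards = shard_by_size(param_sizes, world_size=2)
print(shards)
```

Each rank then keeps optimizer state only for the parameters in its own shard, which is why the per-rank memory footprint drops as the number of ranks grows.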
17

TensorFlowOnSpark

TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters

By combining salient features from the TensorFlow deep learning framework with Apache Spark and Apache Hadoop, TensorFlowOnSpark enables distributed deep learning on a cluster of GPU and CPU servers. It supports both distributed TensorFlow training and inference on Spark clusters, with the goal of minimizing the number of code changes required to run existing TensorFlow programs on a shared grid.

Downloads: 0 This Week

Last Update: 2024-08-05
18

Apache MXNet (incubating)

A flexible and efficient library for deep learning

Apache MXNet is an open source deep learning framework designed for efficient and flexible research prototyping and production. It contains a dynamic dependency scheduler that automatically parallelizes both symbolic and imperative operations. On top of this is a graph optimization layer, overall making MXNet highly efficient yet still portable, lightweight and scalable.

Downloads: 0 This Week

Last Update: 2023-12-13
19

SimSiam

PyTorch implementation of SimSiam

...The repository provides scripts for both unsupervised pre-training and linear evaluation, using a ResNet-50 backbone by default. It is compatible with multi-GPU distributed training and can be fine-tuned or transferred to downstream tasks like object detection following the same setup as MoCo.

Downloads: 2 This Week

Last Update: 20 hours ago
20

PyText

A natural language modeling framework based on PyTorch

...It achieves this by providing simple and extensible interfaces and abstractions for model components, and by using PyTorch’s capabilities of exporting models for inference via the optimized Caffe2 execution engine. We use PyText at Facebook to iterate quickly on new modeling ideas and then seamlessly ship them at scale. Features include distributed-training support built on the C10d backend in PyTorch 1.0, mixed-precision training through APEX (faster training with less GPU memory on NVIDIA Tensor Cores), and extensible components that allow easy creation of new models and tasks.

Downloads: 0 This Week

Last Update: 2021-08-31
21

BPF Performance Tools

Official repository for the BPF Performance Tools book

BPF Performance Tools Book is the companion repository for Brendan Gregg’s book on Linux performance analysis using eBPF and BCC tracing technologies. The project contains scripts, examples, and reference material that demonstrate how to inspect kernel behavior, application performance, CPU usage, networking activity, file systems, and system bottlenecks in real time. It serves as both an educational resource and a practical toolkit for Linux engineers, SREs, and performance analysts working...

Downloads: 0 This Week

Last Update: 2026-05-06
22

BytePS

A high performance and generic framework for distributed DNN training

BytePS is a high-performance, general-purpose distributed training framework. It supports TensorFlow, Keras, PyTorch, and MXNet, and can run on either TCP or RDMA networks. According to the project, BytePS outperforms existing open-source distributed training frameworks by a large margin: on BERT-large training, for example, it achieves ~90% scaling efficiency with 256 GPUs, which the project reports as much higher than Horovod+NCCL.

Downloads: 0 This Week

Last Update: 2022-08-04
23

Image Super-Resolution (ISR)

Super-scale your images and run experiments with Residual Dense

The goal of this project is to upscale and improve the quality of low-resolution images. This project contains Keras implementations of different Residual Dense Networks for Single Image Super-Resolution (ISR) as well as scripts to train these networks using content and adversarial loss components. Docker scripts and Google Colab notebooks are available to carry out training and prediction. Also, we provide scripts to facilitate training on the cloud with AWS and Nvidia-docker with only a few...

Downloads: 3 This Week

Last Update: 2022-03-31
24

PyTorch-BigGraph

Generate embeddings from large-scale graph-structured data

PyTorch-BigGraph (PBG) is a system for learning embeddings on massive graphs—think billions of nodes and edges—using partitioning and distributed training to keep memory and compute tractable. It shards entities into partitions and buckets edges so that each training pass only touches a small slice of parameters, which drastically reduces peak RAM and enables horizontal scaling across machines. PBG supports multi-relation graphs (knowledge graphs) with relation-specific scoring functions, negative sampling strategies, and typed entities, making it suitable for link prediction and retrieval. ...

Downloads: 0 This Week

Last Update: 2025-10-07
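
The partition-and-bucket idea in the PyTorch-BigGraph description can be sketched in a few lines of plain Python. This is not PBG's API: it assumes entities are assigned to partitions by a simple modulo rule, and the names partition_of and bucket_edges are hypothetical.

```python
from collections import defaultdict


def partition_of(entity_id: int, num_partitions: int) -> int:
    """Assumed partitioning rule: entity id modulo the number of partitions."""
    return entity_id % num_partitions


def bucket_edges(edges, num_partitions):
    """Group edges by (source partition, destination partition)."""
    buckets = defaultdict(list)
    for src, dst in edges:
        key = (partition_of(src, num_partitions), partition_of(dst, num_partitions))
        buckets[key].append((src, dst))
    return dict(buckets)


# Toy edge list of (source entity, destination entity) pairs.
edges = [(0, 1), (2, 3), (4, 6), (5, 7), (1, 4)]
buckets = bucket_edges(edges, num_partitions=2)

for key, bucket in sorted(buckets.items()):
    print(key, bucket)
```

Training on one bucket at a time needs only the embeddings of its two partitions in memory, which is the property the description credits for the reduced peak RAM.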
25

Texar

Toolkit for Machine Learning, Natural Language Processing

Texar is a toolkit that aims to support a broad set of machine learning tasks, especially natural language processing and text generation. Texar provides a library of easy-to-use ML modules and functionalities for composing arbitrary models and algorithms. The tool is designed for both researchers and practitioners for fast prototyping and experimentation. Texar was originally developed by, and is actively contributed to by, Petuum and CMU in collaboration with other institutes. A mirror of this...

Downloads: 0 This Week

Last Update: 2022-08-08