Ballast is a tool for balancing user load across SSH servers that supports user-specific policies as well as traditional policies based on metrics such as CPU load. It includes a simple client, a lightweight data server, and a data collection agent.
Save is a lightweight framework for creating high availability systems based on Mon with extensions for authenticated heartbeats and IP address takeover. Save also provides validation, synchronization, and archival of configuration and other files.
Scalable, distributed monitoring system for high-performance computing
Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids. It is based on a hierarchical design targeted at federations of clusters, and it supports clusters of up to 2000 nodes.
GGI stands for "General Graphics Interface", and it is a project that aims to develop a reliable, stable and fast graphics system that works everywhere. We want to allow any program using GGI to run on any platform requiring at most a recompile.
NOTE: UBMoD is no longer actively developed and we encourage you to download Open XDMoD as a replacement for UBMoD. Open XDMoD has been developed to provide detailed information on resource utilization and performance for academic and industrial HPC centers. More information is available at http://xdmod.sourceforge.net/
UBMoD is a data warehouse and web portal for mining statistical data from resource managers in high-performance computing environments. UBMoD presents resource...
PBS Job Seer is a program that analyses the state of a PBS/Maui batch system and automatically generates a batch script for an MPI job that can start immediately.
pdo allows users to run SSH commands on multiple remote hosts in parallel, or to run local commands (such as rsync) against multiple remote hosts in parallel.
CfE stands for “Clusters for Everyone” and is an effort to make a Linux distribution tailored for clusters. For more information, please see http://www.matteocicuttin.it/?page_id=101
metahelper is a utility which makes creating and maintaining upgradeable, removable, and verifiable configuration "metapackages" easy. A metapackage works with files owned by other packages to customize and configure them specifically for your environmen
SimParEx executes a program (command) on many computers (a farm) in parallel and collects the results (task farming). Major features: minimal requirements (TCP, SSH, Perl), flexible task definition, web interface.
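As a rough illustration of the task-farming idea described for SimParEx (run a command per machine in parallel, then collect the results), here is a minimal Python sketch. It is not SimParEx code or syntax: the names `run_task` and `farm` are hypothetical, and the demo uses local stand-in commands so it runs without any SSH setup. For real remote hosts, each task could plausibly be an argv such as `["ssh", host, "uptime"]`, which is an assumption about how one might adapt it.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_task(argv, timeout=60):
    """Run one command and return (exit_code, stdout, stderr)."""
    done = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    return done.returncode, done.stdout, done.stderr


def farm(tasks, max_workers=8):
    """Run every task in parallel; return {task_name: (exit_code, stdout, stderr)}.

    `tasks` maps a task name to an argv list (hypothetical format, chosen
    for this sketch only).
    """
    names = list(tasks)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with names.
        results = list(pool.map(run_task, (tasks[n] for n in names)))
    return dict(zip(names, results))


if __name__ == "__main__":
    # Local stand-in commands, one per pretend node.
    demo = {
        f"node{i}": [sys.executable, "-c", f"print({i} * {i})"]
        for i in range(4)
    }
    for name, (code, out, _err) in farm(demo).items():
        print(name, code, out.strip())
```

Threads are sufficient here because each worker only waits on a child process; the actual work happens in the spawned commands.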
Job accounting and user and project management for clustered computing using Perl, Apache, and MySQL. Designed to be extensible, it currently processes logs from PBS, OpenPBS, and Maui2. It also includes a web front end for user and project management.
General-purpose, cross-platform, high-latency, high availability (HA) daemon. HA is achieved by IP takeover of the virtual/HA server(s) by one or more real/redundant servers. Requires Perl 5. Designed for ease of deployment (for an HA system :-).
VACM-Perl is a module harness that allows Perl modules to be written for VACM. This site will also serve as a repository for open source VACM modules built with VACM-Perl.