ImplicitGlobalGrid is an outcome of a collaboration of the Swiss National Supercomputing Centre, ETH Zurich (Dr. Samuel Omlin) with Stanford University (Dr. Ludovic Räss) and the Swiss Geocomputing Centre (Prof. Yuri Podladchikov). It renders the distributed parallelization of stencil-based GPU and CPU applications on a regular staggered grid almost trivial and enables close to ideal weak scaling of real-world applications on thousands of GPUs [1, 2, 3]. ImplicitGlobalGrid relies on the Julia MPI wrapper (MPI.jl) to perform halo updates close to hardware limit and leverages CUDA-aware or ROCm-aware MPI for GPU-applications. The communication can straightforwardly be hidden behind computation [1, 3] (how this can be done automatically when using ParallelStencil.jl is shown in; a general approach particularly suited for CUDA C applications is explained in.
Features
- Multi-GPU with three functions
- 50-lines Multi-GPU example
- Straightforward in-situ visualization / monitoring
- Seamless interoperability with MPI.jl
- CUDA-aware/ROCm-aware MPI support
- Module documentation callable from the Julia REPL / IJulia