pnumpy is a very lightweight implementation of distributed arrays in python, which is based on numpy and mpi4py. pnumpy supports arrays in any number of dimensions. pnumpy works seemlessly with numpy slicing operators ufunc, etc., making it easy to transition from numpy arrays to using pnumpy arrays for better multicore performance.
pnumpy can accelerate your python code on multicore platforms by executing array operations in parallel. Each process holds its own array, parts of which can be accessed from other processes. You can choose which parts of the array are exposed to other processes. For convenience, a ghosted array class is provided for this purpose.
Simple examples using pnumpy
Letting pnumpy do the domain decomposition
How to apply the Laplacian operator to a field
How to apply a reduction operation across all processors
Wiki: PnumpyDomainDecomp
Wiki: PnumpyExample
Wiki: PnumpyLaplacian