lightweight GPU-based sparse matrix-vector multiplication (SpMV)
LightSpMV is a novel CUDA-compatible sparse matrix-vector multiplication (SpMv) algorithm using the standard compressed sparse row (CSR) storage format. We have evaluated LightSpMV using various sparse matrices and further compared it to the CSR-based SpMV subprograms in the state-of-the-art CUSP and cuSPARSE. Performance evaluation reveals that on a single Tesla K40c GPU, LightSpMV is superior to both CUSP and cuSPARSE, with a speedup of up to 2.60 and 2.63 over CUSP, and up to 1.93 and 1.79 over cuSPARSE for single and double precision, respectively.
Software made available by the Residual Analysis blog, primarily having to do with anthropogenic global warming, e.g. GHCN Processor.
Parallel pairwise correlation computation on Intel Xeon Phi clusters
The first parallel and distributed library for pairwise correlation/dependence computation on Intel Xeon Phi clusters. This library is written in C++ template classes and achieves high speed by exploring the SIMD-instruction-level and thread-level parallelism within Xeon Phis as well as accelerator-level parallelism among multiple Xeon Phis. To facilitate balanced workload distribution, we have proposed a general framework for symmetric all-pairs computation by building provable bijective functions between job identifier and coordinate space for the first time.