I have a signal processing application that SigPack seems perfect for. However, I want to take advantage of available GPU acceleration.
NVIDIA provides both BLAS and FFTW interface layers that make this trivial. However, I'm wondering whether some of the implementations in SigPack might defeat this. Granted, I'm very new to GPU programming (hence my desire for abstraction), but consider this implementation in the "window.h" header:
Is there no way to apply a function to the entire vector such that it can be GPU'd? Seems like this block would necessarily be bound to the CPU.
Hi, glad that you are considering SigPack for your project. Unfortunately, there is no GPU support at this time. Conrad Sanderson, the creator of Armadillo, has a project for Armadillo on the GPU (https://coot.sourceforge.io/), but I think it is in a "resting" state. Many of the functions in SigPack have potential for performance improvement, and my intention is to review this for the different blocks. In the meantime, I would recommend using OpenBLAS compiled natively on your computer; it has good multi-thread support and is almost as fast as Intel MKL.
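
On the question of applying a function to the entire vector: the "window.h" code is not reproduced in this thread, so the following is only a rough sketch of the general idea, not SigPack's implementation. It assumes a Hamming window with the usual 0.54/0.46 coefficients, and `hamming_whole_vector` is a hypothetical name. To stay self-contained it uses `std::valarray` from the standard library rather than Armadillo, and it runs on the CPU only. The point is the shape of the computation: one element-wise expression over an index vector instead of a per-sample loop.

```cpp
#include <cmath>
#include <cstddef>
#include <valarray>

// Hypothetical helper, not part of SigPack: Hamming window of length n
// written as a whole-array expression instead of a per-sample loop.
std::valarray<double> hamming_whole_vector(std::size_t n)
{
    if (n == 0) return std::valarray<double>();
    if (n == 1) return std::valarray<double>(1.0, 1); // one sample of value 1.0

    const double pi = std::acos(-1.0);

    // Index vector 0, 1, ..., n-1 (the only explicit loop in this sketch).
    std::valarray<double> k(n);
    for (std::size_t i = 0; i < n; ++i) k[i] = static_cast<double>(i);

    // Phase 2*pi*k/(n-1) for every sample at once.
    std::valarray<double> phase = k * (2.0 * pi / static_cast<double>(n - 1));

    // w[k] = 0.54 - 0.46 * cos(phase[k]), applied element-wise.
    std::valarray<double> w = 0.54 - 0.46 * std::cos(phase);
    return w;
}
```

Calling `hamming_whole_vector(5)` gives a symmetric window that starts and ends near 0.08 and peaks at 1.0 in the middle. Whether a GPU-backed library can offload an expression of this form depends on that library; nothing in this sketch does so by itself.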