Matrix transpose

2010-07-16
2013-04-25
  • Nobody/Anonymous

    Supposing matrix is represented as stxxl::vector, storing entries row by row, I'm struggling to implement more efficient
    algorithm for matrix transpose. This is what I have so far:

    template <typename MatType>
    void transpose(const MatType& myMat, MatType& transposed) {
         transposed.rowNo=myMat.colNo;
         transposed.colNo=myMat.rowNo;
         transposed.mat.resize(transposed.rowNo*transposed.colNo);
         typename MatType::vt::const_iterator myFiter=myMat.mat.begin();
         typename MatType::vt::iterator mySiter=transposed.mat.begin();
         unsigned long long k=0;unsigned long long t=0;int shift=0;
         while(t<transposed.colNo) {
                 while(k<transposed.rowNo) {
                       *(mySiter+(k*transposed.colNo+shift))=*myFiter;
                       k++;myFiter+=1;
                 }
                 k=0;t++;shift++;
         }
    }
    

    What would be another, possibly faster approach to perform this operation? Note that I dont have experience with external
    memory algorithms. Any help on this is welcome.

     
  • Johannes Singler

    For I/O-efficient algorithms,

    See e.g. J.S.Vitter: Algorithms and Data Structures for External Memory, 2008 http://www.cs.duke.edu/~jsv/Papers/catalog/node81.html#Vit:I_Obook Chapter 7 is specifically about matrix operations, and the whole book is about I/O efficiency.

    A way supported by the STXXL would be to use sort for permutation.  Associate each matrix entry with its supposed vector rank after the tranpose, and sort these pairs.  Afterwards, strip off the additional rank from each entry.  You should use streaming (http://algo2.iti.kit.edu/stxxl/trunk/group__streampack.html) to keep overhead low.
    Using this method, you will be at least 3 times slower than the optimum, but it might still be much better than the naive algorihtm stated above.

    Johannes

     
  • Johannes Singler

    You might be interested in the fact that we now have an experimental STXXL branch that supports matrix multiplication.
    https://stxxl.svn.sourceforge.net/svnroot/stxxl/branches/matrix

    For installation, please follow the instructions given here:
    http://algo2.iti.kit.edu/stxxl/trunk/install-svn.html

    For an example, look into containers/test_matrix.cpp

    Our best results, using the Intel MKL for the internal BLAS routines and 8 cores on doubles, are as follows:
    800MB per matrix: ~40s
    80GB per matrix:  ~30000s
    So for 10GB per matrix, this could work in less than 1500s on a such powerful machine.

    Please be aware that the interface is far from being finished, and will change without further notice.

    Johannes

     

Log in to post a comment.

Get latest updates about Open Source Projects, Conferences and News.

Sign up for the SourceForge newsletter:





No, thanks