Suppose a matrix is represented as an stxxl::vector that stores its entries row by row. I'm struggling to implement a more efficient algorithm for transposing it. This is what I have so far:

    template <typename MatType>
    void transpose(const MatType& myMat, MatType& transposed)
    {
        transposed.rowNo = myMat.colNo;
        transposed.colNo = myMat.rowNo;
        transposed.mat.resize(transposed.rowNo * transposed.colNo);

        typename MatType::vt::const_iterator myFiter = myMat.mat.begin();
        typename MatType::vt::iterator mySiter = transposed.mat.begin();

        unsigned long long k = 0;
        unsigned long long t = 0;
        int shift = 0;

        while (t < transposed.colNo)
        {
            while (k < transposed.rowNo)
            {
                *(mySiter + (k * transposed.colNo + shift)) = *myFiter;
                k++;
                myFiter += 1;
            }
            k = 0;
            t++;
            shift++;
        }
    }
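For reference, here is a minimal sketch of the kind of matrix type the function expects, together with the same loop written with explicit indices. The names `Matrix` and `transposeIndexed` are hypothetical and only for illustration, and `std::vector` stands in for `stxxl::vector` so the snippet compiles without STXXL. It shows the access pattern: the source is read sequentially, while each write lands `transposed.colNo` entries away from the previous one, which is presumably the costly part for a disk-backed vector.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the matrix type used in the question.
// Assumption: in the real code `vt` is an stxxl::vector, not std::vector.
template <typename T>
struct Matrix {
    typedef std::vector<T> vt;
    unsigned long long rowNo;
    unsigned long long colNo;
    vt mat;  // row-major: entry (r, c) is stored at index r * colNo + c
};

// Same access pattern as the iterator-based loop, with explicit indices:
// sequential reads from src, strided writes into dst.
template <typename MatType>
void transposeIndexed(const MatType& src, MatType& dst)
{
    dst.rowNo = src.colNo;
    dst.colNo = src.rowNo;
    dst.mat.resize(dst.rowNo * dst.colNo);

    for (unsigned long long r = 0; r < src.rowNo; ++r) {
        for (unsigned long long c = 0; c < src.colNo; ++c) {
            // Source entry (r, c) becomes destination entry (c, r).
            dst.mat[c * dst.colNo + r] = src.mat[r * src.colNo + c];
        }
    }
}
```

With a 2 x 3 input holding 1..6 in row-major order, the destination ends up as a 3 x 2 matrix whose storage reads 1, 4, 2, 5, 3, 6.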

What would be another, possibly faster, approach to this operation? Note that I don't have experience with external-memory algorithms. Any help is welcome.