For performance reasons, it would be useful to
have fast / shallow access to rows and columns
of a matrix.
The current main need is to have something like
this running very fast:
int D = 100; int N = 100000;
mat X(D,N); vec h(N); vec acc(D);
...
acc = 0.0;
for(int i=0;i<N;i++) acc += h(i) * X.get_col(i);
Right now the .get_col() function creates a new
vector (including memory allocation) and then
copies the column's elements into the new vector.
In the above code fragment this is very wasteful,
as the temporary vector is used for only one
operation after which it is discarded. This
creation/destruction is repeated for every
iteration of the loop.
To avoid copies, the following approach could be
used: Let's say we have a new function
X.shallow_col(), which creates a new vector,
but without actually allocating the memory for
the data. The vector's data would simply be
pointing to the corresponding column in the matrix.
This would of course involve modifying the vector
class, though only lightly. I think at most it
would involve a flag which indicates whether
memory allocation/freeing should be done,
and also a new initialisation function, where
a pointer to the data is given.
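To make the flag idea concrete, here is a minimal sketch. It assumes the matrix stores its elements column-major, so that a column is one contiguous run of memory. SketchVec, owns_data() and the free function shallow_col() are hypothetical names made up for this illustration; they are not the library's actual classes or signatures.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical vector with an ownership flag; not the library's actual vec class.
class SketchVec {
public:
    // Owning constructor: allocates and zero-initialises n elements.
    explicit SketchVec(std::size_t n)
        : data_(new double[n]()), size_(n), owns_(true) {}

    // Shallow constructor: borrows n elements starting at ptr, no allocation.
    SketchVec(double* ptr, std::size_t n)
        : data_(ptr), size_(n), owns_(false) {}

    // Moving transfers the pointer and the ownership flag.
    SketchVec(SketchVec&& other) noexcept
        : data_(other.data_), size_(other.size_), owns_(other.owns_) {
        other.data_ = nullptr;
        other.size_ = 0;
        other.owns_ = false;
    }

    // Copying is disabled to keep the ownership rules of the sketch simple.
    SketchVec(const SketchVec&) = delete;
    SketchVec& operator=(const SketchVec&) = delete;

    // Only an owning vector frees its storage.
    ~SketchVec() { if (owns_) delete[] data_; }

    double& operator()(std::size_t i) { assert(i < size_); return data_[i]; }
    double operator()(std::size_t i) const { assert(i < size_); return data_[i]; }
    std::size_t size() const { return size_; }
    bool owns_data() const { return owns_; }

private:
    double* data_;
    std::size_t size_;
    bool owns_;
};

// Assumes column-major storage with `rows` elements per column,
// so column j starts at offset j * rows. No elements are copied.
inline SketchVec shallow_col(double* storage, std::size_t rows, std::size_t j) {
    return SketchVec(storage + j * rows, rows);
}
```

Writing through the shallow vector writes straight into the matrix storage, and the shallow vector must not outlive the matrix it points into.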
While in the above approach copying is omitted,
there is still the issue of continual creation
and destruction of the temporary vector.
Hence to speed things up even further, it would
be useful to have an extended .get_col() function,
which modifies the data pointer of a given shallow
vector. A rough example:
shallow_vec tmp;
for(int i=0;i<N;i++) {
X.get_col(i,tmp);
acc += h(i) * tmp;
}
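A slightly fuller sketch of the same loop, written against stand-in types so that it compiles on its own. ColView, SketchMat and weighted_col_sum() are hypothetical names used only for illustration, and column-major storage is assumed. The point is that the view is constructed once and get_col(j, tmp) only updates a pointer and a length.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical non-owning view of a contiguous run of doubles.
struct ColView {
    const double* data = nullptr;
    std::size_t size = 0;
};

// Hypothetical column-major matrix; column j occupies [j*rows, (j+1)*rows).
struct SketchMat {
    std::size_t rows, cols;
    std::vector<double> storage;

    SketchMat(std::size_t r, std::size_t c)
        : rows(r), cols(c), storage(r * c, 0.0) {}

    double& operator()(std::size_t i, std::size_t j) {
        return storage[j * rows + i];
    }

    // Two-argument get_col: repoints an existing view instead of
    // creating a new vector.
    void get_col(std::size_t j, ColView& out) const {
        assert(j < cols);
        out.data = storage.data() + j * rows;
        out.size = rows;
    }
};

// Computes acc = sum over j of h[j] * X(:, j), rebinding one view per column.
inline std::vector<double> weighted_col_sum(const SketchMat& X,
                                            const std::vector<double>& h) {
    assert(h.size() == X.cols);
    std::vector<double> acc(X.rows, 0.0);
    ColView tmp;  // constructed once, outside the loop
    for (std::size_t j = 0; j < X.cols; ++j) {
        X.get_col(j, tmp);  // pointer update only, no allocation or copy
        for (std::size_t i = 0; i < tmp.size; ++i) {
            acc[i] += h[j] * tmp.data[i];
        }
    }
    return acc;
}
```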
Your thoughts?
Good point,
Alternatively, we could have a member function, say cvec *get_colptr(int col), which gives pointer access to a column of the matrix. But this would be at the user's risk. For example, if by mistake I do something like this:
cmat X(N,M);
cvec *tmp;
tmp = X.get_colptr(3);
// some processing on the column elements is OK..
// but something like this..
tmp->set_length(K);
// would corrupt the matrix's internal allocation.
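One possible way around that risk, sketched under the same column-major assumption, is to hand out a small view type that offers element access only and has no set_length() at all, so the mistake above fails to compile instead of corrupting memory. FixedColView, GuardedMat and col_view() are hypothetical names for this illustration, not existing library calls.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-length view: element access only, no set_length().
class FixedColView {
public:
    FixedColView(double* p, std::size_t n) : data_(p), size_(n) {}
    double& operator()(std::size_t i) { assert(i < size_); return data_[i]; }
    std::size_t size() const { return size_; }

private:
    double* data_;
    std::size_t size_;
};

// Hypothetical column-major matrix handing out fixed-length column views.
class GuardedMat {
public:
    GuardedMat(std::size_t r, std::size_t c)
        : rows_(r), cols_(c), storage_(r * c, 0.0) {}

    // Returns a view of column j; the matrix keeps ownership of the data.
    FixedColView col_view(std::size_t j) {
        assert(j < cols_);
        return FixedColView(storage_.data() + j * rows_, rows_);
    }

    double operator()(std::size_t i, std::size_t j) const {
        return storage_[j * rows_ + i];
    }

private:
    std::size_t rows_, cols_;
    std::vector<double> storage_;
};
```

With this, tmp(2) = 7.5 on a view of column 3 updates X(2, 3) in place, while tmp.set_length(K) is simply not available. The view still must not be used after the matrix is destroyed or resized.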