[atlas-devel] FLOP counting in DGELS (using QR decomposition)
From: José L. G. P. <jgp...@gm...> - 2013-12-20 15:01:18
Hello,

For benchmarking purposes, I need to compute the FLOPs spent by DGELS when it works with a square matrix (non-transposed mode) and only one right-hand side, i.e. the performance of DGELS using the QR decomposition. Inspecting the code, I see that the sequence of operations is as follows (I omit the functions that work only on B, because I use only one right-hand side):

DGELS = DLANGE + DLASCL + DGEQRF + DORM2R + DTRTRS

The individual operation counts are:

DLANGE: N^2? This function computes max(abs(A[i,j])), so all N^2 elements of the matrix are used, but only for comparison. Should I count each comparison as one FLOP?
DLASCL: N^2 in the best case. I say in the best case because the function contains an inner loop that can repeat the scaling pass.
DGEQRF: 4/3*N^3 + 2*N^2 + 14/3*N (considering a square NxN matrix), as stated in www.netlib.org/lapack/lawnspdf/lawn41.pdf
DORM2R: I don't know...
DTRTRS: N^2 for a single right-hand side (www.netlib.org/lapack/lawnspdf/lawn41.pdf)

Does anyone know an expression for the total count? Which formula should be used for DORM2R?

Thanks

--
*****************************************
José Luis García Pallero
jgp...@gm...
(o<
/ / \
V_/_
Use Debian GNU/Linux and enjoy!
*****************************************
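
To make the bookkeeping concrete, here is a minimal Python sketch that tallies the per-routine counts listed above for an N x N system with one right-hand side. The function name `dgels_square_flops` and its two flags are hypothetical, introduced only for illustration. The DGEQRF term is the square-matrix formula quoted from LAWN 41. The DORM2R term is an assumption that is not taken from the post: applying N Householder reflectors to a single vector costs roughly 4 flops per reflector entry, which gives about 2*N^2 to leading order. The DTRTRS term uses the usual back-substitution count of N^2 for one right-hand side. Whether comparisons in DLANGE count as FLOPs is left as a switch, since that is a convention rather than a fact.

```python
from fractions import Fraction


def dgels_square_flops(n, count_comparisons=False, scaled=False):
    """Rough FLOP estimate for DGELS on an n x n matrix with one right-hand side.

    Hypothetical helper: returns a dict with one entry per routine plus "total".
    """
    counts = {
        # DLANGE: n^2 comparisons, included only if comparisons are treated as FLOPs.
        "DLANGE": n * n if count_comparisons else 0,
        # DLASCL: one multiplication per element, best case, only if scaling is triggered.
        "DLASCL": n * n if scaled else 0,
        # DGEQRF: square-matrix count quoted from LAWN 41.
        "DGEQRF": Fraction(4, 3) * n**3 + 2 * n**2 + Fraction(14, 3) * n,
        # DORM2R: ASSUMPTION, leading-order term only (n reflectors applied to one vector).
        "DORM2R": 2 * n * n,
        # DTRTRS: back substitution for a single right-hand side.
        "DTRTRS": n * n,
    }
    counts["total"] = sum(counts.values())
    return counts


if __name__ == "__main__":
    result = dgels_square_flops(1000)
    for routine, flops in result.items():
        print(f"{routine:>7}: {float(flops):.6g}")
```

For large N the DGEQRF term dominates, so the choice of convention for DLANGE and the lower-order terms of DORM2R change the total only at the N^2 level.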