## Re: [Libmesh-users] Support for Matrix-free algorithms

 Re: [Libmesh-users] Support for Matrix-free algorithms From: Benjamin Kirk - 2007-12-15 01:05:23 ```>> In libMesh this is as simple as providing a residual function and then >> using the -ksp_matrix_free or -ksp_mf_operator (I think) options to PETSc. > > I understand this. But my concern is how would you compute the local > contributions to the mat-vec product ? Since I do not want to store the > global stiffness or mass matrices, the question then becomes, what is the > "overhead" in writing a user routine to perform local assemblies with local > mass, stiffness for each DOF to get the nonlinear residual functional ? And > how does this operation scale in parallel ? Oh... That part is easy. In all the examples we show, we compute an element stiffness matrix and then assemble the global stffness matrix: K = Am(K_e) & f = Av(f_e) where Am(), Av() are notional matrix and vector assembly operators. Similarly, let the element residual be r_e = K_e u_e - f_e and the global residual is r = Av(r_e). So in the matrix free approach we just calculate the global residual from local contributions, and never form a global matrix. Now, of course global matrix assembly is expensive in terms of storage, but the vector assembly is not. For a vector of length N on P processors, each processor stores ~O(N/P) entries. And very few elements actually need to be communicated. In 2D the parallel vector summation is often driven by network latency instead of bandwidth because the data transfers are small. > I pose this question since the matrix-free algorithm will not involve any > kind of assembling routine at all. All it would need is to know the local h, > p, and basis type to compute local mass, stiffness matrices to compute the > local residual (a value in the global residual vector corresponding to the > unknown). Agreed. > Also, I perfectly understand your concern about creating a global > preconditioner matrix for this method to be successful but I believe that a > framework can be written well to handle the matrix-free algorithm in a > purely recursive fashion thereby eliminating any need for a preconditioner > matrix. Just my 2 cents. > Any pointers to benchmark studies that you have done in the past regarding > the speed and scalability for different kinds of problems (diffusion, > convection dominated) would also be very useful to me. I'll see what I can pull up. PETSc's interface to hypre for diffusion problems allows for algebraic multigrid directly, which is hard to beat. > I would be happy to contribute to a sample code regarding a matrix-free > scheme with LibMesh and PETSc which will give me an opportunity to test out > different techniques I have in mind. I was thinking about using the Laplace-Young free surface model to put together an example problem using this functionality. The LY model is particularly well-suited since it is 2D, scalar, steady, and nonlinear, and well-suited to adaptivity. I hope to get started on it Monday - I'll be on travel and that's always a good time for me to code without the daily distractions. ;-) I should have something by the end of the week. -Ben ```

 Re: [Libmesh-users] Support for Matrix-free algorithms From: Vijay M - 2007-12-14 20:18:21 ```Ben, Thanks for the prompt reply ! > I assume you would like to compute the action of the matrix-vector product > in some Krylov subspace method? Yes. > In libMesh this is as simple as providing a residual function and then > using the -ksp_matrix_free or -ksp_mf_operator (I think) options to PETSc. I understand this. But my concern is how would you compute the local contributions to the mat-vec product ? Since I do not want to store the global stiffness or mass matrices, the question then becomes, what is the "overhead" in writing a user routine to perform local assemblies with local mass, stiffness for each DOF to get the nonlinear residual functional ? And how does this operation scale in parallel ? I pose this question since the matrix-free algorithm will not involve any kind of assembling routine at all. All it would need is to know the local h, p, and basis type to compute local mass, stiffness matrices to compute the local residual (a value in the global residual vector corresponding to the unknown). Also, I perfectly understand your concern about creating a global preconditioner matrix for this method to be successful but I believe that a framework can be written well to handle the matrix-free algorithm in a purely recursive fashion thereby eliminating any need for a preconditioner matrix. Just my 2 cents. Any pointers to benchmark studies that you have done in the past regarding the speed and scalability for different kinds of problems (diffusion, convection dominated) would also be very useful to me. I would be happy to contribute to a sample code regarding a matrix-free scheme with LibMesh and PETSc which will give me an opportunity to test out different techniques I have in mind. I will be playing around with LibMesh in the coming few weeks and hope to write a pilot code soon. So if you have any suggestions or comments, please feel free to email me about them. Vijay -----Original Message----- From: Benjamin Kirk [mailto:benjamin.kirk@...] Sent: Friday, December 14, 2007 12:46 PM To: Vijay M; libmesh-users@...; Roy Stogner; John Peterson; Derek Gaston Subject: Re: [Libmesh-users] Support for Matrix-free algorithms > I am currently looking for a library that can work well with PETSc and can > provide me an FEM framework to handle a set of coupled nonlinear PDEs in 2 > and 3 dimensions. > > I hope to compare the usability of LibMesh and Deal II for this purpose. Glad to help! > 2) What additional code changes would a user be required to make in order to > get matrix-free solution algorithms to work with LibMesh ? For example, only > local assemblies need be performed and global assemblies that require matrix > to be stored are not needed. For an iterative linear Krylov solve, this > would be done at every matvec product request. I'll go ahead and comment on this one... I assume you would like to compute the action of the matrix-vector product in some Krylov subspace method? PETSc provides support for the notion of 'user-provided' matrix vector product, where the user returns the result y=Kx given the vector x. In the PETSc nonlinear solvers (which we support but there is no explicit example using them at the moment) the user can provide a 'residual' function which can be used to perform matrix-free newton-krylov simulations. In libMesh this is as simple as providing a residual function and then using the -ksp_matrix_free or -ksp_mf_operator (I think) options to PETSc. Of course, you will still need to come up with a quality preconditioner to use such a scheme, and doing so may still require a matrix, depending on your PDE. > If somebody can provide me with some pointers for the scalability results > and possibly a sample code for even a simple 1-D diffusion problem, that > would be fantastic. I could probably be persuaded to add ex19 for this purpose, it is well overdue. John/Roy/Derek, if y'all could point me to a Laplace-Young matrix assembly routine this would be a good case to build the example on! -Ben No virus found in this incoming message. Checked by AVG Free Edition. Version: 7.5.503 / Virus Database: 269.17.2/1184 - Release Date: 12/14/2007 11:29 AM No virus found in this outgoing message. Checked by AVG Free Edition. Version: 7.5.503 / Virus Database: 269.17.2/1184 - Release Date: 12/14/2007 11:29 AM ```
 Re: [Libmesh-users] Support for Matrix-free algorithms From: Benjamin Kirk - 2007-12-15 01:05:23 ```>> In libMesh this is as simple as providing a residual function and then >> using the -ksp_matrix_free or -ksp_mf_operator (I think) options to PETSc. > > I understand this. But my concern is how would you compute the local > contributions to the mat-vec product ? Since I do not want to store the > global stiffness or mass matrices, the question then becomes, what is the > "overhead" in writing a user routine to perform local assemblies with local > mass, stiffness for each DOF to get the nonlinear residual functional ? And > how does this operation scale in parallel ? Oh... That part is easy. In all the examples we show, we compute an element stiffness matrix and then assemble the global stffness matrix: K = Am(K_e) & f = Av(f_e) where Am(), Av() are notional matrix and vector assembly operators. Similarly, let the element residual be r_e = K_e u_e - f_e and the global residual is r = Av(r_e). So in the matrix free approach we just calculate the global residual from local contributions, and never form a global matrix. Now, of course global matrix assembly is expensive in terms of storage, but the vector assembly is not. For a vector of length N on P processors, each processor stores ~O(N/P) entries. And very few elements actually need to be communicated. In 2D the parallel vector summation is often driven by network latency instead of bandwidth because the data transfers are small. > I pose this question since the matrix-free algorithm will not involve any > kind of assembling routine at all. All it would need is to know the local h, > p, and basis type to compute local mass, stiffness matrices to compute the > local residual (a value in the global residual vector corresponding to the > unknown). Agreed. > Also, I perfectly understand your concern about creating a global > preconditioner matrix for this method to be successful but I believe that a > framework can be written well to handle the matrix-free algorithm in a > purely recursive fashion thereby eliminating any need for a preconditioner > matrix. Just my 2 cents. > Any pointers to benchmark studies that you have done in the past regarding > the speed and scalability for different kinds of problems (diffusion, > convection dominated) would also be very useful to me. I'll see what I can pull up. PETSc's interface to hypre for diffusion problems allows for algebraic multigrid directly, which is hard to beat. > I would be happy to contribute to a sample code regarding a matrix-free > scheme with LibMesh and PETSc which will give me an opportunity to test out > different techniques I have in mind. I was thinking about using the Laplace-Young free surface model to put together an example problem using this functionality. The LY model is particularly well-suited since it is 2D, scalar, steady, and nonlinear, and well-suited to adaptivity. I hope to get started on it Monday - I'll be on travel and that's always a good time for me to code without the daily distractions. ;-) I should have something by the end of the week. -Ben ```