
Efficient use of preconditioners on CUDA (ILUT)

Peter
2014-08-20
2014-08-20
  • Peter

    Peter - 2014-08-20

    I am trying to build a simulator which requires solving the system

    A * x = b

    The best iterative solver / preconditioner combination is BiCGStab with ILUT, which often solves this system in only a handful of iterations. The bottleneck of the calculation therefore appears to be building the preconditioner, which is done on the CPU as far as I understand.

    The above system is solved over and over again, with only the values of the coefficients of A changing but not the structure of A. Within ViennaCL, are there facilities that would let me split the preconditioner into an analysis phase and a solve phase, so that I could do the analysis once and then only do solves, and thus speed up the calculations?

    Also, are there any plans (for the immediate future) to improve the parallelization of preconditioners?

     
  • Karl Rupp

    Karl Rupp - 2014-08-20

    Hi Peter,

    yes, you are right, all ILU preconditioners are currently built on the CPU because of their sequential nature. Some parallel approaches have been suggested in the literature recently, but we haven't implemented them yet. Also, ILU0 currently has a much faster setup time than ILUT, so ILU0 might be worth a try for your system, even if it involves a higher iteration count. Generally, it has been observed that the smallest time-to-solution on GPUs is obtained at higher iteration counts than the smallest time-to-solution on CPUs. In other words, it usually pays off to trade some solver iterations for better parallelism.

    As for reusing the structure: With ILUT the structure may change depending on the values of A, so what you are asking for is ILU with a static pattern obtained from the first ILUT run. We don't have a ready-to-go interface for this in ViennaCL yet, but I can recommend a different approach: Try to reuse the ILUT preconditioner built for the first solve in the subsequent solves. Only recompute the preconditioner if the number of solver iterations required exceeds a certain threshold (e.g. twice the iterations from the initial setup).
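
    The reuse strategy described above can be sketched as a small bookkeeping helper. This is only an illustration under stated assumptions, not ViennaCL code: `PreconditionerReusePolicy` and `solve_step` are hypothetical names, and the `build` and `solve` callables stand in for whatever constructs the ILUT preconditioner and runs BiCGStab in your code. The only thing assumed about the solver is that it reports its iteration count after each solve (in ViennaCL this should be available from the solver tag via `iters()`).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of ViennaCL): tracks solver iteration counts
// and flags when a cached preconditioner should be rebuilt.
class PreconditionerReusePolicy {
public:
    // factor = 2.0 corresponds to "twice the iterations from the initial setup".
    explicit PreconditionerReusePolicy(double factor = 2.0) : factor_(factor) {
        assert(factor >= 1.0);
    }

    // True before the first solve and whenever the last solve was too slow.
    bool needs_rebuild() const { return needs_rebuild_; }

    // Call right after the preconditioner has been (re)built.
    void mark_rebuilt() {
        needs_rebuild_ = false;
        have_baseline_ = false;
    }

    // Call after every solve with the number of iterations it took.
    void record_iterations(std::size_t iterations) {
        if (!have_baseline_) {
            // First solve with a fresh preconditioner defines the baseline.
            baseline_ = std::max<std::size_t>(iterations, 1);
            have_baseline_ = true;
            return;
        }
        if (static_cast<double>(iterations) >
            factor_ * static_cast<double>(baseline_)) {
            needs_rebuild_ = true;
        }
    }

    std::size_t baseline() const { return baseline_; }

private:
    double factor_;
    std::size_t baseline_ = 0;
    bool have_baseline_ = false;
    bool needs_rebuild_ = true;
};

// One step of the simulation loop. `build` rebuilds the preconditioner from
// the current matrix values; `solve` runs the iterative solver with the
// cached preconditioner and returns the iteration count. Both are supplied
// by the caller.
template <typename BuildFn, typename SolveFn>
std::size_t solve_step(PreconditionerReusePolicy& policy,
                       BuildFn&& build, SolveFn&& solve) {
    if (policy.needs_rebuild()) {
        build();
        policy.mark_rebuilt();
    }
    const std::size_t iterations = solve();
    policy.record_iterations(iterations);
    return iterations;
}
```

    With this policy, a solve that takes too many iterations still completes with the old preconditioner; the rebuild happens before the next solve. Whether a factor of 2 is a good threshold depends on how expensive the ILUT setup is relative to one solver iteration for your matrices.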

    Best regards,
    Karli

     
  • Peter

    Peter - 2014-08-20

    Yeah, wasn't thinking about the changing structure of ILUT due to the threshold, you're right there. I'll try out your suggested approaches, thanks!

     
