ViennaCL / Discussion / General Discussion: MIS2 AMG and lu

Alex Chow - 2016-10-13

Hi,

I am using the library to solve a pressure poisson equation (PPE) matrix as part of a large simulation, I do this every timestep. I have a sparse non-symmatric matrix and I know I've set it up correctly because I can solve it fine on the CPU using either the Jacobi preconditioner or the AMG preconditioner with Sequential Aggregation Coarsening. I am using BiCGStab for all simulations. Not a problem there.

My fully GPU code is able to solve the matrix every time without a problem using the Jacobi preconditioner.
However, when I try the parallel MIS-2 Aggregation AMG (for both CPU and GPU) it doesn't seem to work everytime. It is more often that it doesn't work at all.

I have looked at 2 things:

Using the original ViennaCL library code as it is (latest version) I can see that my error comes from when the preconditioner is applied to the residual (linalg\bicgstab.hpp, line 438), going into that further, the problem comes from the function, "lu_substitute" (linalg\amg.hpp, line 337). I get real numbers up until that point. Is it possible there might be a bug in the "lu_substitute" function?

I have a very basic understanding of the MIS-2 coarsening algorithm so please correct me if I'm wrong. The random weights assignment in the function, "amg_coarse_ag_stage1_mis2":
(linalg\host_based\amg_operations.hpp, line 429 for OPENMP, and linalg\cuda\amg_operations.hpp, line 297 for CUDA)

std::vector<unsigned int=""> random_weights(A.size1());
for (std::size_t i=0; i<random_weights.size(); ++i)="" random_weights<span="">[i] = static_cast<unsigned int="">(rand()) % static_cast<unsigned int="">(A.size1());

I understand this to be a randomized selection of the grid/matrix for a starting point to then narrow down the maximum independent set. And this "random" selection will give me a different preconditioner each time and that is maybe why I sometimes my solution converge or not work at all.

As an experiment I fixed the "random weight" to the columns of the main diagonal as I know for my matrix I will always have a non-zero value there.

for (std::size_t i=0; i<random_weights.size(); ++i)="" random_weights<span="">[i] = static_cast<unsigned int="">(i);

This actually allows me to run mysimulation for quite some time and it solves very nicely, but right up until a point and then the the solver breaksdown again as in point 1.

I appreciate it is difficult to resolve the problem without looking at my code, but I just wondered if you could give any insight on this.

Thanks,
Alex
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Karl Rupp - 2016-10-19

(copying my response via email; looks like it got stuck?)

Hi Alex,

is your system matrix indefinite? I don't remember all the details of the PPE matrix approach, but maybe there is a singular block in there which causes these problems?

@Pivoting:
The LU substitution does not use any pivoting. This can cause numerical problems if your system matrix is not positive or negative definite.

@Randomization:
Yes, the randomness is to increase (well, usually...) stability by (in statistical mean) avoiding a few ugly corner cases that may result with very specific matrix patterns.
The randomization is a symmetric reordering, so the diagonal entries remain on the diagonal (though at a different location on the diagonal).

@Solver breakdown:
Is there anything specific happening to the solution when the solver breaks down? Maybe the system becomes more stiff, etc.?

Best regards,
Karli

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Alex Chow - 2016-10-20

Hi Karli,

Thanks for the response. Perhaps I should've mentioned I have a rather uncommon problem in which my system is essentially a set of moving points, not a fixed well defined grid/mesh. So the matrix will be changing every timestep.

So yes, it is possible I may get a singular block every now and then. I realised that the MIS2 may just not work for a whole simulation, I just wanted to make sure there wasn't any bugs on my part or the libraries.

The solver breaks down because when the solver applies the preconditioner to the residual for the particular time step the solver breaks down. The lu_substitute returns -1.#IND for the residual vector. And thus the iterative solver won't work.

I just find it odd that the CPU AMG with Aggregation coarsening has no problem at all regarding this.

Thanks for your information on the different parts. Does ViennaCl have a function that allows me to calculate the eigenvalues of my matrix?

Thanks
Alex

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Karl Rupp - 2016-10-21

Thanks for the extra context. I think the first suspect is indeed just the lack of pivoting in LU_substitute. I think I can add a pivoted LU to AMG soon, so that should fix the problem.

Is it possible for you to provide a system (matrix plus right hand side vector) for which MIS2 currently breaks down? This would simplify testing for me tremendously.

As for eigenvalues: ViennaCL has routines for computing the largest eigenvalues of a symmetric sparse matrix: power iteration and the Lanczos algorithm. The OpenCL backend also includes eigenvalue computations for symmetric dense matrices, but that is in experimental stage.

Best regards,
Karli

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Alex Chow - 2016-10-21

Yes I think I can provide you with such a matrix, what format do you need it in? The matrix I'm using now is only relatively small, 176x176. The system works with Jacobi precoditioner and AMG sequential aggregation coarsening so it is setup correctly.

Last edit: Alex Chow 2016-10-21

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Karl Rupp - 2016-10-21

Thanks. Please use the matrix market format, for which there is a writer implemented in viennacl/io/matrix_market.hpp.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Alex Chow - 2016-10-21

I have my matrix set up in a crs viennacl container, am I right in thinking I now need to convert it to a ublas matrix in order for the writer to work?

This doesn't seem to work:

viennacl::compressed_matrix<ScalarType> vcl_compressed_matrix(ctx); vcl_compressed_matrix.set(&row[0],&col[0],&matrixa[0],ppedim,ppedim,nnz); viennacl::vector<ScalarType> vcl_vec(matrixb.size(),ctx); copy(matrixb,vcl_vec); //WRITE MATRIX FOR MATRIX MARKET FORMAT string Test_MIS2_Matrix="Test_MIS2_Matrix_LHS"; viennacl::io::write_matrix_market_file(vcl_compressed_matrix,Test_MIS2_Matrix,1);

Last edit: Alex Chow 2016-10-21
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Hi Karli,
Just a quick update which may change things.
I've created an output during the lu_substitute process and I think the error is coming from the creation of the amg preconditioner itself. Here is the code:

  template<typename MatrixT, typename VectorT>
  void lower_inplace_solve_vector(MatrixT & A, VectorT & b, vcl_size_t A_size,  bool unit_diagonal)
  {
    typedef typename VectorT::value_type   value_type;

    for (vcl_size_t i = 0; i < A_size; ++i)
    {
      for (vcl_size_t j = 0; j < i; ++j)
      {
        value_type A_element = A(i, j);
        std::cout<<i<< "\t"<<j<<"\t"<<b(i)<<"\t"<<A_element<<"\t"<< b(j)<<"\n";
        b(i) -= A_element * b(j);
        std::cout<<b(i)<<"\n";
        system("PAUSE");
      }

      if (!unit_diagonal){
          std::cout<<"diagonal"<<A(i,i)<<"\n";
        b(i) /= A(i, i);
        }
    }
  }

And one particular output is:

14 13 0 -1.#IND 0

Which means the non-real number is from A_elements, so I assume the preconditioner has set up incorrectly?

This is using the original randomization in the MIS2 Algorithm.

Last edit: Alex Chow 2016-10-21

AMG LU Substitute.png

Karl Rupp - 2016-10-21

try

std::vector<std::map<unsigned int, ScalarType> > temp; viennacl::copy(vcl_compressed_matrix, temp); viennacl::io::write_matrix_market_file(temp, filename);

for the matrix market writer.

I should be able to also reproduce the incorrect number on the matrix diagonal with your matrix.

Thanks and best regards,
Karli
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Alex Chow - 2016-10-21

It compiles fine, but I get this error:

Assertion failed: (cpu_matrx.size() == gpu_matrix.size1()) && bool("Size mismatch")

Viennacl/compressed_matrix.hpp, line 390.

I'm running all this on the CPU (including the problem and library), no GPU invovlement at the moment.

viennacl::compressed_matrix<ScalarType> vcl_compressed_matrix; vcl_compressed_matrix.set(&row[0],&col[0],&matrixa[0],ppedim,ppedim,nnz); string filename="Matrix_LHS"; std::vector<std::map<unsigned int, ScalarType> > temp; viennacl::copy(vcl_compressed_matrix, temp); viennacl::io::write_matrix_market_file(temp, filename);
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Karl Rupp - 2016-10-21

ah, please change

std::vector<std::map<unsigned int, ScalarType> > temp;

to

std::vector<std::map<unsigned int, ScalarType> > temp(vcl_compressed_matrix.size1());
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Alex Chow - 2016-10-21

That worked! Attached is the matrix and the RHS to go with it.

I couldn't get the matrix market writer to work with my RHS vector, so I wrote my own file for it, the first line is the size of the array, then it is followed by,
RowNumber Value

I managed to narrow down the error again and it occurs when the array coarsest_op is being calculed from LU_substitute, during the AMG setup, viennacl/linalg/amg.hpp line 245 within the function amg_lu

Thanks!

Last edit: Alex Chow 2016-10-21

Matrix_A.mtx

Matrix_RHS.mtx

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Karl Rupp - 2016-10-24

Thanks, I can reproduce the issue. I'll report back as soon as I know what is going on.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Alex Chow - 2016-10-25

Great, thanks

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Alex Chow - 2016-10-26

Hi Karl,

Just thought I'd update you on my thoughts. I am now thinking there might not be anything wrong with the LU factorisation, even without the pivoting I think it might be okay.

I've had a look at the different levels caluclated for the multigrid preconditioner. And I've noticed with the cases where my system does converge, the coarsest level has at least one non-zero entry on every row. For example:
If my coarsest level is a 130x130 matrix, there will be at least 130 non-zero values (one for each row on the main diagonal).

For when my systems DOES NOT converge there are rows which contain all zero elements. You can see this more clearly below.

I've taken the test matrix I sent you, then from the library I've extracted the values for the coarsest level of the multigrid BEFORE the LU factorization.

When I use, for the randomisation procedure:

std::vector<unsigned int=""> random_weights(A.size1());
for (std::size_t i=0; i<random_weights.size(); ++i)="" random_weights<span="">[i] = static_cast<unsigned int="">(rand()) % static_cast<unsigned int="">(A.size1());

I will see an output likes this for the coarsest level matrix:
(8, 68) 119.177
(13, 0) 17.7451
So between rows 8 and 13, there are all zero values on rows 9,10,11, and 12. I think this is what proceeds to cause the problem in the LU factorisation.

However if I change the random_weights[i] to be equal to i. I don't get the missing rows, I will get at least one non-zero value on every row.

The problem is that even though the second case (random_weights[i]=i;) works most of the time, it will show the same problem later on in the simulation (for a different system).

I'm not sure if this is something worth you taking note of but thought it might be good to mention.

Thanks,
Alex
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Karl Rupp - 2016-10-27

Okay, the failure of convergence is clear if the coarse grid operator becomes singular (e.g. via empty rows or columns). Does your system matrix have a symmetric pattern (i.e. if a_ij is nonzero, so is a_ji - though not necessarily the same value)?

Unfortunately I'm unable to check this myself at the moment, as I'm out of office until Saturday.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Alex Chow - 2016-10-27

It is not symmetric. No worries, thanks.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Alex Chow - 2016-11-04

I think I've solved it!

I think the current ViennaCL algorithm would work completely fine with meshed based methods, but with my meshless system the algorithm might need some more rigidity.

In order for my simulation to keep going and remain robust, (for the GPU algorithm)

I have to eliminate the randomisation on line 297 in linalg/cuda/amg_operations:

random_weights_ptr[i] = static_cast<unsigned int>(i);

and create a conditional if statement for when propogating the coarse indices on line 405 in linalg/cuda/amg_operations:

if(point_types[influenced_point_id]==viennacl::linalg::detail::amg::amg_level_context::POINT_TYPE_UNDECIDED)coarse_ids[influenced_point_id] = coarse_index;

The same applies to th CPU algorithm.

Last edit: Alex Chow 2016-11-04
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Karl Rupp - 2016-11-07

Thanks, Alex, this is very helpful. I still try to understand on a more abstract level what exactly goes wrong and which implications your suggested fix has. The randomization shouldn't make a difference on average, so maybe your fix isn't robust enough?

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

MIS2 AMG and lu_substitute

Linear algebra and solver library using CUDA, OpenCL, and OpenMP

Forums

Help

MIS2 AMG and lu_substitute

MIS2 AMG and lu_substitute

Linear algebra and solver library using CUDA, OpenCL, and OpenMP

Forums

Help

MIS2 AMG and lu_substitute document.SUBSCRIPTION_OPTIONS = { "thing": "topic", "subscribed": false, "url": "subscribe", "icon": { "css": "fa fa-envelope-o" } };

MIS2 AMG and lu_substitute