From: David Knezevic <david.knezevic@ba...>  20070828 10:20:16

> Ohh, I think I see now. The dense matrix is really just a simple
> way of getting and storing rows (or columns) of solution data.

Exactly. And the thing is, I've been talking about taking 1D strips from a logically 2D mesh, but what I'm actually implementing is taking "2D" strips from a 4D mesh. That is, the PDE I'm solving is posed in a 4D domain which is the Cartesian product of two 2D domains, and I'm splitting the PDE and applying the ADI method to solve it. A single 2D strip is again just a row or column of the dense matrix (it doesn't matter how many dimensions the strips represent, because in the end they're just nodal values of the solution which will be loaded onto a current_local_solution vector).

> I'll be curious to see what happens if some of the 1D problems are
> harder (and thus take longer to finish) than others. In such a case,
> you may be waiting on 1 or 2 CPUs to finish their quota while the
> others sit idle.

Definitely, but luckily in my application all the subproblems are equally hard. Well, actually, all the subproblems in a given direction are equally hard, but I'm only doing one direction at a time, since otherwise the solves would interfere with one another. The nice thing is that all the solves in a given direction are completely independent.

> A different approach might be to use a "client-server" structure,
> where, as each "client" processor finishes a 1D solve, it asks the
> "server" process for the next row/column to work on, until all are
> finished. Since you've already got a dense parallel matrix with the
> values, the communication part is mostly taken care of. This would
> probably be a bit more complexity than necessary though, especially if
> all the solves take about the same amount of time.

I think this client-server structure would give a lot more flexibility.
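The dynamic-dispatch idea behind that client-server structure can be sketched in a few lines. This is only a toy illustration, not the actual MPI implementation: threads stand in for worker processes, a numpy array stands in for the parallel dense matrix, and `solve_strip` is a hypothetical placeholder for the real per-strip solve.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import numpy as np

def solve_strip(idx, row):
    # Hypothetical stand-in for the real 1D subproblem solve;
    # here it just scales the nodal values.
    return idx, 2.0 * row

def solve_all_strips(matrix, workers=4):
    """Hand rows out one at a time as workers free up, so an
    unusually hard subproblem doesn't leave other CPUs idle."""
    result = np.empty_like(matrix)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(solve_strip, i, matrix[i])
                   for i in range(matrix.shape[0])]
        for f in as_completed(futures):
            i, row = f.result()   # rows come back in completion order
            result[i] = row
    return result
```

The point is only that rows are claimed on demand rather than pre-assigned in equal blocks, which is exactly the flexibility being discussed.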
As a first cut, though, I think I'll implement it in the simplest way: just let PETSc partition the dense matrix of solution data so that each processor gets an equal number of rows (which is done by default when you create an MPIDenseMatrix), and then each processor can happily do all the solves on its local rows, one at a time. To do the solves in the other direction, I just transpose the dense matrix, and PETSc again partitions it so that each processor gets an equal number of rows of the transposed matrix.

But I think that if, later on, I were able to try problems large enough that it was beneficial to have multiple processors working on each subproblem, then the client-server structure would give the flexibility to handle that situation. For the moment, though, the partitioning I'm doing is so trivial that I don't think I'll need to implement anything more complicated than what I described above.

Cheers,
Dave
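For what it's worth, the partition-then-transpose sweep can be written down compactly. Again a hedged sketch rather than the real code: numpy's `array_split` mimics PETSc's equal-rows partitioning of an MPI dense matrix, and `solve_strip` is a hypothetical placeholder (a simple smoothing) for the actual independent subproblem solve in one direction.

```python
import numpy as np

def solve_strip(row):
    # Placeholder for the real per-strip solve: a three-point average
    # of the nodal values, standing in for a 1D implicit solve.
    return np.convolve(row, [0.25, 0.5, 0.25], mode="same")

def sweep(u, n_procs):
    """One direction of the ADI step: partition rows into equal blocks
    (one per 'processor') and solve each local row independently."""
    blocks = np.array_split(u, n_procs, axis=0)
    return np.vstack([[solve_strip(r) for r in blk] for blk in blocks])

def adi_step(u, n_procs=4):
    # Sweep one direction, transpose so the other direction's strips
    # become rows, sweep again, and transpose back.
    u = sweep(u, n_procs)
    return sweep(u.T, n_procs).T
```

The transpose is what keeps each subproblem's data local to one processor in both directions, at the cost of one global redistribution per sweep.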