From: Roy S. <roy...@ic...> - 2018-06-28 17:50:46
On Thu, 28 Jun 2018, Renato Poli wrote:

> I have a DG system (duplicated DOFs).

By duplicated, you just mean the multiple DoFs "at" (in a Lagrangian
evaluation sense) each node?

> I calculate the aperture at the DOFs sharing coordinates

Aperture meaning the jump in values between the solution on one
element and the solution on its neighbor?

> (position_at_the_element _minus_ position_at_the_neighbor).

Sounds like you're taking into account shared coordinates on
edges/faces too, then.

> I do that for all integration points.

And integration points are only on sides, so you don't have to worry
about edges/nodes where more than two elements meet?

> I need an ordered vector with the bigger apertures first, and I need
> to identify who they are (element, neighbor and integration point).
>
> I think (not sure) it *is* indeed a maxloc() problem, as long as I
> can have my own objects to be compared with a specialized
> "operator<".

No, it's not; if you need the entire vector sorted then doing it one
maxloc() at a time definitely won't be the most efficient way to do
that.

> (I tried to get closer to "X" here ... helped?)

Much!

1) You're not using a solution vector, you're using a vector with
different indexing that's merely calculated *from* a solution vector.
Since you have that expensive calculation/transformation anyway, you
aren't stuck using the solution vector directly for efficiency
purposes.

2) Plus, it sounds like you don't *need* arbitrary T here - you're
looking at jumps in values of type Number, so you can still use
vectors of type Number, right?

So now it no longer looks like you have self-contradictory needs,
which is a huge plus.

More questionable help:

The good news is that libMesh does have a class, Parallel::Sort, which
can do what you want (it does a bin sort on arbitrary data, so you
could create e.g. a local vector<pair<Number,original_index_type>>,
sort by that, do another parallel communication step afterwards, and
end up with exactly the information you want in a scalable way).

The bad news is that this class isn't perfectly documented, is only
used in one place in the library, and the original developer isn't
currently active, so it may be quite difficult to figure out.

If I were you I'd start by doing things the slow easy way: after each
processor makes its own local vector, just allgather those into a
giant serial vector and sort that on each rank. *Then* try to use the
Parallel::Sort interface, once you have something to use for debugging
and a fallback while you do.
---
Roy
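
A minimal sketch of the "slow easy way" described above, in plain C++ with no libMesh or MPI calls. Everything named here is an assumption made for illustration: the `JumpRecord` struct and its fields, the comparator `larger_jump_first`, and the function `gather_and_sort` are hypothetical, the jump is taken to be a real `double` (libMesh's Number may be complex in some builds), and "bigger first" is read as larger signed value first. The outer vector stands in for the per-processor local vectors; in a real run the concatenation loop would be replaced by an allgather over the communicator.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record for one jump ("aperture") sample. The field names
// are illustrative and are not libMesh types.
struct JumpRecord {
  double jump;          // value on the element minus value on the neighbor
  unsigned elem_id;     // element owning the side
  unsigned neighbor_id; // neighbor across that side
  unsigned qp;          // integration point index on that side
};

// Larger jumps first. Ties are broken by the ids so that every rank
// ends up with exactly the same ordering.
inline bool larger_jump_first(const JumpRecord& a, const JumpRecord& b) {
  if (a.jump != b.jump) return a.jump > b.jump;
  if (a.elem_id != b.elem_id) return a.elem_id < b.elem_id;
  if (a.neighbor_id != b.neighbor_id) return a.neighbor_id < b.neighbor_id;
  return a.qp < b.qp;
}

// Stand-in for "allgather, then sort on each rank": per_rank[r] plays the
// role of the local vector built on processor r. The concatenation below
// is what an allgather would produce on every rank.
std::vector<JumpRecord>
gather_and_sort(const std::vector<std::vector<JumpRecord>>& per_rank) {
  std::size_t total = 0;
  for (const auto& local : per_rank) total += local.size();

  std::vector<JumpRecord> all;
  all.reserve(total);
  for (const auto& local : per_rank)
    all.insert(all.end(), local.begin(), local.end());
  assert(all.size() == total);

  std::sort(all.begin(), all.end(), larger_jump_first);
  return all;
}
```

Because each record carries its element, neighbor and integration point, the sorted serial vector answers "who they are" directly. If magnitude rather than signed value is what matters, comparing `std::abs(a.jump)` and `std::abs(b.jump)` in the comparator would be the only change. A result like this can also serve as the reference to compare against while getting a scalable sort working.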