From: Renato P. <re...@gm...> - 2018-06-27 21:58:16
|
Hi, I need to evaluate the "greatest" value of a vector that is created in parallel. It is a matter of iterating over all the local elements and calculate the parts of this distributed vector. What is the best way to syncronize this vector to all processors - so they can have the whole set of data and can find out the "greatest" value? Any example to suggest? Thanks, Renato |
From: John P. <jwp...@gm...> - 2018-06-27 22:01:05
|
On Wed, Jun 27, 2018 at 3:58 PM, Renato Poli <re...@gm...> wrote: > Hi, > > I need to evaluate the "greatest" value of a vector that is created in > parallel. It is a matter of iterating over all the local elements and > calculate the parts of this distributed vector. > > What is the best way to syncronize this vector to all processors - so they > can have the whole set of data and can find out the "greatest" value? Any > example to suggest? > Are you using NumericVector? It has a max() member. -- John |
From: Renato P. <re...@gm...> - 2018-06-27 22:55:09
|
Nice. It is important to identify the source of that value (elem. number, for example). Is it safe to use the template parameter <T> as a custom class overriding the sorting operators? Something like: class MyClass { double number_to_classify; unsigned int elem_id; operator< ( ... ) { ... number_to_classify ...} } On Wed, Jun 27, 2018 at 7:00 PM, John Peterson <jwp...@gm...> wrote: > > > On Wed, Jun 27, 2018 at 3:58 PM, Renato Poli <re...@gm...> wrote: > >> Hi, >> >> I need to evaluate the "greatest" value of a vector that is created in >> parallel. It is a matter of iterating over all the local elements and >> calculate the parts of this distributed vector. >> >> What is the best way to syncronize this vector to all processors - so they >> can have the whole set of data and can find out the "greatest" value? Any >> example to suggest? >> > > Are you using NumericVector? It has a max() member. > > -- > John > |
From: John P. <jwp...@gm...> - 2018-06-27 23:22:20
|
On Wed, Jun 27, 2018 at 4:55 PM, Renato Poli <re...@gm...> wrote: > Nice. It is important to identify the source of that value (elem. number, > for example). > Is it safe to use the template parameter <T> as a custom class overriding > the sorting operators? > > Something like: > > class MyClass { > double number_to_classify; > unsigned int elem_id; > operator< ( ... ) { ... number_to_classify ...} > } > NumericVector is only for type Number (which corresponds to Real or std::complex depending on how you have compiled libmesh), you can't use it with generic type T. -- John |
From: Renato P. <re...@gm...> - 2018-06-28 15:49:30
|
Should I copy-paste code from NumericVector to build mine? Any suggestion? Another alternative is to sync the solution vector to all processors so that everybody can do the same calculation. Seems a little inefficient, but is simple-and-easy. rgds, Renato On Wed, Jun 27, 2018 at 8:21 PM, John Peterson <jwp...@gm...> wrote: > > > On Wed, Jun 27, 2018 at 4:55 PM, Renato Poli <re...@gm...> wrote: > >> Nice. It is important to identify the source of that value (elem. number, >> for example). >> Is it safe to use the template parameter <T> as a custom class overriding >> the sorting operators? >> >> Something like: >> >> class MyClass { >> double number_to_classify; >> unsigned int elem_id; >> operator< ( ... ) { ... number_to_classify ...} >> } >> > > NumericVector is only for type Number (which corresponds to Real or > std::complex depending on how you have compiled libmesh), you can't use it > with generic type T. > > -- > John > |
From: John P. <jwp...@gm...> - 2018-06-28 16:04:12
|
On Thu, Jun 28, 2018 at 9:49 AM, Renato Poli <re...@gm...> wrote: > Should I copy-paste code from NumericVector to build mine? > Any suggestion? > I definitely suggest _not_ copy/pasting code! Another alternative is to sync the solution vector to all processors so > that everybody can do the same calculation. > Seems a little inefficient, but is simple-and-easy. > If you are not using NumericVector, then there is a Communicator::max() overload that takes a std::vector<T> of values, but I believe this only works if there is a valid StandardType<T> specialization for the type T. Most libMesh objects (LibMeshInit, Mesh, EquationSystems, etc.) provide a Communicator object via the comm() method, e.g. mesh.comm().max(vec). -- John |
From: Roy S. <roy...@ic...> - 2018-06-28 16:08:23
|
On Thu, 28 Jun 2018, Renato Poli wrote: > Should I copy-paste code from NumericVector to build mine? > Any suggestion? We're pretty far into XY Problem territory at this point. http://xyproblem.info/ What you've said about your vector is that you'll need to be able to do a maxloc() on it, that it's describable as a "solution vector", that you seem to interested both in sorting it and in using it with an arbitrary value type T. That's not enough information to give advice. The "arbitrary value type T" criterion rules out pretty much all of the libMesh NumericVector subclasses (except perhaps DistributedVector, with some work), but the "solution vector" description in the libMesh context *implies* a NumericVector subclass (and not the DistributedVector subclass, if you're solving an implicit system). > Another alternative is to sync the solution vector to all processors so > that everybody can do the same calculation. Are you referring to the maxloc() calculation here? That wouldn't be hard to add to NumericVector (with specializations for speed in subclasses), and the implementation would be a lot more efficient than serialization. But maxloc has nothing to do with the "you can't use NumericVector with generic type T" problem, so maybe back up and figure that out (starting with *why*, not *how*)first? --- Roy |
From: Renato P. <re...@gm...> - 2018-06-28 17:31:33
|
Ok, I see Roy's point. Let's start over. I have a DG system (duplicated DOFs). I calculate the aperture at the DOFs sharing coordinates (position_at_the_element _minus_ position_at_the_neighbor). I do that for all integration points. I need an ordered vector with the bigger apertures first, and I need to identify who they are (element, neighbor and integration point). I think (not sure) it *is* indeed a maxloc() problem, as long as I can have my own objects to be compared with a specialized "operator<". (I tried to get closer to "X" here ... helped?) Thanks Renato On Thu, Jun 28, 2018 at 1:08 PM, Roy Stogner <roy...@ic...> wrote: > > On Thu, 28 Jun 2018, Renato Poli wrote: > > Should I copy-paste code from NumericVector to build mine? >> Any suggestion? >> > > We're pretty far into XY Problem territory at this point. > > http://xyproblem.info/ > > What you've said about your vector is that you'll need to be able to > do a maxloc() on it, that it's describable as a "solution vector", > that you seem to interested both in sorting it and in using it with > an arbitrary value type T. That's not enough information to give > advice. > > The "arbitrary value type T" criterion rules out pretty much all of > the libMesh NumericVector subclasses (except perhaps > DistributedVector, with some work), but the "solution vector" > description in the libMesh context *implies* a NumericVector subclass > (and not the DistributedVector subclass, if you're solving an implicit > system). > > Another alternative is to sync the solution vector to all processors so >> that everybody can do the same calculation. >> > > Are you referring to the maxloc() calculation here? That wouldn't be > hard to add to NumericVector (with specializations for speed in > subclasses), and the implementation would be a lot more efficient than > serialization. But maxloc has nothing to do with the "you can't use > NumericVector with generic type T" problem, so maybe back up and > figure that out (starting with *why*, not *how*)first? > --- > Roy > |
From: Roy S. <roy...@ic...> - 2018-06-28 17:50:46
|
On Thu, 28 Jun 2018, Renato Poli wrote: > I have a DG system (duplicated DOFs). By duplicated, you just mean the multiple DoFs "at" (in a Lagrangian evaluation sense) each node? > I calculate the aperture at the DOFs sharing coordinates Aperture meaning the jump in values between the solution on one element and the solution on it's neighbor? > (position_at_the_element _minus_ position_at_the_neighbor). Sounds like you're taking into account shared coordinates on edges/faces too, then. > I do that for all integration points. And integration points are only on sides, so you don't have to worry about edges/nodes where more than two elements meet? > I need an ordered vector with the bigger apertures first, and I need > to identify who they are (element, neighbor and integration point). > > I think (not sure) it *is* indeed a maxloc() problem, as long as I > can have my own objects to be compared with a specialized > "operator<". No, it's not; if you need the entire vector sorted then doing it one maxloc() at a time definitely won't be the most efficient way to do that. > (I tried to get closer to "X" here ... helped?) Much! 1) You're not using a solution vector, you're using a vector with different indexing that's merely calculated *from* a solution vector, so since you have that expensive calculation/transformation anyway then you aren't stuck using the solution vector directly for efficiency purposes. 2) Plus, it sounds like you don't *need* arbitrary T here - you're looking at jumps in values of type Number, so you can still use vectors of type Number, right? So now it no longer looks like you have self-contradictory needs, which is a huge plus. More questionable help: The good news is that libMesh does have a class, Parallel::Sort, which can do what you want (it does a bin sort on arbitrary data, so you could create e.g. a local vector<pair<Number,original_index_type>>, sort by that, do another parallel communication step afterwards, and end up with exactly the information you want in a scalable way. The bad news is that this class isn't perfectly documented, is only used in one place in the library, and the original developer isn't currently active, so it may be quite difficult to figure out. If I were you I'd start by doing things the slow easy way: after each processor makes its own local vector, just allgather those into a giant serial vector and sort that on each rank. *Then* try to use the Parallel::Sort interface, once you have something to use for debugging and a fallback while you do. --- Roy |
From: Renato P. <re...@gm...> - 2018-06-28 20:32:26
|
> By duplicated, you just mean the multiple DoFs "at" (in a Lagrangian > evaluation sense) each node? Correct. > And integration points are only on sides, so you don't have to worry > about edges/nodes where more than two elements meet? Correct. > 1) You're not using a solution vector, you're using a vector with > different indexing that's merely calculated *from* a solution vector, > so since you have that expensive calculation/transformation anyway > then you aren't stuck using the solution vector directly for > efficiency purposes. Correct. > 2) Plus, it sounds like you don't *need* arbitrary T here - you're > looking at jumps in values of type Number, so you can still use > vectors of type Number, right? I need to identify where they happen. I can have a class of my own, a map indexed by the aperture, a pair - I need it indexed by a Number, that is true. > The good news is that libMesh does have a class, Parallel::Sort, which > can do what you want (it does a bin sort on arbitrary data, so you > could create e.g. a local vector<pair<Number,original_index_type>>, > sort by that, do another parallel communication step afterwards, and > end up with exactly the information you want in a scalable way. That is awesome. > The bad news is that this class isn't perfectly documented, is only > used in one place in the library, and the original developer isn't > currently active, so it may be quite difficult to figure out. That is not so awesome. > If I were you I'd start by doing things the slow easy way: after each > processor makes its own local vector, just allgather those into a > giant serial vector and sort that on each rank. *Then* try to use the > Parallel::Sort interface, once you have something to use for debugging > and a fallback while you do. This is the serial stuff I was talking about. I have other trouble ahead, I wouldn't say performance is the *major* issue right now. It seems that we are converging.... which takes me to a last question: How can I do this (I am not a MPI guy at all, so please be patient ...): > ... after each processor makes its own local vector, just allgather those into a giant serial vector ... Thanks! Renato |
From: Roy S. <roy...@ic...> - 2018-06-28 21:04:16
|
On Thu, 28 Jun 2018, Renato Poli wrote: > How can I do this (I am not a MPI guy at all, so please be patient ...): > > ... after each processor makes its own local vector, just > > allgather those into a giant serial vector ... I'm not a big fan of MPI myself, which is why we've got it wrapped up as much as possible. https://github.com/libMesh/libmesh/blob/716878d0c5631a7b7488c88fd1de78025231b115/include/parallel/communicator.h#L664 is the comment (with API below) for what I described above. --- Roy |
From: Renato P. <re...@gm...> - 2018-06-29 12:55:03
|
Thank you! On Thu, Jun 28, 2018 at 6:04 PM, Roy Stogner <roy...@ic...> wrote: > > On Thu, 28 Jun 2018, Renato Poli wrote: > > How can I do this (I am not a MPI guy at all, so please be patient ...): >> > > > ... after each processor makes its own local vector, just >> > allgather those into a giant serial vector ... >> > > I'm not a big fan of MPI myself, which is why we've got it wrapped up > as much as possible. > > https://github.com/libMesh/libmesh/blob/716878d0c5631a7b7488 > c88fd1de78025231b115/include/parallel/communicator.h#L664 > > is the comment (with API below) for what I described above. > --- > Roy > |
From: Renato P. <re...@gm...> - 2018-07-04 19:50:55
|
Hi Roy I am trying to implement this one: comm().allgather(vector<MyClass>). The compiler fails. It says MyClass is not a StandardType. I tried with a tuple<>, without success. Any suggestion? Should I specialize a Packing<MyClass *> or so? I can see that for vector<string>, everything works fine. /usr/local/include/libmesh/parallel.h: In instantiation of ‘class libMesh::Parallel::StandardType<abada_lib::MyClass>’: /usr/local/include/libmesh/parallel_implementation.h:3251:23: required from ‘void libMesh::Parallel::Communicator::allgather(std::__debug::vector<_RealType>&, bool) const [with T = abada_lib::MyClass]’ src/abada_lib/Abada_sc10.cxx:323:32: required from here /usr/local/include/libmesh/parallel.h:387:3: error: static assertion failed: Only specializations of StandardType may be used, did you forget to include a header file (e.g. parallel_algebra.h)? static_assert(dependent_false<T>::value, Renato On Fri, Jun 29, 2018 at 9:54 AM, Renato Poli <re...@gm...> wrote: > Thank you! > > On Thu, Jun 28, 2018 at 6:04 PM, Roy Stogner <roy...@ic...> > wrote: > >> >> On Thu, 28 Jun 2018, Renato Poli wrote: >> >> How can I do this (I am not a MPI guy at all, so please be patient ...): >>> >> >> > ... after each processor makes its own local vector, just >>> > allgather those into a giant serial vector ... >>> >> >> I'm not a big fan of MPI myself, which is why we've got it wrapped up >> as much as possible. >> >> https://github.com/libMesh/libmesh/blob/716878d0c5631a7b7488 >> c88fd1de78025231b115/include/parallel/communicator.h#L664 >> >> is the comment (with API below) for what I described above. >> --- >> Roy >> > > |
From: Roy S. <roy...@ic...> - 2018-07-04 20:50:30
|
On Wed, 4 Jul 2018, Renato Poli wrote: > I am trying to implement this one: comm().allgather(vector<MyClass>). > > The compiler fails. > It says MyClass is not a StandardType. > I tried with a tuple<>, without success. There actually is a StandardType<tuple<Foo...>> specialization, but it's untested and it's only useable if all the Foo types also have StandardType specializations. > Any suggestion? > Should I specialize a Packing<MyClass *> or so? Is your class a tuple of built-in constant-size types (or of other tuples of built-in types, and so on)? If so then please you can help us debug whatever's going wrong with StandardType<tuple>? If not, then is your class of constant size? If so then you can specialize StandardType<MyClass>; see the StandardType<pair> case in standard_type.h for the best example of what to do. There are more examples in parallel_algebra.h that might also be helpful. If you have a variable size class... then we're in uncharted territory: > I can see that for vector<string>, everything works fine. That's a special hand-coded case, I'm afraid. We haven't currently got any allgather_packed_range suitable for arbitrary size (de)serializable types. --- Roy |
From: Renato P. <re...@gm...> - 2018-07-04 21:00:24
|
Thanks. Will try that. It is constant size. In fact, I could make my way through in a dirty pair<int, pair<int, pair<int, int> > > I'm debugging something else right now. I will have to come back later an ddo the specialization. Thanks, Renato On Wed, Jul 4, 2018 at 5:50 PM, Roy Stogner <roy...@ic...> wrote: > > On Wed, 4 Jul 2018, Renato Poli wrote: > > I am trying to implement this one: comm().allgather(vector<MyClass>). >> >> The compiler fails. >> It says MyClass is not a StandardType. >> I tried with a tuple<>, without success. >> > > There actually is a StandardType<tuple<Foo...>> specialization, but > it's untested and it's only useable if all the Foo types also have > StandardType specializations. > > Any suggestion? >> Should I specialize a Packing<MyClass *> or so? >> > > Is your class a tuple of built-in constant-size types (or of other > tuples of built-in types, and so on)? If so then please you can help > us debug whatever's going wrong with StandardType<tuple>? > > If not, then is your class of constant size? If so then you can > specialize StandardType<MyClass>; see the StandardType<pair> case in > standard_type.h for the best example of what to do. There are more > examples in parallel_algebra.h that might also be helpful. > > If you have a variable size class... then we're in uncharted > territory: > > I can see that for vector<string>, everything works fine. >> > > That's a special hand-coded case, I'm afraid. We haven't currently > got any allgather_packed_range suitable for arbitrary size > (de)serializable types. > --- > Roy > |
From: Roy S. <roy...@ic...> - 2018-07-05 05:02:51
|
On Wed, 4 Jul 2018, Renato Poli wrote: > Thanks. Will try that. > It is constant size. > In fact, I could make my way through in a dirty pair<int, pair<int, pair<int, int> > > > I'm debugging something else right now. > I will have to come back later an ddo the specialization. It sounds like you'd rather be doing a tuple<int, int, int, int>? I'll try setting up a unit test with that and see if I can fix the tuple specialization, but if you're in a hurry don't wait on me. --- Roy |
From: Renato P. <re...@gm...> - 2018-07-05 11:41:21
|
>> It sounds like you'd rather be doing a tuple<int, int, int, int>? Well, that would be my second shot, which I consider dusty but healthy. *Ideally* I would do a MyClass { int p1; int p2; int p3; int p4; }. >> I'll try setting up a unit test with that and see if I can fix ... There's no need to hurry. I moved forward with the nested pairs. I wrapped it with access functions (p.second.second.second is the deepest raw int). Thanks, Renato On Thu, Jul 5, 2018 at 2:02 AM, Roy Stogner <roy...@ic...> wrote: > > On Wed, 4 Jul 2018, Renato Poli wrote: > > Thanks. Will try that. >> It is constant size. >> In fact, I could make my way through in a dirty pair<int, pair<int, >> pair<int, int> > > >> I'm debugging something else right now. >> I will have to come back later an ddo the specialization. >> > > It sounds like you'd rather be doing a tuple<int, int, int, int>? > I'll try setting up a unit test with that and see if I can fix the > tuple specialization, but if you're in a hurry don't wait on me. > --- > Roy > |
From: Roy S. <roy...@ic...> - 2018-07-05 14:11:20
|
On Thu, 5 Jul 2018, Renato Poli wrote: > >> It sounds like you'd rather be doing a tuple<int, int, int, int>? > Well, that would be my second shot, which I consider dusty but healthy. > *Ideally* I would do a MyClass { int p1; int p2; int p3; int p4; }. I'm afraid handling *that* automatically is not going to be possible until C++ adds reflection support. In C++20, fingers crossed... > >> I'll try setting up a unit test with that and see if I can fix ... > There's no need to hurry. I moved forward with the nested pairs. > I wrapped it with access functions (p.second.second.second is the deepest raw int). Thanks for the update! --- Roy |