From: Kirk, Benjamin (JSC-EG311) <benjamin.kirk1@na...> - 2012-12-05 20:41:27

I'm working on some optimizations to how libMesh stores & allocates degree of freedom indices when there are multiple variables associated with a given System. These generically fall into two categories:

(1) homogeneous types, where there are multiple variables but they all have the same finite element type, and
(2) mixed types, where there are different variables with different finite element types as well (like Taylor-Hood Q2/Q1 incompressible NS, for example).

For both cases, we could do some optimization in how we store degree of freedom indices & do things like compute the sparse matrix graph.

For case (1) in particular, we could also take advantage of the identical block structure of the linear system and use e.g. PETSc's block matrix and vector support.

So my question is, for your multi-variable systems, are they typically of type (1) or (2)? How many variables and what types?

My reason for asking is that some of the optimizations are easier for case (1), which is where I plan to start, but I don't want to do anything that would preclude additional optimization for (2), especially if there are people with ~5 or more variables of different finite element types in a system.

Thanks,

Ben
From: Derek Gaston <friedmud@gm...> - 2012-12-05 22:00:45

OMG. This would be awesome.

I've actually got a user that has been breathing down our necks for this optimization. In his case he is solving with over 2,000 variables that are all of exactly the same type: generally first or second order Lagrange (although he does some DG as well).

We also have other users solving with 20-200 variables of the same kind (again, usually first or second order Lagrange), but they might also have 1-4 variables of another kind (like cubic hermites) mixed in - but not always.

ANY optimizations along these lines would be truly awesome!

Derek
From: Roy Stogner <roystgnr@ic...> - 2012-12-06 16:50:35

On Wed, 5 Dec 2012, Derek Gaston wrote:

> In his case he is solving with over 2,000 variables
>
> We also have other users solving with 20-200 variables of the same

It would be interesting to know of other places where users are greatly exceeding our original estimates of what "N" might be in cases where we're designing O(f(N)) algorithms.

E.g. I just blithely told Vikram not to worry about writing an O(N_vectors_per_System) bit of code, because in my typical codes (solution, old solution, rhs, qoi0-adjoint, qoi1-adjoint...) N might be around 6, but perhaps the reduced-basis people or others vastly exceed that.

(In this case the constant is small and the code isn't in any inner loops, so he's probably fine even for N=6000, but you see the general idea.)

---
Roy
From: Kirk, Benjamin (JSC-EG311) <benjamin.kirk1@na...> - 2012-12-06 17:57:38

On Dec 5, 2012, at 4:00 PM, Derek Gaston <friedmud@...> wrote:

> We also have other users solving with 20-200 variables of the same
> kind (again, usually first or second order Lagrange) but they might
> also have 1-4 variables of another kind (like cubic hermites) mixed in
> - but not always.

OK, based on this and Saurabh's feedback it is clear that going straight to the type (2) optimizations, while more difficult, is the way to go.

I'm going to focus on the DOF indexing part of this for now and then do the block matrix support later. Note that the block matrix support will still be restricted to the case where all the variables in a system are of the same type, though.

To support the general case of multiple variables of different types (Derek's 20 linear lagrange + 4 cubic hermites, for example) I'm proposing we introduce the construct "VariableGroup", which is just like the current Variable except that it supports an arbitrary number of variables with the same finite element approximation type and subdomain restriction.

So implementing a system of Derek's type would take two variable groups.

To take advantage of this at the DofObject level, the DOF indexing will need to loop quickest within the variable groups. This is because the DofObject does not know anything about the topology of the mesh, so the DOF index offset between two variables within a VariableGroup should not depend on the number of elements in the mesh in any way.

Specifically, numbering DOFs efficiently will require something like this:

  for each VariableGroup
    for each Node
      for each Variable in the VariableGroup
        for each Component in the Variable
          …
        …
      …
    for each Element
      for each Variable in the VariableGroup
        for each Component in the Variable
          …
        …
      …
    …

When we only have one Variable per VariableGroup this is exactly the indexing we currently get (without node_major_dofs, anyway). With multiple Variables in a VariableGroup, however, the indexing and resulting sparse matrix ordering will be different.

Ben
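As a rough illustration of the nodal half of the loop nest above, here is a minimal C++ sketch. GroupSketch, NodalNumbering and number_nodal_dofs are hypothetical names invented for this example and are not libMesh classes. The sketch also assumes every node carries every variable, which ignores element DOFs, subdomain restriction, and the fact that a Q1 pressure only lives on vertex nodes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the proposed VariableGroup: several variables
// that share one finite element type. Not the libMesh class.
struct GroupSketch {
  unsigned int n_vars;           // number of variables in the group
  unsigned int n_comp_per_node;  // DOFs each variable owns at each node
};

struct NodalNumbering {
  // first_dof[g][n][v] is the global index of component 0 of variable v
  // of group g at node n.
  std::vector<std::vector<std::vector<std::size_t>>> first_dof;
  std::size_t n_dofs = 0;
};

// Number nodal DOFs in the order group -> node -> variable -> component,
// so that variables within a group are adjacent at each node.
NodalNumbering number_nodal_dofs(const std::vector<GroupSketch>& groups,
                                 std::size_t n_nodes) {
  assert(n_nodes > 0);
  NodalNumbering out;
  out.first_dof.resize(groups.size());
  std::size_t next = 0;
  for (std::size_t g = 0; g < groups.size(); ++g) {
    out.first_dof[g].resize(n_nodes);
    for (std::size_t n = 0; n < n_nodes; ++n) {
      out.first_dof[g][n].resize(groups[g].n_vars);
      for (unsigned int v = 0; v < groups[g].n_vars; ++v) {
        out.first_dof[g][n][v] = next;
        next += groups[g].n_comp_per_node;  // innermost loop over components
      }
    }
  }
  out.n_dofs = next;
  return out;
}
```

With this ordering the offset between two variables of the same group at a given node depends only on the group layout, not on the mesh size, which is the property the DofObject needs.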
From: Derek Gaston <friedmud@gm...> - 2012-12-06 18:44:08

Would you actually do something like:

system.addVariableGroup()?

Or would VariableGroups be automatically created based on the types of variables that have been added? It would be great if it were the second one... that way, none of this is a user-facing API, just an optimization under the hood.

This is sounding _really_ good!

Derek
From: Roy Stogner <roystgnr@ic...> - 2012-12-06 18:58:37

On Thu, 6 Dec 2012, Derek Gaston wrote:

> Would you actually do something like:
> system.addVariableGroup()? Or would VariableGroups be automatically created based on the types
> of variables that have been added? It would be great if it were the second one... that way, all
> of this is not a user-facing API... but really just an optimization under the hood.

I'd vastly prefer automatic creation, with one caveat. Based on Ben's anticipated numbering requirements, users would need to call add_variable() with matching types in the right order to get them into the same variable group. E.g. with different orders for u,v vs p:

add_variable(u1), followed by v1, p1, u2, v2, p2, would give us four variable groups: one for u1,v1, one for p1, one for u2,v2, one for p2.

add_variable(u1), followed by v1, u2, v2, p1, p2, would give us two variable groups: one for u1,v1,u2,v2, one for p1,p2.

If the only way to make the first case as efficient as the second would be to force a different-than-requested index ordering on the user, then I'm against it. I dislike user code which depends on indexing internals, but people still write apps like that (*cough*, Ben, *cough*), and I don't want to risk breaking any of them, even hypothetical non-FINS apps that don't force node-first rather than variable-first dof ordering anyway.

---
Roy
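The contiguity rule in the two examples above can be sketched in a few lines of C++. This is only an illustration under stated assumptions: VarSketch and group_contiguous are hypothetical names, and a plain int stands in for the FEType plus subdomain restriction that a real comparison would use.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a variable; fe_type is an int tag standing in
// for libMesh's FEType and subdomain restriction.
struct VarSketch {
  std::string name;
  int fe_type;
};

// Collapse only *contiguous* runs of equal type into one group, so the
// order in which the user added variables is never changed.
std::vector<std::vector<std::string>>
group_contiguous(const std::vector<VarSketch>& vars) {
  std::vector<std::vector<std::string>> groups;
  for (std::size_t i = 0; i < vars.size(); ++i) {
    const bool start_new =
        (i == 0) || (vars[i].fe_type != vars[i - 1].fe_type);
    if (start_new)
      groups.emplace_back();
    assert(!groups.empty());
    groups.back().push_back(vars[i].name);
  }
  return groups;
}
```

Feeding it u1, v1, p1, u2, v2, p2 yields four groups, while u1, v1, u2, v2, p1, p2 yields two, matching the counts given in the message.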
From: Kirk, Benjamin (JSC-EG311) <benjamin.kirk1@na...> - 2012-12-06 19:07:40

Right now we have

  /**
   * Adds the variable \p var to the list of variables
   * for this system. Returns the index number for the new variable.
   */
  unsigned int add_variable (const std::string& var,
                             const FEType& type,
                             const std::set<subdomain_id_type> * const active_subdomains = NULL);

I'll add another method

  add_variables (std::vector<std::string> &vars, …);

But I should also be able to create a "cleanup" method that collapses contiguous blocks of the same type that have been added, allowing this to be a pure under-the-hood optimization too.

Internally, I think the way to go is to treat everything across the board as VariableGroups; the current capability is just a degenerate case of a VariableGroup with only one Variable inside.

Automatically identifying VariableGroups is subject to Roy's concern. Consider for example Taylor-Hood Q2-Q1 in 2D:

  add_variable(u);
  add_variable(v);
  add_variable(P);

results *currently* in a global solution vector ordered like so:

  (u0 u1 … uN v0 v1 … vN p0 p1 … pM)

whereas automatically identifying u & v and collapsing them into a VariableGroup will yield instead

  (u0 v0 u1 v1 … uN vN p0 p1 … pM)

At the element level all should be good - the dof_indices will simply be permuted to handle the global scatter/gather differences - but if anyone is accessing the global solution directly things will be different.

Ben
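To make the difference between the two global orderings concrete, the following C++ sketch builds the permutation from the current variable-major layout to the grouped layout for the u, v, p example. The function names old_to_new_ordering and permute are hypothetical, and the sketch assumes u and v each have the same number of DOFs and that the pressure block keeps its position at the end.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// perm[i] is the position, in the grouped ordering (u0 v0 u1 v1 ... p0 p1 ...),
// of the DOF stored at position i in the variable-major ordering
// (u0 u1 ... v0 v1 ... p0 p1 ...).
std::vector<std::size_t> old_to_new_ordering(std::size_t n_vel_dofs,
                                             std::size_t n_p_dofs) {
  std::vector<std::size_t> perm(2 * n_vel_dofs + n_p_dofs);
  for (std::size_t k = 0; k < n_vel_dofs; ++k) {
    perm[k] = 2 * k;                   // u_k moves to an even slot
    perm[n_vel_dofs + k] = 2 * k + 1;  // v_k sits right after u_k
  }
  for (std::size_t k = 0; k < n_p_dofs; ++k)
    perm[2 * n_vel_dofs + k] = 2 * n_vel_dofs + k;  // pressure block unchanged
  return perm;
}

// Reorder a variable-major vector into the grouped ordering.
std::vector<double> permute(const std::vector<double>& old_values,
                            const std::vector<std::size_t>& perm) {
  assert(old_values.size() == perm.size());
  std::vector<double> out(old_values.size());
  for (std::size_t i = 0; i < old_values.size(); ++i)
    out[perm[i]] = old_values[i];
  return out;
}
```

Code that goes through dof_indices never sees this permutation; only code that indexes the global solution vector by hand would need something like it.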
From: Roy Stogner <roystgnr@ic...> - 2012-12-06 19:32:33

On Thu, 6 Dec 2012, Kirk, Benjamin (JSC-EG311) wrote:

> results *currently* in a global solution vector ordered like so:
>
> (u0 u1 … uN v0 v1 … vN p0 p1 … pM)
>
> whereas automatically identifying u & v and collapsing them into a VariableGroup will yield instead
>
> (u0 v0 u1 v1 … uN vN p0 p1 … pM)
>
> At the element level all should be good - the dof_indices will
> simply be permuted to handle the global scatter/gather differences,
> but if anyone is accessing the global solution directly things will
> be different.

Ugh; you're right of course. In that case we might as well try to enable "under-the-hood" optimization even for non-contiguous cases, but provide some bool (controlled by API, command line option, whatever) to allow it to be disabled for anyone depending on global solution ordering.

---
Roy
From: Kirk, Benjamin (JSC-EG311) <benjamin.kirk1@na...> - 2012-12-06 20:08:55

On Dec 6, 2012, at 1:32 PM, Roy Stogner <roystgnr@...> wrote:

> Ugh; you're right of course. In that case we might as well try to
> enable "under-the-hood" optimization even for non-contiguous cases,
> but provide some bool (controlled by API, command line option,
> whatever) to allow it to be disabled for anyone depending on global
> solution ordering.

Will do. I'm starting with the guts to support the add_variables() interface now, and will work on the auto-detection magic once this is done.

While I have your attention - the SCALAR Variables logic has me a bit confused.

Looks like in the System we always assume any SCALAR values were added last, and that we can have only one. As such, a VariableGroup with multiple SCALAR types makes no sense (under current assumptions), right? If you need more SCALARs you up the order, not append more - that seems to be my reading of what is there presently.

Mostly trying to grok the Variable::first_scalar_number() business.

Ben
From: Derek Gaston <friedmud@gm...> - 2012-12-06 20:18:34

On Thu, Dec 6, 2012 at 1:08 PM, Kirk, Benjamin (JSC-EG311) <benjamin.kirk1@...> wrote:

> Looks like in the System we always assume any SCALAR values were added
> last, and that we can have only one. As such, a VariableGroup with
> multiple SCALAR types makes no sense (under current assumptions) right? If
> you need more SCALARs you up the order, not append more - seems to be my
> reading of what is there presently.
>
> Mostly trying to grok the Variable::first_scalar_number() business.

I don't think that's right - we have multiple SCALAR variables here... each of arbitrary order (things like Lagrange multipliers for contact and "joints" in 1D flow and fully-coupled ODE systems).

Derek
From: Kirk, Benjamin (JSC-EG311) <benjamin.kirk1@na...> - 2012-12-06 20:23:01

Gotcha. I was misinterpreting this

  inline
  unsigned int System::n_components() const
  {
    if (_variables.empty())
      return 0;

    const Variable& last = _variables.back();
    return last.first_scalar_number() + last.n_components();
  }

as meaning you only have one SCALAR and it is the last in the list. Rather, through the magic of recursion when building up first_scalar_number(), it is only necessary to interrogate the last variable in the list.

Ben
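A minimal sketch of why querying only the last variable is enough, assuming each variable records the running component count at the moment it is added. VariableSketch and SystemSketch are hypothetical stand-ins written for this note; whether libMesh's add_variable() fills in first_scalar_number() exactly this way is an assumption here, not something stated in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for libMesh's Variable.
struct VariableSketch {
  unsigned int first_scalar_number;  // components owned by earlier variables
  unsigned int n_components;         // components owned by this variable
};

// Hypothetical stand-in for libMesh's System.
class SystemSketch {
public:
  // Assumption: the new variable's first_scalar_number is the total number
  // of components present before it was added.
  unsigned int add_variable(unsigned int n_comp) {
    const unsigned int first = n_components();
    vars_.push_back(VariableSketch{first, n_comp});
    return static_cast<unsigned int>(vars_.size() - 1);
  }

  // Same shape as the method quoted above: the last variable already
  // carries the sum over all earlier ones.
  unsigned int n_components() const {
    if (vars_.empty())
      return 0;
    const VariableSketch& last = vars_.back();
    return last.first_scalar_number + last.n_components;
  }

  const VariableSketch& variable(std::size_t i) const {
    assert(i < vars_.size());
    return vars_[i];
  }

private:
  std::vector<VariableSketch> vars_;
};
```

Under that assumption any number of multi-component variables can be appended, and n_components() stays O(1) because each call to add_variable() extends the running total by one step.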