From: Benjamin Kirk <benjamin.kirk@na...>  20080319 13:47:48

> Hi, Libmesh Users > I find the stiffness matrix assembling is very slow at the first time. But > it is very quickly at the second time and laters for the same system. > It needs 2 hours to assemble the stiffness matrix of the 3d stokes system > with 15,000 dofs. How to speed up the process of assembling the system > stiffness matrix at the first time? > I find the similar question in the maillist. The title is "assembling a > PetscMatrix". > Can you verify that ex11 runs quickly? It is important that the PETSc sparse matrices be efficiently preallocated, otherwise matrix insertion can cause a cascade of dynamic memory operations. In this case all subsequent assemblies will be fast because the memory is already allocated. You can verify that this is working by running the code with info (at least for PETSc 2.3.3) You should see something like: [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 Ben 
From: Shengli Xu <shengli.xu@gm...>  20080319 02:02:11

Hi, Libmesh Users I find the stiffness matrix assembling is very slow at the first time. But it is very quickly at the second time and laters for the same system. It needs 2 hours to assemble the stiffness matrix of the 3d stokes system with 15,000 dofs. How to speed up the process of assembling the system stiffness matrix at the first time? I find the similar question in the maillist. The title is "assembling a PetscMatrix". Sincerely Shengli  Best regards, Yours sincerely ShengliXu Department of Engineering Mechanics State Key Laboratory of Structural Analysis for Industrial Equipment Dalian University of Technology Dalian, 116023, P. R. China Email: shengli.xu.xu@... ========================== 
From: John Peterson <jwpeterson@gm...>  20080319 03:25:29

It should not take 2 hours to assemble a system with 15,000 dofs. Something is seriously wrong, perhaps you are in debug mode? J On Tue, Mar 18, 2008 at 9:02 PM, Shengli Xu <shengli.xu.xu@...> wrote: > Hi, Libmesh Users > I find the stiffness matrix assembling is very slow at the first time. But > it is very quickly at the second time and laters for the same system. > It needs 2 hours to assemble the stiffness matrix of the 3d stokes system > with 15,000 dofs. How to speed up the process of assembling the system > stiffness matrix at the first time? > I find the similar question in the maillist. The title is "assembling a > PetscMatrix". > > Sincerely > Shengli > > >  > Best regards, > > Yours sincerely > ShengliXu > Department of Engineering Mechanics > State Key Laboratory of Structural Analysis for Industrial Equipment > Dalian University of Technology > Dalian, 116023, P. R. China > Email: shengli.xu.xu@... > ========================== >  > This SF.net email is sponsored by: Microsoft > Defy all challenges. Microsoft(R) Visual Studio 2008. > http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ > _______________________________________________ > Libmeshusers mailing list > Libmeshusers@... > https://lists.sourceforge.net/lists/listinfo/libmeshusers > 
From: Shengli Xu <shengli.xu@gm...>  20080319 08:23:58

Hi John The program runns in opt mode. The information of system is as follows: Mesh Information: mesh_dimension()=3 spatial_dimension()=3 n_nodes()=1331 n_elem()=1000 n_local_elem()=1000 n_active_elem()=1000 n_subdomains()=1 n_processors()=1 processor_id()=0 EquationSystems n_systems()=1 System "Stokes_Darcy" Type "LinearImplicit" Variables="u" "v" "w" "p" Finite Element Types="LAGRANGE" "LAGRANGE" "LAGRANGE" "LAGRANGE" Approximation Orders="SECOND" "SECOND" "SECOND" "FIRST" n_dofs()=16214 n_local_dofs()=16214 n_constrained_dofs()=2883 n_vectors()=12 micro_fluid_loadcase11_assemble is the performance log of calling the function of assembling stiffness matrix.   Main Program Performance: Alive time=5665.91, Active time=5515.65    Event nCalls Total Avg Percent of   Time Time Active Time      micro_fluid_loadcase11_assemble 1 5515.6511 5515.651089 100.00    Totals: 1 5515.6511 100.00   Cheers Shengli 2008/3/18, John Peterson <jwpeterson@...>: > > It should not take 2 hours to assemble a system with 15,000 dofs. > Something is seriously wrong, perhaps you are in debug mode? > > J > > > On Tue, Mar 18, 2008 at 9:02 PM, Shengli Xu <shengli.xu.xu@...> > wrote: > > Hi, Libmesh Users > > I find the stiffness matrix assembling is very slow at the first time. > But > > it is very quickly at the second time and laters for the same system. > > It needs 2 hours to assemble the stiffness matrix of the 3d stokes > system > > with 15,000 dofs. How to speed up the process of assembling the system > > stiffness matrix at the first time? > > I find the similar question in the maillist. The title is "assembling a > > PetscMatrix". > > > > Sincerely > > Shengli > > > > > >  > > Best regards, > > > > Yours sincerely > > ShengliXu > > Department of Engineering Mechanics > > State Key Laboratory of Structural Analysis for Industrial Equipment > > Dalian University of Technology > > Dalian, 116023, P. R. China > > Email: shengli.xu.xu@... > > ========================== > > > >  > > This SF.net email is sponsored by: Microsoft > > Defy all challenges. Microsoft(R) Visual Studio 2008. > > http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ > > _______________________________________________ > > Libmeshusers mailing list > > Libmeshusers@... > > https://lists.sourceforge.net/lists/listinfo/libmeshusers > > >  Best regards, Yours sincerely ShengliXu Department of Engineering Mechanics State Key Laboratory of Structural Analysis for Industrial Equipment Dalian University of Technology Dalian, 116023, P. R. China Email: shengli.xu.xu@... ========================== 
From: Benjamin Kirk <benjamin.kirk@na...>  20080319 13:47:48

> Hi, Libmesh Users > I find the stiffness matrix assembling is very slow at the first time. But > it is very quickly at the second time and laters for the same system. > It needs 2 hours to assemble the stiffness matrix of the 3d stokes system > with 15,000 dofs. How to speed up the process of assembling the system > stiffness matrix at the first time? > I find the similar question in the maillist. The title is "assembling a > PetscMatrix". > Can you verify that ex11 runs quickly? It is important that the PETSc sparse matrices be efficiently preallocated, otherwise matrix insertion can cause a cascade of dynamic memory operations. In this case all subsequent assemblies will be fast because the memory is already allocated. You can verify that this is working by running the code with info (at least for PETSc 2.3.3) You should see something like: [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 Ben 
From: Roy Stogner <roystgnr@ic...>  20080319 14:39:54

On Wed, 19 Mar 2008, Shengli Xu wrote: > micro_fluid_loadcase11_assemble is the performance log of calling the > function of assembling stiffness matrix. >  micro_fluid_loadcase11_assemble 1 5515.6511 5515.651089 You might want to do some more finegrained performance logging. The whole point of PerfLog objects is to make it easier to pin down this sort of problem. You might also try running with oprofile, if you've got access to a system on which that's convenient for you. On the other hand, the information that subsequent assembles are fast may be sufficient alone; it sounds to me like Ben's hunch is right and you're trying to write sparse matrix values to a misallocated sparsity pattern. I'm not sure how that would occur, though, unless your formulation is very nontypical. What version of libMesh are you using?  Roy 
From: Shengli Xu <shengli.xu@gm...>  20080320 03:11:47

Hi Ben and Roy, The version of libMesh is 0.6.2. I modified src/mesh/unstructured_mesh.C, add "(name.rfind(".unv") < name.size())" on line 385 to skip renumber nodes and elements for .unv input file. I use PETSc2.3.3. Running the code with info, I find that Number of mallocs during MatSetValues() is not 0 when calling system.solve(). In the system, there are peroidic boundary conditions applied through " dof_map.constrain_element_matrix_and_vector(Ke,Fe,dof_indices);" before system.matrix>add_matrix(Ke,dof_indices); system.rhs>add_vector(Fe,dof_indices); during elements assemble iteration. I have used this method in libMesh0.5.0. There is not PeriodicBoundary Class at that time. The periodic b.c. is applied in libMesh0.6.2 just as in libMesh0.5.0. I don't know how to apply periodic bc using PeriodicBoundary class. Is there any examples? After call system.solve(); The info of PETSc is: [0] PetscCommDuplicate(): Using internal PETSc communicator 2080374784 2080374782 [0] PetscCommDuplicate(): returning tag 2147483643 [0] PetscCommDuplicate(): Using internal PETSc communicator 2080374784 2080374782 [0] PetscCommDuplicate(): returning tag 2147483642 [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 375 X 375; storage space: 3174 unneeded,34191 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 2466 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 156 [0] Mat_CheckInode(): Found 375 nodes out of 375 rows. Not using Inode routines [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 375 X 375; storage space: 0 unneeded,34191 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 156 [0] PetscCommDuplicate(): returning tag 2147483619 [0] PetscCommDuplicate(): returning tag 2147483618 [0] PetscCommDuplicate(): returning tag 2147483617 [0] PetscCommDuplicate(): returning tag 2147483616 [0] PetscCommDuplicate(): returning tag 2147483615 [0] PCSetUp(): Setting up new PC [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 2080374783 [0] PetscCommDuplicate(): returning tag 2147483614 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 2080374783 [0] PetscCommDuplicate(): returning tag 2147483613 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 2080374783 [0] PetscCommDuplicate(): returning tag 2147483612 [0] PetscCommDuplicate(): returning tag 2147483641 [0] PetscCommDuplicate(): returning tag 2147483611 [0] KSPDefaultConverged(): user has provided nonzero initial guess, computing 2norm of preconditioned RHS [0] PetscCommDuplicate(): returning tag 2147483610 [0] PetscCommDuplicate(): returning tag 2147483609 [0] PetscCommDuplicate(): returning tag 2147483608 [0] PetscCommDuplicate(): returning tag 2147483607 [0] PetscCommDuplicate(): returning tag 2147483606 [0] PetscCommDuplicate(): returning tag 2147483605 [0] PetscCommDuplicate(): returning tag 2147483604 [0] PetscCommDuplicate(): returning tag 2147483603 [0] PetscCommDuplicate(): returning tag 2147483602 [0] PetscCommDuplicate(): returning tag 2147483601 [0] PetscCommDuplicate(): returning tag 2147483600 [0] PetscCommDuplicate(): returning tag 2147483599 [0] PetscCommDuplicate(): returning tag 2147483598 [0] PetscCommDuplicate(): returning tag 2147483597 [0] PetscCommDuplicate(): returning tag 2147483596 [0] PetscCommDuplicate(): returning tag 2147483595 [0] PetscCommDuplicate(): returning tag 2147483594 [0] PetscCommDuplicate(): returning tag 2147483593 [0] PetscCommDuplicate(): returning tag 2147483592 [0] PetscCommDuplicate(): returning tag 2147483591 [0] KSPDefaultConverged(): Linear solver has converged. Residual norm 2.55487e11 is less than relative tolerance 1e10 times initial right hand side norm 1.03162 at iteration 18 [0] PetscCommDuplicate(): Using internal PETSc communicator 2080374784 2080374782 [0] PetscCommDuplicate(): returning tag 2147483640 [0] PetscCommDuplicate(): returning tag 2147483590 [0] VecScatterCreate(): Special case: sequential vector general scatter Sincerely Shengli 2008/3/19, Benjamin Kirk <benjamin.kirk@...>: > > > Hi, Libmesh Users > > I find the stiffness matrix assembling is very slow at the first time. > But > > it is very quickly at the second time and laters for the same system. > > It needs 2 hours to assemble the stiffness matrix of the 3d stokes > system > > with 15,000 dofs. How to speed up the process of assembling the system > > stiffness matrix at the first time? > > I find the similar question in the maillist. The title is "assembling a > > PetscMatrix". > > > > > Can you verify that ex11 runs quickly? > > It is important that the PETSc sparse matrices be efficiently > preallocated, > otherwise matrix insertion can cause a cascade of dynamic memory > operations. > In this case all subsequent assemblies will be fast because the memory is > already allocated. > > > You can verify that this is working by running the code with info (at > least > for PETSc 2.3.3) > > You should see something like: > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > > > Ben > > >  Best regards, Yours sincerely ShengliXu Department of Engineering Mechanics State Key Laboratory of Structural Analysis for Industrial Equipment Dalian University of Technology Dalian, 116023, P. R. China Email: shengli.xu.xu@... ========================== 
From: Roy Stogner <roystgnr@ic...>  20080320 03:33:05

On Wed, 19 Mar 2008, Shengli Xu wrote: > In the system, there are peroidic boundary conditions applied through " > dof_map.constrain_element_matrix_and_vector(Ke,Fe,dof_indices);" before > system.matrix>add_matrix(Ke,dof_indices); > system.rhs>add_vector(Fe,dof_indices); > during elements assemble iteration. I have used this method in libMesh0.5.0. So just to be clear: you're using DofMap::add_constraint_row to create your periodic constraints, so that constrain_element_matrix_and_vector will do the remaining work for you? You must be adding the constraint rows after the matrix has been preallocated... Regardless of whether PeriodicBoundary calls would fix your problem, we ought to make sure that users are able and encouraged to do efficient preallocation for arbitrary user constraints, too. Are you using attach_constraint_function() to make sure that your constraints are applied at the right time? > There is not PeriodicBoundary Class at that time. The periodic b.c. > is applied in libMesh0.6.2 just as in libMesh0.5.0. I don't know how > to apply periodic bc using PeriodicBoundary class. Is there any > examples? If boundaries with ids 0 and 5 are opposite each other, with a vector of (0,0,1) expressing the distance between paired points (as would happen if you built a "cube" of height 1), then this is how it's supposed to work: PeriodicBoundary pb; pb.myboundary = 0; pb.paired_boundary = 5; pb.translation_vector = RealVectorValue(0., 0., 1.); dof_map.add_periodic_boundary(pb); Currently the only test coverage we've got for this may be my own application codes, though, so buyer beware.  Roy 
From: Shengli Xu <shengli.xu@gm...>  20080320 06:08:13

2008/3/19, Roy Stogner <roystgnr@...>: > > > On Wed, 19 Mar 2008, Shengli Xu wrote: > > > > In the system, there are peroidic boundary conditions applied through " > > dof_map.constrain_element_matrix_and_vector(Ke,Fe,dof_indices);" before > > system.matrix>add_matrix(Ke,dof_indices); > > system.rhs>add_vector(Fe,dof_indices); > > during elements assemble iteration. I have used this method in > libMesh0.5.0. > > > So just to be clear: you're using DofMap::add_constraint_row to create > your periodic constraints, so that constrain_element_matrix_and_vector > will do the remaining work for you? > Yes, I use DofMap::add_constraint_row to create the periodic constraints between two dof numbers. It is done in system.init_data(); Then call constrain_element_matrix_and_vector() in the element assemble. You must be adding the constraint rows after the matrix has been > preallocated... Regardless of whether PeriodicBoundary calls would I don't think so. Adding the constraint rows is in the system::init_data(); It is at the front of adding system matrix. I don't know where the matrix be preallocated. in this>init_matrices() ? fix your problem, we ought to make sure that users are able and > encouraged to do efficient preallocation for arbitrary user > constraints, too. Are you using attach_constraint_function() to make > sure that your constraints are applied at the right time? Yes, system.attach_constraint_function(apply_periodic_bc); is used before es.init(); > There is not PeriodicBoundary Class at that time. The periodic b.c. > > is applied in libMesh0.6.2 just as in libMesh0.5.0. I don't know how > > to apply periodic bc using PeriodicBoundary class. Is there any > > examples? > > > If boundaries with ids 0 and 5 are opposite each other, with a > vector of (0,0,1) expressing the distance between paired points (as > would happen if you built a "cube" of height 1), then this is how it's > supposed to work: > > PeriodicBoundary pb; > pb.myboundary = 0; > pb.paired_boundary = 5; What is 0 and 5 here? Are they dof numbers? How to apply periodic boundary condition by dof numbers? pb.translation_vector = RealVectorValue(0., 0., 1.); > dof_map.add_periodic_boundary(pb); Where to put these codes in the program? Before es.init(): And Is it needed to call "constrain_element_matrix_and_vector()" during elements assembly? Currently the only test coverage we've got for this may be my own > application codes, though, so buyer beware. >  > > Roy >  Best regards, Yours sincerely ShengliXu Department of Engineering Mechanics State Key Laboratory of Structural Analysis for Industrial Equipment Dalian University of Technology Dalian, 116023, P. R. China Email: shengli.xu.xu@... ========================== 
From: Benjamin Kirk <benjamin.kirk@na...>  20080320 12:34:33

> You must be adding the constraint rows after the matrix has been > preallocated... Regardless of whether PeriodicBoundary calls would > fix your problem, we ought to make sure that users are able and > encouraged to do efficient preallocation for arbitrary user > constraints, too. The DOF connectivity is established during EquationSystems::init(). So you must add any userdefined constraints before this time. The matrix itself is not actually allocated until the first call to EquationSystems::solve(), but when it is allocated it is using a precomputed sparsity pattern. The proper place to do this is to provide a function pointer which is called by System::user_constrain(): http://libmesh.sourceforge.net/doxygen/classSystem.php#8dd9e548cdc41149155da 0a92eefa591 Try that, it should fix your problem. Ben 
From: Shengli Xu <shengli.xu@gm...>  20080322 11:54:29

Hi Ben and Roy In libMesh 0.5.0, The assemble function runs very quickly. But in libMesh0.6.2, It is very slowly. The memory used in libMesh0.5.0 is much larger than in libMesh0.6.2 before call assemble function. The program is the same. Why? I debugged my programe. The function of apply_periodic_bc() and add_system_matrix() are both called in es.init(), but apply_periodic_bc() is in the front of add_system_matrix(). I think that the function "this>init_matrix()" in void ImplicitSystem::init_data() should consider the information of periodic b.c. which is in DofMap. The program structure is like this: int main() { EquationSystems es(mesh); LinearImplicitSystem &system = ... .... system.add_variable("u", FIRST); system.add_variable("v", FIRST); system.add_variable("w",FIRST); system.attach_assemble_function(assemble_elastic); system.attach_constraint_function(apply_periodic_bc_elastic); es.init(); .... system.solve(); ... } void assemble_elastic(EquationSystems& es, const std::string& system_name) { ... for(; el!=end_el; ++el) { // Gauss integrate { .... } dof_map.constrain_element_matrix_and_vector(Ke,Fe,dof_indices); system.matrix>add_matrix(Ke,dof_indices); system.rhs>add_vector(Fe,dof_indices); } // end el loop } /* * Apply periodic b.c. for elastic homogenization * Read the node couple to apply periodic b.c. */ void apply_periodic_bc_elastic(EquationSystems& es, const std::string& system_name) { std::cout<<"Apply periodic boundary condition for elastic homogenization..."<<std::endl; Mesh &mesh=es.get_mesh(); LinearImplicitSystem& system=es.get_system <LinearImplicitSystem>(system_name.c_str()); DofMap& dof_map = system.get_dof_map(); std::cout<<"Imput file of periodic b.c. node for elastic module"<<std::endl; std::string PeribcFile; std::cin>>PeribcFile; // The file of node couple std::cout<<"PeribcFile="<<PeribcFile<<std::endl; std::ifstream pbcfile(PeribcFile.c_str()); unsigned int p2num, p4num; std::vector<int> p2_1, p2_2, p4_1, p4_2, p4_3, p4_4; pbcfile>>p2num; for(unsigned int i=0; i<p2num; i++) { int m,n; pbcfile>>m>>n; // Read the node number p2_1.push_back(m1); p2_2.push_back(n1); } pbcfile>>p4num; for(unsigned int i=0; i<p4num; i++) { int m1, m2, m3, m4; pbcfile>>m1>>m2>>m3>>m4; p4_1.push_back(m11); p4_2.push_back(m21); p4_3.push_back(m31); p4_4.push_back(m41); } pbcfile.close(); // for(unsigned int i=0;i<p2num;i++){ unsigned int n1=p2_1.at(i); unsigned int n2=p2_2.at(i); const Node &curr_n1 = mesh.node(n1); const Node &curr_n2 = mesh.node(n2); unsigned int u1 = curr_n1.dof_number(0,0,0); unsigned int v1 = curr_n1.dof_number(0,1,0); unsigned int w1 = curr_n1.dof_number(0,2,0); unsigned int u2 = curr_n2.dof_number(0,0,0); unsigned int v2 = curr_n2.dof_number(0,1,0); unsigned int w2 = curr_n2.dof_number(0,2,0); DofConstraintRow cu,cv,cw; cu[u1]=1.0; cv[v1]=1.0; cw[w1]=1.0; dof_map.add_constraint_row(u2,cu); dof_map.add_constraint_row(v2,cv); dof_map.add_constraint_row(w2,cw); } // for(unsigned int i=0;i<p4num;i++){ unsigned int n1=p4_1.at(i); unsigned int n2=p4_2.at(i); unsigned int n3=p4_3.at(i); unsigned int n4=p4_4.at(i); const Node &curr_n1=mesh.node(n1); const Node &curr_n2=mesh.node(n2); const Node &curr_n3=mesh.node(n3); const Node &curr_n4=mesh.node(n4); unsigned int u1=curr_n1.dof_number(0,0,0); unsigned int v1=curr_n1.dof_number(0,1,0); unsigned int w1=curr_n1.dof_number(0,2,0); unsigned int u2=curr_n2.dof_number(0,0,0); unsigned int v2=curr_n2.dof_number(0,1,0); unsigned int w2=curr_n2.dof_number(0,2,0); unsigned int u3=curr_n3.dof_number(0,0,0); unsigned int v3=curr_n3.dof_number(0,1,0); unsigned int w3=curr_n3.dof_number(0,2,0); unsigned int u4=curr_n4.dof_number(0,0,0); unsigned int v4=curr_n4.dof_number(0,1,0); unsigned int w4=curr_n4.dof_number(0,2,0); DofConstraintRow cu,cv,cw; cu[u1]=1.0; cv[v1]=1.0; cw[w1]=1.0; dof_map.add_constraint_row(u2,cu); dof_map.add_constraint_row(u3,cu); dof_map.add_constraint_row(u4,cu); dof_map.add_constraint_row(v2,cv); dof_map.add_constraint_row(v3,cv); dof_map.add_constraint_row(v4,cv); dof_map.add_constraint_row(w2,cw); dof_map.add_constraint_row(w3,cw); dof_map.add_constraint_row(w4,cw); } } Sincerely Shengli 2008/3/20, Benjamin Kirk <benjamin.kirk@...>: > > > You must be adding the constraint rows after the matrix has been > > preallocated... Regardless of whether PeriodicBoundary calls would > > fix your problem, we ought to make sure that users are able and > > encouraged to do efficient preallocation for arbitrary user > > constraints, too. > > The DOF connectivity is established during EquationSystems::init(). So > you must add any userdefined constraints before this time. The matrix > itself is not actually allocated until the first call to > EquationSystems::solve(), but when it is allocated it is using a > precomputed sparsity pattern. > > The proper place to do this is to provide a function pointer which is > called by System::user_constrain(): > > http://libmesh.sourceforge.net/doxygen/classSystem.php#8dd9e548cdc41149155da0a92eefa591 > > Try that, it should fix your problem. > > Ben >  Best regards, Yours sincerely ShengliXu Department of Engineering Mechanics State Key Laboratory of Structural Analysis for Industrial Equipment Dalian University of Technology Dalian, 116023, P. R. China Email: shengli.xu.xu@... ========================== 