From: STEPHANE T. <tch...@ms...> - 2009-02-24 15:49:18
Hi everybody,

I'd like to configure and compile libMesh with PETSc and without MPI, but it seems to me that this is not possible. When I do " ./configure --enable-petsc --without-mpi ", I always get a configuration without MPI but with LASPACK. The "--enable-petsc" option only seems to be honored when MPI is enabled, even if PETSc itself has been compiled without MPI.

Has anybody handled that situation before? Thanks for your help.

Stephane
From: STEPHANE T. <tch...@ms...> - 2009-02-24 16:21:04
Hi,

I use the petsc_nonlinear_solver class to solve a nonlinear convection-diffusion-type equation in porous media. The assembled linear system has 30000 unknowns, and solving the whole problem takes about an hour on a single-processor machine (my laptop...). I run a hundred time steps, with on average 4 Newton iterations per time step, so an hour seems huge to me with the optimized version of PETSc.

I tried to profile my code with the PETSc options "-log_summary" and "-info". What I get out of it is that the first time step takes about 15 minutes, representing 25% of the total time, and I don't think that's normal. The thing is, those 15 minutes happen at the second call of the PETSc routine VecScatterCreate(), right after the libMesh output of the first Newton residual.

Any idea? Thanks.

Stephane
From: Roy S. <roy...@ic...> - 2009-02-24 16:36:10
On Tue, 24 Feb 2009, STEPHANE TCHOUANMO wrote:

> The built linear system's size is 30000 and its resolution lasts
> about a hour on a single processor machine (my laptop...). I run
> for that a hundred time steps and have in average 4 Newton
> iterations per time step. So a hour seems huge to me with the
> optimized version of Petsc.

Definitely huge, but it's not entirely impossible if you have a really poorly conditioned system combined with bad solver options. But:

> I tryed to profile my code with the Petsc options "-log_summary" and
> "-info". What i get out of it is that the first time step lasts
> about 15min representing 25% of the total time and i think its not
> normal.

That's definitely not normal. What libMesh and PETSc versions are you using?

> The thing is, these 15min happen at the second call of the petsc
> routine "VecScatterCreate()" right after the Libmesh output of the
> first Newton residual.

If they happen in VecScatterCreate it sounds like a PETSc problem, but I'd like to know how it's being triggered. There are only three places (in localize calls in petsc_vector.C) where the VecScatterCreate might be hanging; it would be helpful to know which one is at fault and what the arguments look like.

---
Roy
From: STEPHANE T. <tch...@ms...> - 2009-02-24 16:52:26
Thanks for your answer, Roy.

I use the libmesh-0.6.2 and petsc-2.3.3-p13 versions. I sent a mail to petsc-users and we'll see what they say about that VecScatterCreate. Right now I'm checking what's going on in petsc_vector.C. Hope to get the answer quickly.

Stephane
From: Roy S. <roy...@ic...> - 2009-02-24 16:56:28
On Tue, 24 Feb 2009, STEPHANE TCHOUANMO wrote:

> I use libmesh-0.6.2 and petsc-2.3.3-p13 versions.
> I sent a mail to petsc users and we'll see what they say about that
> VecScatterCreate.
> Right now im checking whats going on in petsc_vector.C.

You might also try 0.6.3 or the SVN libMesh to see if they exhibit the same problem for you. With enough luck it might already have been found and fixed.

---
Roy
From: Kirk, B. (JSC-EG311) <ben...@na...> - 2009-03-01 20:53:37
> Thanks for your answer Roy,
>
> I use libmesh-0.6.2 and petsc-2.3.3-p13 versions.
> I sent a mail to petsc users and we'll see what they say about that
> VecScatterCreate.
> Right now im checking whats going on in petsc_vector.C.
> Hope to get the answer quick.

What MPI are you using? VecScatterCreate should be particularly trivial on one processor, so this issue is perplexing...

-Ben
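[Editor's note: to illustrate Ben's point, here is a conceptual sketch in plain C++ — not libMesh's actual petsc_vector.C code, and the function name is invented. On one processor every entry of a "distributed" vector is already local, so the localize operation behind VecScatterCreate degenerates to a plain copy, with nothing in it that should take 15 minutes.]

```cpp
#include <vector>

// Conceptual sketch (not libMesh's petsc_vector.C): on a single
// processor, a "scatter" from the global vector into a local copy
// is just an identity copy, which is why a slow VecScatterCreate
// in a serial run is so surprising.
std::vector<double> localize_serial(const std::vector<double> &global)
{
    return global;   // every entry is already local
}
```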
From: STEPHANE T. <tch...@ms...> - 2009-03-03 19:34:36
Thanks Ben.

I use mpich2-1.0.7.

After a discussion with the PETSc developers, the problem might come from a lot of allocation done by libMesh within the call to PETSc. In fact, if you look at the PETSc log summary of the problem I solve, you can clearly see that most of the time (more than 90%) is spent in the SNESSolve stage. The KSPSolve stage, which solves the linear system inside Newton, takes at most 5% of the time. Actually, my problem is really at the very first Newton iteration, which can take an hour out of a 3-hour total solution time. Here is the behavior I see:

==> Solving time step 0, time = 0.01
NL step 0, |residual|_2 = 5.346581e-05
.. 1 hour ..
NL step 1, |residual|_2 = 8.790777e-10

==> Solving time step 1, time = 2.000000e-02
NL step 0, |residual|_2 = 6.043076e-05
NL step 1, |residual|_2 = 9.936468e-10
...

and so on until the end, for a total CPU time of 3 hours. I always get the right solution in the end, but I don't understand the sudden stall at the beginning. It might not be only VecScatterCreate; I think it's a whole bunch of memory allocation that happens.

What do you think? Thanks.

Stephane
From: Kirk, B. (JSC-EG311) <ben...@na...> - 2009-03-03 19:42:29
> [...]
> It might not be only VecScatterCreate but i think its a whole bunch of
> memory allocation that happens.
>
> What do you think?

I think the problem is most definitely in the sparse matrix allocation. libMesh builds the graph of (what it thinks is) your sparse matrix so that the underlying PETSc data structures can be allocated perfectly. If for some reason the linear system you are assembling has a different structure than what we thought it would, insertions into the sparse matrix can be horrifically slow the first time you assemble the linear system.

What you should look for is something like "number of mallocs during MatSetValues" when you run with -info. We want that to be 0. What is it on the first linear solve? What type of elements are you using?

-Ben
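[Editor's note: Ben's preallocation point can be illustrated with a toy sketch in plain C++ — not libMesh code; the function name and the 1D Laplacian example are illustrative only. libMesh computes an exact per-row nonzero count up front and hands it to PETSc's preallocation routines, so each matrix row is sized once before assembly begins.]

```cpp
#include <cassert>
#include <vector>

// Toy sketch (not libMesh code): the exact number of nonzeros per row
// for a 1D Laplacian on n nodes with linear elements. Handing counts
// like these to PETSc's MatSeqAIJSetPreallocation is what lets the
// matrix be allocated once, so MatSetValues never mallocs mid-assembly.
std::vector<int> nnz_per_row_1d(int n)
{
    std::vector<int> nnz(n, 3);   // interior rows: self + two neighbors
    if (n > 1)
    {
        nnz.front() = 2;          // boundary rows: self + one neighbor
        nnz.back()  = 2;
    }
    return nnz;
}
```

If the pattern libMesh predicts disagrees with what the assembly routine actually inserts (extra couplings, for example), every unexpected entry forces PETSc to grow a row on the fly, which is the malloc storm Ben describes.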
From: STEPHANE T. <tch...@ms...> - 2009-03-03 20:15:28
Oops! Seems like you've got it, Ben. Here is my console output with -info for a small case (9261 dofs):

[0] PetscInitialize(): PETSc successfully started: number of processors = 1
[0] PetscGetHostName(): Rejecting domainname, likely is NIS linux-stchouan.
[0] PetscInitialize(): Running on machine: linux-stchouan

Mesh Information:
 mesh_dimension()=3
 spatial_dimension()=3
 n_nodes()=9261
 n_elem()=8000
 n_local_elem()=8000
 n_active_elem()=8000
 n_subdomains()=1
 n_processors()=1
 processor_id()=0
...
[0] VecScatterCreate(): Special case: sequential vector general scatter

EquationSystems
 n_systems()=1
 System "dc"
 Type "TransientNonlinearImplicit"
 Variables="P"
 Finite Element Types="LAGRANGE"
 Approximation Orders="FIRST"
 n_dofs()=9261
 n_local_dofs()=9261
 n_constrained_dofs()=0
 n_vectors()=3

==> Solving time step 0, time = 0.01
...
[0] VecScatterCreate(): Special case: sequential vector general scatter
NL step 0, |residual|_2 = 4.119734e-05
[0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
[0] PetscCommDuplicate(): returning tag 2147483632
[0] PetscCommDuplicate(): Using internal PETSc communicator -2080374784 -2080374782
[0] PetscCommDuplicate(): returning tag 2147483641
[0] PetscCommDuplicate(): returning tag 2147483631
[0] VecScatterCreate(): Special case: sequential vector general scatter
[0] MatAssemblyEnd_SeqAIJ(): Matrix size: 9261 X 9261; storage space: 56570 unneeded,226981 used
[0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 18286
[0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 27
[0] Mat_CheckInode(): Found 9261 nodes out of 9261 rows. Not using Inode routines
[0] MatAssemblyEnd_SeqAIJ(): Matrix size: 9261 X 9261; storage space: 0 unneeded,226981 used
[0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 27
[0] MatAssemblyEnd_SeqAIJ(): Matrix size: 9261 X 9261; storage space: 0 unneeded,226981 used
[0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 27

So I have: Number of mallocs during MatSetValues() is 18286. It's only nonzero at this point; the rest of the time it's zero. The value probably grows quickly with the number of unknowns in the system. It also seems to depend on the type of elements I use; here I have hexes, but I will later work with tets.

What should I do to fix this, Ben? Thanks a lot.

Stephane
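[Editor's note: a back-of-the-envelope check on the log's numbers — this is my own sketch, assuming the 8000 elements and 9261 nodes correspond to a structured 20x20x20 HEX8 grid (21 nodes per side) with first-order Lagrange variables. On such a grid each row couples to at most 3 nodes per coordinate direction, so the nonzero count factors over the three directions.]

```cpp
#include <cassert>

// Back-of-the-envelope check (assumes a structured tensor grid, which
// the 8000-element / 9261-node counts suggest): with trilinear hexes a
// node couples to itself plus at most one neighbor on each side in
// each direction, so the total nonzero count is a 1D count cubed.
long hex8_total_nnz(int nodes_per_side)
{
    // 1D coupling count: the 2 boundary rows see 2 columns each,
    // the remaining rows see 3.
    long per_dim = 2L * 2 + 3L * (nodes_per_side - 2);
    return per_dim * per_dim * per_dim;   // tensor-product structure
}
```

Under that assumption, 21 nodes per side gives 61^3 = 226981, matching the "226981 used" in the log, and the densest row couples to 3^3 = 27 nodes, matching "Maximum nonzeros in any row is 27" — so the final pattern is consistent with first-order Lagrange on hexes, and the 18286 mallocs point at the first-assembly preallocation mismatch Ben described rather than a wrong final sparsity.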
From: STEPHANE T. <tch...@ms...> - 2009-03-30 11:09:37
Hi all,

I'm trying to install libMesh with PETSc, but it seems that the linking of libmesh.so is not done properly. The installation of PETSc was successful, I set my environment variables correctly, and here is the error I get when compiling libMesh:

Linking /home/pass/stchouan/home/libmesh-0.6.3-rc1/lib/x86_64-unknown-linux-gnu_opt/libmesh.so
`.gnu.linkonce.t.__CPR117__n_dofs__53FE__tm__43_XCUiL_1_1XCQ2_12libMeshEnums8FEFamilyL_1_0SFQ2_J35J8ElemTypeQ2_J35J5Order_Ui' referenced in section `.text' of src/fe/fe_xyz.x86_64-unknown-linux-gnu.opt.o: defined in discarded section `.gnu.linkonce.t.__CPR117__n_dofs__53FE__tm__43_XCUiL_1_1XCQ2_12libMeshEnums8FEFamilyL_1_0SFQ2_J35J8ElemTypeQ2_J35J5Order_Ui' of src/fe/fe_xyz.x86_64-unknown-linux-gnu.opt.o
...
`.gnu.linkonce.t.__CPR117__n_dofs__53FE__tm__43_XCUiL_1_3XCQ2_12libMeshEnums8FEFamilyL_1_0SFQ2_J35J8ElemTypeQ2_J35J5Order_Ui' referenced in section `.text' of src/fe/fe_xyz.x86_64-unknown-linux-gnu.opt.o: defined in discarded section `.gnu.linkonce.t.__CPR117__n_dofs__53FE__tm__43_XCUiL_1_3XCQ2_12libMeshEnums8FEFamilyL_1_0SFQ2_J35J8ElemTypeQ2_J35J5Order_Ui' of src/fe/fe_xyz.x86_64-unknown-linux-gnu.opt.o
make: *** [/home/pass/stchouan/home/libmesh-0.6.3-rc1/lib/x86_64-unknown-linux-gnu_opt/libmesh.so] Erreur 2

Does anybody have an idea?

Thanks,
Stephane
From: Kirk, B. (JSC-EG311) <ben...@na...> - 2009-03-30 13:52:06
> Im trying to install Libmesh with Petsc but it seems that the linking
> libmesh.so is not done properly.
> The installatio of Petsc was succesful, i set correctly my environment
> variables and here is the error i get when compiling libmesh:
> ...

What compiler are you using? I have only seen this issue with the Portland Group C++ compiler, and I have never been able to figure out what it is about the FEXYZ template instantiations that causes it... (not to say someone can't fix it.)

-Ben
From: STEPHANE T. <tch...@ms...> - 2009-03-31 13:19:57
> What compiler are you using? I have only seen this issue with the portland
> group C++ compiler and have never been able to figure out what the issue is
> with the FEXYZ template instantiations that causes this issue...

I use a gcc 4.x compiler. It's really curious, because using the same compiler on my laptop I don't have that problem.

Stephane
From: John P. <jwp...@gm...> - 2009-03-31 13:52:49
On Tue, Mar 31, 2009 at 8:19 AM, STEPHANE TCHOUANMO <tch...@ms...> wrote:

>> What compiler are you using? I have only seen this issue with the portland
>> group C++ compiler and have never been able to figure out what the issue is
>> with the FEXYZ template instantiations that causes this issue...
>
> I use a gcc4.x compiler.
>
> Its really curious because using the same compiler on my laptop, i dont
> have that problem.

And you've tried making clean and remaking so there's no chance of old object files sitting around?

--
John
From: Kirk, B. (JSC-EG311) <ben...@na...> - 2009-03-31 14:07:08
>> What compiler are you using? I have only seen this issue with the portland
>> group C++ compiler and have never been able to figure out what the issue is
>> with the FEXYZ template instantiations that causes this issue...
>
> I use a gcc4.x compiler.
>
> Its really curious because using the same compiler on my laptop, i dont
> have that problem.

Will you please post the output of 'make echo' from the problem machine, and preferably include it as a text attachment as well?
From: STEPHANE T. <tch...@ms...> - 2009-03-31 16:50:36
It's all good now! In fact my compiler was the Portland Group C++ one... Sorry, and thanks a lot.

Stephane
From: Roy S. <roy...@ic...> - 2009-02-24 16:24:57
On Tue, 24 Feb 2009, STEPHANE TCHOUANMO wrote:

> I'd like to configure and compile LibMesh with PETSc and without MPI but
> it seems to me that its not possible.
> When i do: " ./configure --enable-petsc --without-mpi " i always get a
> configuration without MPI but with Laspack.
> The option "--enabled-petsc" can only be valid with MPI even if Petsc has
> been compiled without MPI.
>
> Has anybody handled that situation before?

Apparently not: even in the libMesh SVN head, line 572 of configure.in turns off PETSc unless MPI is also enabled. Deleting lines 564 and 571-573 should fix that problem, but since we obviously haven't been testing the with-PETSc, without-MPI case, there are probably bugs with it. libMesh::COMM_WORLD is only defined with MPI enabled, the LibMeshInit function only initializes PETSc with MPI enabled... for that matter, I see we've got a couple of unencapsulated MPI_Reduce calls left in petsc_vector.C.

I'll fix the obvious stuff now, but I don't have time to rebuild PETSc and test for other incompatibilities. You're welcome to be a guinea pig and report whatever problems you come across, but it might be simpler just to rebuild PETSc with its "download MPI" option configured. You can still run an MPI-built app on one processor without inefficiency.

---
Roy