From: John P. <jwp...@gm...> - 2018-02-13 15:39:10
|
On Tue, Feb 13, 2018 at 8:12 AM, Victor Eijkhout <eij...@ta...> wrote:

>> On Feb 12, 2018, at 5:46 PM, John Peterson <jwp...@gm...> wrote:
>> try putting your custom library+path in the $libmesh_LDFLAGS environment variable
>
> Mixed case like that?
>
> It’s not working, and I have no idea what it stumbles on as the output does not contain the commandline.
>
> Again, log attached.

We probably need the config.log file from your build directory rather than just the output of make, because I think you may be using an older version of libmesh that is still passing old/wrong flags to the Intel 2017 compiler:

icpc: command line remark #10148: option '-Ob2' not supported
icpc: command line remark #10148: option '-tpp6' not supported
icpc: command line remark #10148: option '-vec_report0' not supported
icpc: command line remark #10148: option '-par_report0' not supported
icpc: command line remark #10148: option '-openmp_report0' not supported

Anyway, based on these errors:

/home1/apps/intel17/impi17_0/trilinos/12.10.1/lib/libpytrilinos.so: undefined reference to `PyType_IsSubtype'
/home1/apps/intel17/impi17_0/trilinos/12.10.1/lib/libpytrilinos.so: undefined reference to `PyExc_RuntimeError'
/home1/apps/intel17/impi17_0/trilinos/12.10.1/lib/libpytrilinos.so: undefined reference to `PyErr_Print'
/home1/apps/intel17/impi17_0/trilinos/12.10.1/lib/libpytrilinos.so: undefined reference to `_Py_ZeroStruct'

I'm guessing you tried adding "-L/home1/apps/intel17/impi17_0/trilinos/12.10.1/lib -lpytrilinos" to libmesh_LDFLAGS, but it looks like you're going to need to link in more libraries than just that one, as it has a bunch of undefined references.

-- John
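For readers hitting the same undefined Python symbols, here is a rough sketch of what the fuller link line suggested above might look like. The Trilinos path is the one from the log; using python-config to pull in the Python runtime is an assumption about the local setup, not something confirmed in this thread:

    # Hedged sketch: add the Trilinos lib directory, libpytrilinos, and the
    # Python runtime flags to libmesh_LDFLAGS, then re-run configure and make.
    # python-config is assumed to match the Python that PyTrilinos was built
    # against (it may be python2.7-config or python3-config on your system).
    export libmesh_LDFLAGS="-L/home1/apps/intel17/impi17_0/trilinos/12.10.1/lib -lpytrilinos $(python-config --ldflags)"
    ./configure   # plus whatever options you normally pass
    make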
From: Victor E. <eij...@ta...> - 2018-02-13 15:12:57
|
On Feb 12, 2018, at 5:46 PM, John Peterson <jwp...@gm...> wrote:

> try putting your custom library+path in the $libmesh_LDFLAGS environment variable

Mixed case like that?

It’s not working, and I have no idea what it stumbles on as the output does not contain the commandline.

Again, log attached.

Victor.
From: John P. <jwp...@gm...> - 2018-02-12 23:46:31
|
On Mon, Feb 12, 2018 at 12:43 PM, Victor Eijkhout <eij...@ta...> wrote:

> I’m attaching a configure log file. I put a python library in the LIBS variable, and it is found and validated, but then not added to the optional libraries. This means that during linking I get an error from pytrilinos which cannot find the python library.

It doesn't look like we do anything with $LIBS during configure, other than to use it for test linking. As a short-term workaround, you could try putting your custom library+path in the $libmesh_LDFLAGS environment variable, as I think we actually respect the value of that one.

I think to be consistent with libmesh_LDFLAGS and libmesh_CXXFLAGS, we would probably end up reading the value of libmesh_LIBS from the environment. This is definitely non-standard, but I think the thinking was to avoid accidentally picking up LDFLAGS/CXXFLAGS/LIBS from the user's environment that they may have set for other reasons...

-- John
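A minimal sketch of the short-term workaround described above; the path and library name below are placeholders, since the actual Python installation is not spelled out in this thread:

    # Illustrative only: put the custom search path and library into the
    # libmesh_LDFLAGS environment variable (which the build respects) rather
    # than LIBS (which configure only uses for test links), then reconfigure.
    export libmesh_LDFLAGS="-L/path/to/python/lib -lpython2.7"   # placeholders
    ./configure   # plus your usual options
    make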
From: Victor E. <eij...@ta...> - 2018-02-12 19:43:51
|
I’m attaching a configure log file. I put a python library in the LIBS variable, and it is found and validated, but then not added to the optional libraries. This means that during linking I get an error from pytrilinos which can not find the python library. Victor. |
From: Paul T. B. <ptb...@gm...> - 2018-01-22 19:55:05
|
OK, I'll try and push a PR before the end of today. On Mon, Jan 22, 2018 at 2:54 PM, Roy Stogner <roy...@ic...> wrote: > > On Mon, 22 Jan 2018, Paul T. Bauman wrote: > > John: Should I go ahead and change that one to XFAIL? I'm not sure >> it will cause less confusion for users, but it would permit a >> blanket `make check`. Downside is it won't show up as a failure on >> CIVET testing. >> > > That's an extremely good idea. > > We can change it back to PASS/FAIL after an actual fix is in. > > Sorry about the delay. > --- > Roy > |
From: Roy S. <roy...@ic...> - 2018-01-22 19:54:12
|
On Mon, 22 Jan 2018, Paul T. Bauman wrote: > John: Should I go ahead and change that one to XFAIL? I'm not sure > it will cause less confusion for users, but it would permit a > blanket `make check`. Downside is it won't show up as a failure on > CIVET testing. That's an extremely good idea. We can change it back to PASS/FAIL after an actual fix is in. Sorry about the delay. --- Roy |
From: Paul T. B. <ptb...@gm...> - 2018-01-22 17:44:56
|
Two strategies:

1. Don't configure with VTK (that test depends on VTK and won't run if you haven't configured with VTK).
2. You can run make check in the subdirectories directly and bypass that example, using a shell script, for example (a sketch follows below this message).

John: Should I go ahead and change that one to XFAIL? I'm not sure it will cause less confusion for users, but it would permit a blanket `make check`. Downside is it won't show up as a failure on CIVET testing.

On Mon, Jan 22, 2018 at 12:42 PM, Daniel Vasconcelos <dan...@ou...> wrote:

> Thank you for the quick reply, I appreciate it.
>
> Is there a way to run the make check command and skip this example?
>
> Regards,
>
> Daniel F. M. Vasconcelos.
>
> From: John Peterson <jwp...@gm...>
> Sent: Monday, January 22, 2018 15:38
> To: Daniel Vasconcelos <dan...@ou...>
> Cc: libmesh-users <lib...@li...>; libmesh-devel <lib...@li...>
> Subject: Re: [Libmesh-users] Example fem_system_ex2 failing to run
>
> On Mon, Jan 22, 2018 at 10:35 AM, Daniel Vasconcelos <dan...@ou...> wrote:
>> Sorry, I have forgotten to include the actual libMesh configuration log.
>>
>> Once again, thanks in advance.
>>
>> Daniel F. M. Vasconcelos.
>>
>> 2018-01-22 15:28 GMT-02:00 Daniel Vasconcelos <dan...@ou...>:
>>> Dear libMesh users,
>>>
>>> Currently I am having problems when running example fem_system_ex2. MPI_ABORT is invoked in step 1 right after assembling the system (see attached file).
>>>
>>> Below is the setup I am using:
>>>
>>> * CPU: Intel i7
>>> * Oracle Virtual Box 5.2.6 r120293
>>> * Host system: Windows 10
>>> * Guest system: Linux Mint 18.3 Mate
>>> * Cloned libMesh master branch as per 01-21-2018
>>> * All libMesh optional libraries installed using apt-get (VTK, openMPI, PETSc, HDF5, …)
>>>
>>> I have also attached the libMesh configuration log and the fem_system_ex2 traceout file.
>
> Thanks, we are aware of the issue, but not sure what the right fix is yet.
>
> (https://github.com/libMesh/libmesh/issues/1559)
>
> --
> John
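For reference, one possible shape for the shell-script approach in strategy 2 above. The directory layout is assumed, not checked against the actual examples tree, so adjust the skipped path to wherever fem_system_ex2 lives in your checkout:

    # Run `make check` per example subdirectory, skipping the one assumed to
    # contain the failing fem_system example.
    for d in examples/*/; do
      case "$d" in
        examples/fem_system/) continue ;;   # assumed location of fem_system_ex2
      esac
      (cd "$d" && make check)
    done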
From: Daniel V. <dan...@ou...> - 2018-01-22 17:42:15
|
Thank you for the quick reply, I appreciate it.

Is there a way to run the make check command and skip this example?

Regards,

Daniel F. M. Vasconcelos.

From: John Peterson <jwp...@gm...>
Sent: Monday, January 22, 2018 15:38
To: Daniel Vasconcelos <dan...@ou...>
Cc: libmesh-users <lib...@li...>; libmesh-devel <lib...@li...>
Subject: Re: [Libmesh-users] Example fem_system_ex2 failing to run

On Mon, Jan 22, 2018 at 10:35 AM, Daniel Vasconcelos <dan...@ou...> wrote:

> Sorry, I have forgotten to include the actual libMesh configuration log.
>
> Once again, thanks in advance.
>
> Daniel F. M. Vasconcelos.
>
> 2018-01-22 15:28 GMT-02:00 Daniel Vasconcelos <dan...@ou...>:
>> Dear libMesh users,
>>
>> Currently I am having problems when running example fem_system_ex2. MPI_ABORT is invoked in step 1 right after assembling the system (see attached file).
>>
>> Below is the setup I am using:
>>
>> * CPU: Intel i7
>> * Oracle Virtual Box 5.2.6 r120293
>> * Host system: Windows 10
>> * Guest system: Linux Mint 18.3 Mate
>> * Cloned libMesh master branch as per 01-21-2018
>> * All libMesh optional libraries installed using apt-get (VTK, openMPI, PETSc, HDF5, …)
>>
>> I have also attached the libMesh configuration log and the fem_system_ex2 traceout file.

Thanks, we are aware of the issue, but not sure what the right fix is yet.

(https://github.com/libMesh/libmesh/issues/1559)

--
John
From: John P. <jwp...@gm...> - 2018-01-22 17:38:58
|
On Mon, Jan 22, 2018 at 10:35 AM, Daniel Vasconcelos < dan...@ou...> wrote: > Sorry, I have forgotten to include the actual libMesh configuration log. > > Once again. thanks in advance. > > Daniel F. M. Vasconcelos. > > 2018-01-22 15:28 GMT-02:00 Daniel Vasconcelos <dan...@ou...< > mailto:dan...@ou...>>: > Dear libMesh users, > > Currently I am with problems when running example fem_system_ex2. > MPI_ABORT is invoked in step 1 right after assembling the system (see > attached file). > > Below is the setup I am using: > > * CPU: Intel i7 > * Oracle Virtual Box 5.2.6 r120293 > * Host system: Windows 10 > * Guest system: Linux Mint 18.3 Mate > * Cloned libMesh master branch as per 01-21-2018 > * All libMesh optional libraries installed using apt-get (VTK, > openMPI, PETSc, HDF5, …) > > I have also attached the libMesh configuration log and the fem_system_ex2 > traceout file. > Thanks, we are aware of the issue, but not sure what the right fix is yet. (https://github.com/libMesh/libmesh/issues/1559) -- John |
From: Paul T. B. <ptb...@gm...> - 2018-01-22 17:38:13
|
Unfortunately, this is a known problem. Discussion here:

https://github.com/libMesh/libmesh/issues/1559

TL;DR: Has to do with doing reinit on a single System (as opposed to all Systems at once). If you're not doing that, then you should be OK.

HTH,

Paul

On Mon, Jan 22, 2018 at 12:35 PM, Daniel Vasconcelos <dan...@ou...> wrote:

> Sorry, I have forgotten to include the actual libMesh configuration log.
>
> Once again, thanks in advance.
>
> Daniel F. M. Vasconcelos.
>
> 2018-01-22 15:28 GMT-02:00 Daniel Vasconcelos <dan...@ou...>:
>> Dear libMesh users,
>>
>> Currently I am having problems when running example fem_system_ex2. MPI_ABORT is invoked in step 1 right after assembling the system (see attached file).
>>
>> Below is the setup I am using:
>>
>> * CPU: Intel i7
>> * Oracle Virtual Box 5.2.6 r120293
>> * Host system: Windows 10
>> * Guest system: Linux Mint 18.3 Mate
>> * Cloned libMesh master branch as per 01-21-2018
>> * All libMesh optional libraries installed using apt-get (VTK, openMPI, PETSc, HDF5, …)
>>
>> I have also attached the libMesh configuration log and the fem_system_ex2 traceout file.
>>
>> Thanks in advance
>>
>> Daniel F. M. Vasconcelos.
From: Daniel V. <dan...@ou...> - 2018-01-22 17:35:47
|
Sorry, I have forgotten to include the actual libMesh configuration log.

Once again, thanks in advance.

Daniel F. M. Vasconcelos.

2018-01-22 15:28 GMT-02:00 Daniel Vasconcelos <dan...@ou...>:

> Dear libMesh users,
>
> Currently I am having problems when running example fem_system_ex2. MPI_ABORT is invoked in step 1 right after assembling the system (see attached file).
>
> Below is the setup I am using:
>
> * CPU: Intel i7
> * Oracle Virtual Box 5.2.6 r120293
> * Host system: Windows 10
> * Guest system: Linux Mint 18.3 Mate
> * Cloned libMesh master branch as per 01-21-2018
> * All libMesh optional libraries installed using apt-get (VTK, openMPI, PETSc, HDF5, …)
>
> I have also attached the libMesh configuration log and the fem_system_ex2 traceout file.
>
> Thanks in advance
>
> Daniel F. M. Vasconcelos.
From: Daniel V. <dan...@ou...> - 2018-01-22 17:28:13
|
Dear libMesh users,

Currently I am having problems when running example fem_system_ex2. MPI_ABORT is invoked in step 1 right after assembling the system (see attached file).

Below is the setup I am using:

* CPU: Intel i7
* Oracle Virtual Box 5.2.6 r120293
* Host system: Windows 10
* Guest system: Linux Mint 18.3 Mate
* Cloned libMesh master branch as per 01-21-2018
* All libMesh optional libraries installed using apt-get (VTK, openMPI, PETSc, HDF5, …)

I have also attached the libMesh configuration log and the fem_system_ex2 traceout file.

Thanks in advance,

Daniel F. M. Vasconcelos.
From: Boris B. <bor...@bu...> - 2017-11-09 19:23:09
|
Sure, please take a look at the modified introduction example 3 <https://github.com/bboutkov/libmesh/tree/distmesh_linearpart_fail> which should trip said assert with -np 2. On Thu, Nov 9, 2017 at 12:40 PM, Paul T. Bauman <ptb...@gm...> wrote: > Boris, can you please send Roy a standalone libMesh example that he can > just compile and run and trip the assert? Thanks, > > On Thu, Nov 9, 2017 at 12:22 PM, Boris Boutkov <bor...@bu...> > wrote: > >> Having a simple DistributedMesh partitioner to compare vs ParMETIS would >> certainly be very useful in driving the original issue forward. >> >> The LinearPartitioner assert trip does appear reproducible with >> gcc7.2/mpich3.2 when running the refinement example >> <https://github.com/bboutkov/grins/tree/master_refine_test>from before. >> The initial mesh can be made as small as 4x4 for easier debugging, and to >> be totally explicit im building off libMesh master and manually attaching >> the partitioner right before partition() in prepare_for_use(). >> >> Thanks for the help - >> >> On Thu, Nov 9, 2017 at 10:50 AM, Roy Stogner <roy...@ic...> >> wrote: >> >>> >>> On Thu, 9 Nov 2017, Boris Boutkov wrote: >>> >>> Well, I eliminated PETSc and have been linking to MPI using >>>> --with-mpi=$MPI_DIR and playing with the refinement example I had >>>> mentioned earlier to try and eliminate ParMETIS due the the >>>> hang/crash issue. In these configs I attach either the >>>> LinearPartitioner or an SFC in prepare_for_use right before calling >>>> partition(). This causes assert trips in MeshComm::Redistribute >>>> where elem.proc_id != proc_id while unpacking elems (stack below). >>>> >>> >>> Shoot - I don't think either of those partitioners have been upgraded >>> to be compatible with DistributedMesh use. Just glancing at >>> LinearPartitioner, it looks like it'll do fine for an *initial* >>> partitioning, but then it'll scramble everything if it's ever asked to >>> do a *repartitioning* on an already-distributed mesh. >>> >>> I could probably fix that pretty quickly, if you've got a test case I >>> can replicate. >>> >>> SFC, on the other hand, I don't know about. We do distributed >>> space-filling-curve stuff elsewhere in the library with libHilbert, >>> and it's not trivial. >>> --- >>> Roy >>> >> >> >> ------------------------------------------------------------ >> ------------------ >> Check out the vibrant tech community on one of the world's most >> engaging tech sites, Slashdot.org! http://sdm.link/slashdot >> _______________________________________________ >> Libmesh-devel mailing list >> Lib...@li... >> https://lists.sourceforge.net/lists/listinfo/libmesh-devel >> >> > |
From: Paul T. B. <ptb...@gm...> - 2017-11-09 17:40:59
|
Boris, can you please send Roy a standalone libMesh example that he can just compile and run and trip the assert? Thanks, On Thu, Nov 9, 2017 at 12:22 PM, Boris Boutkov <bor...@bu...> wrote: > Having a simple DistributedMesh partitioner to compare vs ParMETIS would > certainly be very useful in driving the original issue forward. > > The LinearPartitioner assert trip does appear reproducible with > gcc7.2/mpich3.2 when running the refinement example > <https://github.com/bboutkov/grins/tree/master_refine_test>from before. > The initial mesh can be made as small as 4x4 for easier debugging, and to > be totally explicit im building off libMesh master and manually attaching > the partitioner right before partition() in prepare_for_use(). > > Thanks for the help - > > On Thu, Nov 9, 2017 at 10:50 AM, Roy Stogner <roy...@ic...> > wrote: > >> >> On Thu, 9 Nov 2017, Boris Boutkov wrote: >> >> Well, I eliminated PETSc and have been linking to MPI using >>> --with-mpi=$MPI_DIR and playing with the refinement example I had >>> mentioned earlier to try and eliminate ParMETIS due the the >>> hang/crash issue. In these configs I attach either the >>> LinearPartitioner or an SFC in prepare_for_use right before calling >>> partition(). This causes assert trips in MeshComm::Redistribute >>> where elem.proc_id != proc_id while unpacking elems (stack below). >>> >> >> Shoot - I don't think either of those partitioners have been upgraded >> to be compatible with DistributedMesh use. Just glancing at >> LinearPartitioner, it looks like it'll do fine for an *initial* >> partitioning, but then it'll scramble everything if it's ever asked to >> do a *repartitioning* on an already-distributed mesh. >> >> I could probably fix that pretty quickly, if you've got a test case I >> can replicate. >> >> SFC, on the other hand, I don't know about. We do distributed >> space-filling-curve stuff elsewhere in the library with libHilbert, >> and it's not trivial. >> --- >> Roy >> > > > ------------------------------------------------------------ > ------------------ > Check out the vibrant tech community on one of the world's most > engaging tech sites, Slashdot.org! http://sdm.link/slashdot > _______________________________________________ > Libmesh-devel mailing list > Lib...@li... > https://lists.sourceforge.net/lists/listinfo/libmesh-devel > > |
From: Boris B. <bor...@bu...> - 2017-11-09 17:22:53
|
Having a simple DistributedMesh partitioner to compare vs ParMETIS would certainly be very useful in driving the original issue forward. The LinearPartitioner assert trip does appear reproducible with gcc7.2/mpich3.2 when running the refinement example <https://github.com/bboutkov/grins/tree/master_refine_test>from before. The initial mesh can be made as small as 4x4 for easier debugging, and to be totally explicit im building off libMesh master and manually attaching the partitioner right before partition() in prepare_for_use(). Thanks for the help - On Thu, Nov 9, 2017 at 10:50 AM, Roy Stogner <roy...@ic...> wrote: > > On Thu, 9 Nov 2017, Boris Boutkov wrote: > > Well, I eliminated PETSc and have been linking to MPI using >> --with-mpi=$MPI_DIR and playing with the refinement example I had >> mentioned earlier to try and eliminate ParMETIS due the the >> hang/crash issue. In these configs I attach either the >> LinearPartitioner or an SFC in prepare_for_use right before calling >> partition(). This causes assert trips in MeshComm::Redistribute >> where elem.proc_id != proc_id while unpacking elems (stack below). >> > > Shoot - I don't think either of those partitioners have been upgraded > to be compatible with DistributedMesh use. Just glancing at > LinearPartitioner, it looks like it'll do fine for an *initial* > partitioning, but then it'll scramble everything if it's ever asked to > do a *repartitioning* on an already-distributed mesh. > > I could probably fix that pretty quickly, if you've got a test case I > can replicate. > > SFC, on the other hand, I don't know about. We do distributed > space-filling-curve stuff elsewhere in the library with libHilbert, > and it's not trivial. > --- > Roy > |
From: Roy S. <roy...@ic...> - 2017-11-09 15:50:32
|
On Thu, 9 Nov 2017, Boris Boutkov wrote: > Well, I eliminated PETSc and have been linking to MPI using > --with-mpi=$MPI_DIR and playing with the refinement example I had > mentioned earlier to try and eliminate ParMETIS due the the > hang/crash issue. In these configs I attach either the > LinearPartitioner or an SFC in prepare_for_use right before calling > partition(). This causes assert trips in MeshComm::Redistribute > where elem.proc_id != proc_id while unpacking elems (stack below). Shoot - I don't think either of those partitioners have been upgraded to be compatible with DistributedMesh use. Just glancing at LinearPartitioner, it looks like it'll do fine for an *initial* partitioning, but then it'll scramble everything if it's ever asked to do a *repartitioning* on an already-distributed mesh. I could probably fix that pretty quickly, if you've got a test case I can replicate. SFC, on the other hand, I don't know about. We do distributed space-filling-curve stuff elsewhere in the library with libHilbert, and it's not trivial. --- Roy |
From: Boris B. <bor...@bu...> - 2017-11-09 15:34:09
|
Well, I eliminated PETSc and have been linking to MPI using --with-mpi=$MPI_DIR, and playing with the refinement example I had mentioned earlier, to try and eliminate ParMETIS due to the hang/crash issue. In these configs I attach either the LinearPartitioner or an SFC in prepare_for_use right before calling partition(). This causes assert trips in MeshComm::Redistribute where elem.proc_id != proc_id while unpacking elems (stack below).

These asserts trigger at slightly smaller square meshes than in the original issue: SFC with 3^2 initial elems, Linear with 4^2. At this point I wasn't sure about the MPI-partitioner support; is attaching a partitioner OK in prepare_for_use, or is there some setup stage I'm missing? If so, it seems there's very little that touches the mesh before this point; it seems that pretty much something's already off at _refine_elements(), since this all seems separated from the partitioner. I tried investigating some of the make_elems_parallel_consistent calls and the libmesh_assert_valid_parallel_ids() call right after, but so far no luck.

One nagging/lingering issue I have is with us using PETSc flags for MPI. In the PETSc build scripts that I originally was using, we had to pass an extra -lpmi to the PETSc LDFLAGS on the local cluster. The recent gcc7.2/Mvapich2 upgrade came with pmi2, which I'm to also pass to slurm, and so in the PETSc builds I supply -lpmi2 now. On the standalone MPI builds I tried exporting libmesh_LDFLAGS and libmesh_LIBS to link against this library, but was not sure if it was picked up, as -lpmi2 didn't show in the libmesh_optional_LIBS in the configure summaries like it does when linking through PETSc. I'm quite unfamiliar with the pmi library in general, but I still have lingering fears this all could somehow stem from it.

Thanks for any info you can provide,
Boris

Stack Trace
=======
#0  __cxxabiv1::__cxa_throw (obj=obj@entry=0x9040e0, tinfo=0x407e68 <typeinfo for libMesh::LogicError>, tinfo@entry=0x7ffff76337b0 <typeinfo for libMesh::LogicError>, dest=0x403250 <libMesh::LogicError::~LogicError()>, dest@entry=0x7ffff6259370 <libMesh::LogicError::~LogicError()>) at ../../../../gcc/libstdc++-v3/libsupc++/eh_throw.cc:75
#1  0x00007ffff6a98f8a in libMesh::Parallel::Packing<libMesh::Elem*>::unpack<__gnu_cxx::__normal_iterator<unsigned long const*, std::vector<unsigned long, std::allocator<unsigned long> > >, libMesh::MeshBase> (in=..., mesh=mesh@entry=0x64a0e0) at ../source/src/parallel/parallel_elem.C:474
#2  0x00007ffff6a995c5 in libMesh::Parallel::Packing<libMesh::Elem*>::unpack<__gnu_cxx::__normal_iterator<unsigned long const*, std::vector<unsigned long, std::allocator<unsigned long> > >, libMesh::DistributedMesh> (in=..., in@entry=987654321, mesh=mesh@entry=0x64a0e0) at ../source/src/parallel/parallel_elem.C:814
#3  0x00007ffff688f8ab in unpack_range<libMesh::DistributedMesh, unsigned long, libMesh::mesh_inserter_iterator<libMesh::Elem>, libMesh::Elem*> (out_iter=..., context=<optimized out>, buffer=std::vector of length 195733, capacity 195733 = {...}) at ./include/libmesh/parallel_implementation.h:607
#4  libMesh::Parallel::Communicator::receive_packed_range<libMesh::DistributedMesh, libMesh::mesh_inserter_iterator<libMesh::Elem>, libMesh::Elem*> (this=0x649188, src_processor_id=src_processor_id@entry=4294967294, context=context@entry=0x64a0e0, out_iter=out_iter@entry=..., output_type=output_type@entry=0x0, tag=...) at ./include/libmesh/parallel_implementation.h:2761
#5  0x00007ffff687d70e in libMesh::MeshCommunication::redistribute (this=this@entry=0x7fffffff9cdf, mesh=..., newly_coarsened_only=newly_coarsened_only@entry=false) at ../source/src/mesh/mesh_communication.C:500
#6  0x00007ffff67ef2f2 in libMesh::DistributedMesh::redistribute (this=0x64a0e0) at ../source/src/mesh/distributed_mesh.C:835
#7  0x00007ffff6acb1e6 in libMesh::Partitioner::partition (this=<optimized out>, mesh=..., n=<optimized out>) at ../source/src/partitioning/partitioner.C:85
#8  0x00007ffff685aa6e in libMesh::MeshBase::partition (this=this@entry=0x64a0e0, n_parts=2) at ../source/src/mesh/mesh_base.C:485
#9  0x00007ffff685f8fb in partition (this=0x64a0e0) at ./include/libmesh/mesh_base.h:728
#10 libMesh::MeshBase::prepare_for_use (this=0x64a0e0, skip_renumber_nodes_and_elements=skip_renumber_nodes_and_elements@entry=false, skip_find_neighbors=skip_find_neighbors@entry=false) at ../source/src/mesh/mesh_base.C:273
#11 0x00007ffff6938b02 in libMesh::MeshRefinement::uniformly_refine (this=this@entry=0x7fffffffa9e0, n=5) at ../source/src/mesh/mesh_refinement.C:1723
#12 0x00007ffff7a8abc5 in GRINS::MeshBuilder::do_mesh_refinement_from_input (this=this@entry=0x646820, input=..., comm=..., mesh=...) at ../../source/src/solver/src/mesh_builder.C:393
#13 0x00007ffff7a8bc6d in GRINS::MeshBuilder::build (this=0x646820, input=..., comm=...) at ../../source/src/solver/src/mesh_builder.C:167
#14 0x00007ffff7aa20ed in GRINS::SimulationBuilder::build_mesh (this=this@entry=0x7fffffffb108, input=..., comm=...) at ../../source/src/solver/src/simulation_builder.C:68
#15 0x00007ffff7a947d2 in GRINS::Simulation::Simulation (this=0x89b910, input=..., sim_builder=..., comm=...) at ../../source/src/solver/src/simulation.C:123
#16 0x00007ffff7acdae2 in GRINS::Runner::init (this=this@entry=0x7fffffffb100) at ../../source/src/solver/src/runner.C:59
#17 0x0000000000402c9d in main (argc=<optimized out>, argv=<optimized out>) at ../../source/src/apps/grins.C:31

On Thu, Nov 9, 2017 at 9:07 AM, Roy Stogner <roy...@ic...> wrote:

> On Mon, 6 Nov 2017, Boris Boutkov wrote:
>
>> In some preliminary testing I encountered issues with the LinearPartitioner,
>
> Could you be more specific? That partitioner is dead simple, so I wouldn't have expected to see many bugs, but it's also awful, so if there were many bugs there's probably been nobody to use it and encounter them for a decade.
>
>> and the SFCPartitioner complained it wasn't enabled despite me configuring using --enable-everything. Any ideas if there's anything simple I could have forgotten?
>
> Yeah: the SFC partitioner isn't under an LGPL-friendly license, so unless you add --disable-strict-lgpl to your configure line, it still gets dropped for that reason.
> ---
> Roy
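For reference, roughly what the standalone-MPI attempt described above amounts to as commands. The library path is a placeholder, and whether libmesh_LIBS is honored at all is exactly the open question here, so treat this as a sketch rather than a known-good recipe:

    # Export the PMI library through the libmesh_* environment variables before
    # configuring, then check the configure summary: -lpmi2 showing up in
    # libmesh_optional_LIBS is the signal that it was actually picked up.
    export libmesh_LDFLAGS="-L/opt/slurm/lib"   # placeholder path to libpmi2
    export libmesh_LIBS="-lpmi2"
    ./configure --with-mpi=$MPI_DIR   # plus your usual options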
From: Roy S. <roy...@ic...> - 2017-11-09 14:07:51
|
On Mon, 6 Nov 2017, Boris Boutkov wrote: > In some preliminary testing I encountered issues with the > LinearPartitioner, Could you be more specific? That partitioner is dead simple, so I wouldn't have expected to see many bugs, but it's also awful, so if there were many bugs there's probably been nobody to use it and encounter them for a decade. > and the SFCPartitioner complained it wasnt enabled despite me > configuring using --enable-everything. Any ideas if theres anything > simple I could have forgotten? Yeah: the SFC partitioner isn't under an LGPL-friendly license, so unless you add --disable-strict-lgpl to your configure line, it still gets dropped for that reason. --- Roy |
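Put together, the configure line this answer implies looks roughly like the following; any other flags are whatever you normally pass:

    # --enable-everything alone still drops the non-LGPL-compatible SFC
    # partitioner; adding --disable-strict-lgpl brings it back in.
    ./configure --enable-everything --disable-strict-lgpl --with-mpi=$MPI_DIR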
From: Boris B. <bor...@bu...> - 2017-11-06 18:54:47
|
Hello all,

As part of further investigations for #1468 <https://github.com/libMesh/libmesh/issues/1468>, I've started to experiment with various partitioners to try and further narrow down the problem. While the issue apparently appears with both PETSc's ParMETIS as well as the libMesh contrib ParMETIS, I wanted to clarify whether this affects other parallel Partitioners, and wanted to double-check which ones are expected to support builds with MPI enabled.

In some preliminary testing I encountered issues with the LinearPartitioner, and the SFCPartitioner complained it wasn't enabled despite me configuring using --enable-everything. Any ideas if there's anything simple I could have forgotten?

Thanks for any info in advance,

- Boris Boutkov
From: simone <sob...@ya...> - 2017-10-09 16:03:58
|
Dear libMesh developers,

I apologize if this is not the right place to report this problem, or if it has already been raised by someone else. I searched previous digests for a solution, but I did not find anything similar.

I'm having trouble using the GetPot library (not the one included in the libMesh installation) paired with libMesh. I have isolated the problem in the attached pack; when running it I get a segfault at the exit of the program. I know there's a conflict between the GetPot class definitions, but I don't understand why this occurs even without including the libMesh getpot.h header. I'm forced to use the GetPot library in my program. I guess one possible solution could be to install the libMesh library with the variable GETPOT_NAMESPACE defined, but I'd like to avoid that because my code is part of a library that can be delivered to external users using their standard installation of libMesh, and I can't force them to recompile their installed package. Is there another possible solution?

Another question, about the public method libMesh::GnuPlotIO::write(fname): it seems to do nothing but raise an exception, and there's no way to avoid it. Is this the expected behaviour? Is there a way to print a 1d mesh to a gnuplot-readable file, as the method is described to do in the class documentation?

I thank you in advance for your answer.

Simone
From: Boris B. <bor...@bu...> - 2017-09-18 17:23:05
|
Intel MPI Library Version 2017.0.1

I tried devel mode earlier and it was getting stuck at the same spot inside the communicate_bins() call; specifically when trying this->comm().get() in the MPI_Gatherv, which was inlined away. I saw no output indicating something was wrong in terms of asserts, but I'll try dbg and see what else I can dig up, and I'll report back.

On Mon, Sep 18, 2017 at 12:54 PM, Roy Stogner <roy...@ic...> wrote:

> On Mon, 18 Sep 2017, Boris Boutkov wrote:
>
>> - I often attach a gdb session to the running program and notice the commonly recurring stack (see below). It seems the issue is always around the HilbertIndices parallel sort communicate_bins() with invalid looking communicator ids in the above PMPI_Allgather calls.
>
> If it's replicable for particular mesh sizes, could you try running in devel (or better, dbg) modes and see if you get any more informative output?
>
> What MPI implementation+version are you using?
> ---
> Roy
From: Roy S. <roy...@ic...> - 2017-09-18 16:54:32
|
On Mon, 18 Sep 2017, Boris Boutkov wrote: > - I often attach a gdb session to the running program and notice > the commonly recurring stack (see below). It seems the issue is > always around the HilbertIndices parallel sort communicate_bins() > with invalid looking communicator ids in the above PMPI_Allgather > calls. If it's replicable for particular mesh sizes, could you try running in devel (or better, dbg) modes and see if you get any more informative output? What MPI implementation+version are you using? --- Roy |
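A hedged sketch of rebuilding libMesh so the devel/dbg assertions are available; the --with-methods spelling is from memory, so confirm it against ./configure --help before relying on it:

    # Rebuild with the extra methods enabled, then re-run the hanging case
    # against the devel or dbg build to get more informative assertions.
    ./configure --with-methods="opt devel dbg" --with-mpi=$MPI_DIR
    make && make install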
From: Boris B. <bor...@bu...> - 2017-09-18 16:20:09
|
Hello all,

I've run into an issue where the ParmetisPartitioner seems to occasionally hang during initialization on UB's CCR cluster. Unfortunately, this bug is a bit slippery and I haven't always had the best results reproducing it completely consistently. The testing scenario is simply me uniformly refining (through a GRINS input file) a grid a number of times to prepare for some later multigrid computations. I've seen the mentioned issue most commonly with square starting grids of 75^2 elements when running with 7 processors on one node, but more consistently I've seen the issue with 25^2 elements on two nodes with four processors. Small perturbations to the processor count and number of starting elements don't seem to trigger the bug, so it's something quite specific, and, as odd as it sounds, I've had some runs go through fine under seemingly identical settings.

Some notes:

- I've noticed the same issue both configuring --with-metis=PETSc as well as without. I've attached a sample config.log in case it's useful.

- I often attach a gdb session to the running program and notice the commonly recurring stack (see below). It seems the issue is always around the HilbertIndices parallel sort communicate_bins() with invalid looking communicator ids in the above PMPI_Allgather calls.

Other than these couple of hints, though, I'm at a loss as to what could cause such weird behaviour. I've never managed to reproduce this on my local development machines, so it seems like it could be some machine / MPI specific configuration thing, but after a lot of reconfiguring attempts I'm running out of things to try. Any chance anyone has seen such behaviour, or has any ideas as to what I can investigate further to get some more useful info?

Thanks for any help!

- Boris

#0  poll_all_fboxes (cell=<optimized out>) at ../../src/mpid/ch3/channels/nemesis/include/mpid_nem_fbox.h:94
#1  MPID_nem_mpich_blocking_recv (cell=<optimized out>, in_fbox=<optimized out>, completions=<optimized out>) at ../../src/mpid/ch3/channels/nemesis/include/mpid_nem_inline.h:1232
#2  PMPIDI_CH3I_Progress (progress_state=0x7ffcb5bff384, is_blocking=0) at ../../src/mpid/ch3/channels/nemesis/src/ch3_progress.c:589
#3  0x00007f1596f6ca73 in MPIC_Sendrecv (sendbuf=0x7ffcb5bff384, sendcount=0, sendtype=0, dest=1, sendtag=5, recvbuf=0x7, recvcount=7, recvtype=1275069445, source=2, recvtag=7, comm_ptr=0x7f159792e780 <MPID_Comm_builtin>, status=0x7ffcb5bff458, errflag=0x7ffcb5bff5f8) at ../../src/mpi/coll/helper_fns.c:268
#4  0x00007f1596d6297f in MPIR_Allgather_intra (sendbuf=0x7ffcb5bff384, sendcount=0, sendtype=0, recvbuf=0x1, recvcount=5, recvtype=7, comm_ptr=0x5265c, errflag=0x1) at ../../src/mpi/coll/allgather.c:257
#5  0x00007f1596d6548e in PMPI_Allgather (sendbuf=0x7ffcb5bff384, sendcount=0, sendtype=0, recvbuf=0x1, recvcount=5, recvtype=7, comm=-1245709088) at ../../src/mpi/coll/allgather.c:858
#6  0x00007f15a1491d17 in libMesh::Parallel::Sort<std::pair<Hilbert::HilbertIndices, unsigned long>, unsigned int>::communicate_bins() () from /projects/academic/pbauman/borisbou/software/planex/libmesh/install/lib/libmesh_opt.so.0
#7  0x00007f15a149d029 in libMesh::Parallel::Sort<std::pair<Hilbert::HilbertIndices, unsigned long>, unsigned int>::sort() () from /projects/academic/pbauman/borisbou/software/planex/libmesh/install/lib/libmesh_opt.so.0
#8  0x00007f15a1312d17 in void libMesh::MeshCommunication::find_global_indices<libMesh::MeshBase::const_element_iterator>(libMesh::Parallel::Communicator const&, libMesh::BoundingBox const&, libMesh::MeshBase::const_element_iterator const&, libMesh::MeshBase::const_element_iterator const&, std::vector<unsigned int, std::allocator<unsigned int> >&) const () from /projects/academic/pbauman/borisbou/software/planex/libmesh/install/lib/libmesh_opt.so.0
#9  0x00007f15a14b2281 in libMesh::ParmetisPartitioner::initialize(libMesh::MeshBase const&, unsigned int) () from /projects/academic/pbauman/borisbou/software/planex/libmesh/install/lib/libmesh_opt.so.0
#10 0x00007f15a14b3d2d in libMesh::ParmetisPartitioner::_do_repartition(libMesh::MeshBase&, unsigned int) () from /projects/academic/pbauman/borisbou/software/planex/libmesh/install/lib/libmesh_opt.so.0
#11 0x00007f15a14bc2fe in libMesh::Partitioner::partition(libMesh::MeshBase&, unsigned int) () from /projects/academic/pbauman/borisbou/software/planex/libmesh/install/lib/libmesh_opt.so.0
#12 0x00007f15a12f05c4 in libMesh::MeshBase::prepare_for_use(bool, bool) () from /projects/academic/pbauman/borisbou/software/planex/libmesh/install/lib/libmesh_opt.so.0
#13 0x00007f15a137c7e0 in libMesh::MeshRefinement::uniformly_refine(unsigned int) () from /projects/academic/pbauman/borisbou/software/planex/libmesh/install/lib/libmesh_opt.so.0
#14 0x00007f15a230390b in GRINS::MeshBuilder::do_mesh_refinement_from_input(GetPot const&, libMesh::Parallel::Communicator const&, libMesh::UnstructuredMesh&) const () from /projects/academic/pbauman/borisbou/software/planex/grins/install/opt/lib/libgrins.so.0
#15 0x00007f15a230614c in GRINS::MeshBuilder::build(GetPot const&, libMesh::Parallel::Communicator const&) () from /projects/academic/pbauman/borisbou/software/planex/grins/install/opt/lib/libgrins.so.0
#16 0x00007f15a231e86d in GRINS::SimulationBuilder::build_mesh(GetPot const&, libMesh::Parallel::Communicator const&) () from /projects/academic/pbauman/borisbou/software/planex/grins/install/opt/lib/libgrins.so.0
#17 0x00007f15a2311bfc in GRINS::Simulation::Simulation(GetPot const&, GetPot&, GRINS::SimulationBuilder&, libMesh::Parallel::Communicator const&) () from /projects/academic/pbauman/borisbou/software/planex/grins/install/opt/lib/libgrins.so.0
#18 0x0000000000407312 in main ()
From: Derek G. <fri...@gm...> - 2017-07-07 18:28:23
|
Generally we are recommending: Linux: OpenSpeedShop and Intel VTune Mac: Instruments (It's a tool that comes with XCode) Derek On Fri, Jul 7, 2017 at 2:11 PM Roy Stogner <roy...@ic...> wrote: > > On Fri, 7 Jul 2017, Fabio Canesin wrote: > > > I`m going some profiling of our code with focus on energy and just > > wanted to know if there is any recommended method, library or > > something that has been a wish in this regards. > > For CPU-limited parts of the code, perf is very useful. I believe > our "oprofile" compilation method generates executables that perf (or > oprofile, of course) is happy with. > > valgrind is extremely useful, but only after you've generated test > cases that are simultaneously fast enough to run through valgrind and > still representative of real problems, which may be trivial or may be > impossible for you. > > For MPI-limited parts of the code, I think the INL folks have some > experience; hopefully they'll chime in. > --- > Roy > > > ------------------------------------------------------------------------------ > Check out the vibrant tech community on one of the world's most > engaging tech sites, Slashdot.org! http://sdm.link/slashdot > _______________________________________________ > Libmesh-devel mailing list > Lib...@li... > https://lists.sourceforge.net/lists/listinfo/libmesh-devel > |
From: Roy S. <roy...@ic...> - 2017-07-07 18:11:18
|
On Fri, 7 Jul 2017, Fabio Canesin wrote: > I`m going some profiling of our code with focus on energy and just > wanted to know if there is any recommended method, library or > something that has been a wish in this regards. For CPU-limited parts of the code, perf is very useful. I believe our "oprofile" compilation method generates executables that perf (or oprofile, of course) is happy with. valgrind is extremely useful, but only after you've generated test cases that are simultaneously fast enough to run through valgrind and still representative of real problems, which may be trivial or may be impossible for you. For MPI-limited parts of the code, I think the INL folks have some experience; hopefully they'll chime in. --- Roy |
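As a rough illustration of the perf route mentioned above, assuming the application's build system selects libMesh's profiling-friendly compilation method through a METHOD variable and suffixes the binary accordingly (both assumptions about the local setup; the perf commands themselves are standard):

    # Build with the oprofile-friendly method, then sample and inspect.
    METHOD=oprof make                       # method name assumed; check your Makefile
    perf record -g ./myapp-oprof input.in   # binary and input names are placeholders
    perf report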