From: Renato P. <re...@gm...> - 2019-05-12 02:09:46
|
Hi

It seems that partitioning is taking a lot of time. If I skip it, it runs much faster:

>> mesh.skip_partitioning(true);

Does that make any sense?

Renato

On Sat, May 11, 2019 at 10:59 PM Renato Poli <re...@gm...> wrote:
> Hi Kirk,
>
> I see there is something related to the parallelization.
> I am using mpirun.mpich.
> With a single processor, it runs much faster than with 4 processors.
> Please find data below.
>
> Why would parallel reading be so much slower?
> Any suggestions?
>
> XDR - 1 processor
> # Stopwatch "LibMesh::read": 12.7637 s
> XDR - 4 processors
> # Stopwatch "LibMesh::read": 135.473 s
> EXO - 1 processor
> # Stopwatch "LibMesh::read": 0.294671 s
> EXO - 4 processors
> # Stopwatch "LibMesh::read": 198.897 s
>
> This is the mesh:
> ======
> Mesh Information:
>  elem_dimensions()={2}
>  spatial_dimension()=2
>  n_nodes()=40147
>  n_local_nodes()=40147
>  n_elem()=19328
>  n_local_elem()=19328
>  n_active_elem()=19328
>  n_subdomains()=1
>  n_partitions()=1
>  n_processors()=1
>  n_threads()=1
>  processor_id()=0
>
> On Sat, May 11, 2019 at 7:15 PM Renato Poli <re...@gm...> wrote:
>> Thanks.
>> I am currently running on a virtual machine - not sure MPI is getting
>> along with that.
>> I will try other approaches and bring more information if necessary.
>>
>> rgds,
>> Renato
>>
>> On Sat, May 11, 2019 at 5:49 PM Kirk, Benjamin (JSC-EG311) <ben...@na...> wrote:
>>
>>> Definitely not right, but that seems like something in your machine or
>>> filesystem.
>>>
>>> You can use the "meshtool-opt" command to convert it to XDR and try that
>>> for comparison. We've got users who routinely read massive meshes with
>>> ExodusII, so I'm skeptical of a performance regression.
>>>
>>> -Ben
>>>
>>> ------------------------------
>>> On: 11 May 2019 15:24, "Renato Poli" <re...@gm...> wrote:
>>>
>>> Hi
>>>
>>> I am reading in a mesh of 20,000 elements.
>>> I am using Exodus format.
>>> It takes up to 4 minutes.
>>> Is that right?
>>> How can I enhance performance?
>>>
>>> Thanks,
>>> Renato
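For readers landing on this thread later: a minimal sketch of the workaround described above, assuming a placeholder Exodus file name ("input.exo"). Whether skipping partitioning is acceptable depends on whether the application later relies on a balanced parallel distribution.

    #include "libmesh/libmesh.h"
    #include "libmesh/mesh.h"

    using namespace libMesh;

    int main (int argc, char ** argv)
    {
      LibMeshInit init (argc, argv);

      Mesh mesh (init.comm());

      // Skip the repartitioning step that normally runs inside
      // prepare_for_use() after a read; the mesh keeps whatever
      // partitioning (possibly none) the file provided.
      mesh.skip_partitioning(true);

      mesh.read("input.exo");   // placeholder Exodus file name

      mesh.print_info();
      return 0;
    }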
From: Renato P. <re...@gm...> - 2019-05-11 20:23:55
|
Hi

I am reading in a mesh of 20,000 elements.
I am using Exodus format.
It takes up to 4 minutes.
Is that right?
How can I enhance performance?

Thanks,
Renato
From: Stogner, R. H <roy...@ic...> - 2019-05-10 02:59:44
|
On Thu, 9 May 2019, Alexander Lindsay wrote:

> I'm getting the helgrind error below. Is this a false positive?

My intuition says "no, this is a real bug", but I'm having trouble
figuring out what's really going on here.

There's a race condition between a read and a write to the same spot in
the same PetscVector, from two different threads working on the
SortAndCopy::operator()? But I don't immediately see how that's
possible. That operator reads from one PetscVector and writes to a
different PetscVector.

Can you set up a case that (at least usually) reproduces the problem?

Thanks,
---
Roy
From: Alexander L. <ale...@gm...> - 2019-05-09 20:26:46
|
I'm getting the helgrind error below. Is this a false positive? ==27566== ---Thread-Announcement------------------------------------------ ==27566== ==27566== Thread #153 was created ==27566== at 0xD9433DE: clone (clone.S:74) ==27566== by 0xCB7B149: create_thread (createthread.c:102) ==27566== by 0xCB7CE83: pthread_create@@GLIBC_2.2.5 (pthread_create.c:679) ==27566== by 0x4C34BB7: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so) ==27566== by 0x87E2B3D: void libMesh::Threads::parallel_reduce<libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*>, libMesh::GenericProjector<libMesh::OldSolutionValue<doubl e, &(void libMesh::FEMContext::point_value<double>(unsigned int, libMesh::Point const&, double&, double) const)>, libMesh::OldSolutionValue<libMesh::VectorValue<double>, &(void libMesh::FEMContext::point_gradie nt<libMesh::VectorValue<double> >(unsigned int, libMesh::Point const&, double&, double) const)>, double, libMesh::VectorSetAction<double> >::SortAndCopy>(libMesh::StoredRange<libMesh::MeshBase::const_element_it erator, libMesh::Elem const*> const&, libMesh::GenericProjector<libMesh::OldSolutionValue<double, &(void libMesh::FEMContext::point_value<double>(unsigned int, libMesh::Point const&, double&, double) const)>, l ibMesh::OldSolutionValue<libMesh::VectorValue<double>, &(void libMesh::FEMContext::point_gradient<libMesh::VectorValue<double> >(unsigned int, libMesh::Point const&, double&, double) const)>, double, libMesh::V ectorSetAction<double> >::SortAndCopy&) (threads_pthread.h:444) ==27566== by 0x880032A: libMesh::GenericProjector<libMesh::OldSolutionValue<double, &(void libMesh::FEMContext::point_value<double>(unsigned int, libMesh::Point const&, double&, double) const)>, libMesh::Old SolutionValue<libMesh::VectorValue<double>, &(void libMesh::FEMContext::point_gradient<libMesh::VectorValue<double> >(unsigned int, libMesh::Point const&, double&, double) const)>, double, libMesh::VectorSetAct ion<double> >::project(libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*> const&) (generic_projector.h:955) ==27566== by 0x87BE8AB: libMesh::System::project_vector(libMesh::NumericVector<double> const&, libMesh::NumericVector<double>&, int) const (system_projection.C:381) ==27566== by 0x87BF561: libMesh::System::project_vector(libMesh::NumericVector<double>&, int) const (system_projection.C:257) ==27566== by 0x87991DF: libMesh::System::restrict_vectors() (system.C:334) ==27566== by 0x8753C77: libMesh::EquationSystems::reinit_solutions() (equation_systems.C:210) ==27566== by 0x87540A0: libMesh::EquationSystems::reinit() (equation_systems.C:123) ==27566== by 0x6F38FFD: FEProblemBase::reinitBecauseOfGhostingOrNewGeomObjects() (FEProblemBase.C:3284) ==27566== ==27566== ---Thread-Announcement------------------------------------------ ==27566== ==27566== Thread #152 was created ==27566== at 0xD9433DE: clone (clone.S:74) ==27566== by 0xCB7B149: create_thread (createthread.c:102) ==27566== by 0xCB7CE83: pthread_create@@GLIBC_2.2.5 (pthread_create.c:679) ==27566== by 0x4C34BB7: ??? 
(in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so) ==27566== by 0x87E2B21: void libMesh::Threads::parallel_reduce<libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*>, libMesh::GenericProjector<libMesh::OldSolutionValue<doubl e, &(void libMesh::FEMContext::point_value<double>(unsigned int, libMesh::Point const&, double&, double) const)>, libMesh::OldSolutionValue<libMesh::VectorValue<double>, &(void libMesh::FEMContext::point_gradie nt<libMesh::VectorValue<double> >(unsigned int, libMesh::Point const&, double&, double) const)>, double, libMesh::VectorSetAction<double> >::SortAndCopy>(libMesh::StoredRange<libMesh::MeshBase::const_element_it erator, libMesh::Elem const*> const&, libMesh::GenericProjector<libMesh::OldSolutionValue<double, &(void libMesh::FEMContext::point_value<double>(unsigned int, libMesh::Point const&, double&, double) const)>, l ibMesh::OldSolutionValue<libMesh::VectorValue<double>, &(void libMesh::FEMContext::point_gradient<libMesh::VectorValue<double> >(unsigned int, libMesh::Point const&, double&, double) const)>, double, libMesh::V ectorSetAction<double> >::SortAndCopy&) (threads_pthread.h:444) ==27566== by 0x880032A: libMesh::GenericProjector<libMesh::OldSolutionValue<double, &(void libMesh::FEMContext::point_value<double>(unsigned int, libMesh::Point const&, double&, double) const)>, libMesh::Old SolutionValue<libMesh::VectorValue<double>, &(void libMesh::FEMContext::point_gradient<libMesh::VectorValue<double> >(unsigned int, libMesh::Point const&, double&, double) const)>, double, libMesh::VectorSetAct ion<double> >::project(libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*> const&) (generic_projector.h:955) ==27566== by 0x87BE8AB: libMesh::System::project_vector(libMesh::NumericVector<double> const&, libMesh::NumericVector<double>&, int) const (system_projection.C:381) ==27566== by 0x87BF561: libMesh::System::project_vector(libMesh::NumericVector<double>&, int) const (system_projection.C:257) ==27566== by 0x87991DF: libMesh::System::restrict_vectors() (system.C:334) ==27566== by 0x8753C77: libMesh::EquationSystems::reinit_solutions() (equation_systems.C:210) ==27566== by 0x87540A0: libMesh::EquationSystems::reinit() (equation_systems.C:123) ==27566== by 0x6F38FFD: FEProblemBase::reinitBecauseOfGhostingOrNewGeomObjects() (FEProblemBase.C:3284) ==27566== ==27566== ---Thread-Announcement------------------------------------------ ==27566== ==27566== Thread #1 is the program's root thread ==27566== ==27566== ---------------------------------------------------------------- ==27566== ==27566== Lock at 0x12D10458 was first observed ==27566== at 0x4C321BC: ??? 
(in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so) ==27566== by 0x85B4E8B: __gthread_mutex_lock (gthr-default.h:748) ==27566== by 0x85B4E8B: lock (std_mutex.h:103) ==27566== by 0x85B4E8B: lock_guard (std_mutex.h:162) ==27566== by 0x85B4E8B: libMesh::PetscVector<double>::_get_array(bool) const (petsc_vector.C:1374) ==27566== by 0x5934483: libMesh::PetscVector<double>::get(std::vector<unsigned int, std::allocator<unsigned int> > const&, double*) const (petsc_vector.h:1090) ==27566== by 0x87C80BB: get (numeric_vector.h:828) ==27566== by 0x87C80BB: libMesh::OldSolutionValue<double, &(void libMesh::FEMContext::point_value<double>(unsigned int, libMesh::Point const&, double&, double) const)>::eval_old_dofs(libMesh::Elem const&, unsigned int, unsigned int, std::vector<unsigned int, std::allocator<unsigned int> >&, std::vector<double, std::allocator<double> >&) (generic_projector.h:715) ==27566== by 0x87E0E4A: libMesh::GenericProjector<libMesh::OldSolutionValue<double, &(void libMesh::FEMContext::point_value<double>(unsigned int, libMesh::Point const&, double&, double) const)>, libMesh::OldSolutionValue<libMesh::VectorValue<double>, &(void libMesh::FEMContext::point_gradient<libMesh::VectorValue<double> >(unsigned int, libMesh::Point const&, double&, double) const)>, double, libMesh::VectorSetAction<double> >::SortAndCopy::operator()(libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*> const&) (generic_projector.h:1441) ==27566== by 0x87E34B2: void* libMesh::Threads::run_body<libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*>, libMesh::GenericProjector<libMesh::OldSolutionValue<double, &(void libMesh::FEMContext::point_value<double>(unsigned int, libMesh::Point const&, double&, double) const)>, libMesh::OldSolutionValue<libMesh::VectorValue<double>, &(void libMesh::FEMContext::point_gradient<libMesh::VectorValue<double> >(unsigned int, libMesh::Point const&, double&, double) const)>, double, libMesh::VectorSetAction<double> >::SortAndCopy>(void*) (threads_pthread.h:236) ==27566== by 0x4C34DB6: ??? 
(in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so) ==27566== by 0xCB7C6B9: start_thread (pthread_create.c:333) ==27566== by 0xD94341C: clone (clone.S:109) ==27566== Address 0x12d10458 is 72 bytes inside a block of size 176 alloc'd ==27566== at 0x4C2F50F: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so) ==27566== by 0x85ADAF1: make_unique<libMesh::PetscVector<double>, const libMesh::Parallel::Communicator&, libMesh::ParallelType> (unique_ptr.h:825) ==27566== by 0x85ADAF1: libMesh::NumericVector<double>::build(libMesh::Parallel::Communicator const&, libMesh::SolverPackage) (numeric_vector.C:62) ==27566== by 0x87BE4EB: libMesh::System::project_vector(libMesh::NumericVector<double> const&, libMesh::NumericVector<double>&, int) const (system_projection.C:342) ==27566== by 0x87BF561: libMesh::System::project_vector(libMesh::NumericVector<double>&, int) const (system_projection.C:257) ==27566== by 0x87991DF: libMesh::System::restrict_vectors() (system.C:334) ==27566== by 0x8753C77: libMesh::EquationSystems::reinit_solutions() (equation_systems.C:210) ==27566== by 0x87540A0: libMesh::EquationSystems::reinit() (equation_systems.C:123) ==27566== by 0x6F38FFD: FEProblemBase::reinitBecauseOfGhostingOrNewGeomObjects() (FEProblemBase.C:3284) ==27566== by 0x6F3A292: FEProblemBase::initialSetup() (FEProblemBase.C:770) ==27566== by 0x71FC4D1: MooseApp::executeExecutioner() (MooseApp.C:849) ==27566== by 0x59DF420: OutputApp::executeExecutioner() (OutputApp.C:144) ==27566== by 0x71FC56C: MooseApp::run() (MooseApp.C:961) ==27566== Block was alloc'd by thread #1 ==27566== ==27566== Possible data race during read of size 1 at 0x12D10430 by thread #153 ==27566== Locks held: none ==27566== at 0x85B4DF7: load (atomic_base.h:396) ==27566== by 0x85B4DF7: operator bool (atomic:86) ==27566== by 0x85B4DF7: libMesh::PetscVector<double>::_get_array(bool) const (petsc_vector.C:1363) ==27566== by 0x5934483: libMesh::PetscVector<double>::get(std::vector<unsigned int, std::allocator<unsigned int> > const&, double*) const (petsc_vector.h:1090) ==27566== by 0x87C80BB: get (numeric_vector.h:828) ==27566== by 0x87C80BB: libMesh::OldSolutionValue<double, &(void libMesh::FEMContext::point_value<double>(unsigned int, libMesh::Point const&, double&, double) const)>::eval_old_dofs(libMesh::Elem const&, unsigned int, unsigned int, std::vector<unsigned int, std::allocator<unsigned int> >&, std::vector<double, std::allocator<double> >&) (generic_projector.h:715) ==27566== by 0x87E0E4A: libMesh::GenericProjector<libMesh::OldSolutionValue<double, &(void libMesh::FEMContext::point_value<double>(unsigned int, libMesh::Point const&, double&, double) const)>, libMesh::OldSolutionValue<libMesh::VectorValue<double>, &(void libMesh::FEMContext::point_gradient<libMesh::VectorValue<double> >(unsigned int, libMesh::Point const&, double&, double) const)>, double, libMesh::VectorSetAction<double> >::SortAndCopy::operator()(libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*> const&) (generic_projector.h:1441) ==27566== by 0x87E34B2: void* libMesh::Threads::run_body<libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*>, libMesh::GenericProjector<libMesh::OldSolutionValue<double, &(void libMesh::FEMContext::point_value<double>(unsigned int, libMesh::Point const&, double&, double) const)>, libMesh::OldSolutionValue<libMesh::VectorValue<double>, &(void libMesh::FEMContext::point_gradient<libMesh::VectorValue<double> >(unsigned int, libMesh::Point 
const&, double&, double) const)>, double, libMesh::VectorSetAction<double> >::SortAndCopy>(void*) (threads_pthread.h:236) ==27566== by 0x4C34DB6: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so) ==27566== by 0xCB7C6B9: start_thread (pthread_create.c:333) ==27566== by 0xD94341C: clone (clone.S:109) ==27566== ==27566== This conflicts with a previous write of size 1 by thread #152 ==27566== Locks held: 1, at address 0x12D10458 ==27566== at 0x85B4F39: store (atomic_base.h:374) ==27566== by 0x85B4F39: store (atomic:103) ==27566== by 0x85B4F39: libMesh::PetscVector<double>::_get_array(bool) const (petsc_vector.C:1442) ==27566== by 0x5934483: libMesh::PetscVector<double>::get(std::vector<unsigned int, std::allocator<unsigned int> > const&, double*) const (petsc_vector.h:1090) ==27566== by 0x87C80BB: get (numeric_vector.h:828) ==27566== by 0x87C80BB: libMesh::OldSolutionValue<double, &(void libMesh::FEMContext::point_value<double>(unsigned int, libMesh::Point const&, double&, double) const)>::eval_old_dofs(libMesh::Elem const&, unsigned int, unsigned int, std::vector<unsigned int, std::allocator<unsigned int> >&, std::vector<double, std::allocator<double> >&) (generic_projector.h:715) ==27566== by 0x87E0E4A: libMesh::GenericProjector<libMesh::OldSolutionValue<double, &(void libMesh::FEMContext::point_value<double>(unsigned int, libMesh::Point const&, double&, double) const)>, libMesh::OldSolutionValue<libMesh::VectorValue<double>, &(void libMesh::FEMContext::point_gradient<libMesh::VectorValue<double> >(unsigned int, libMesh::Point const&, double&, double) const)>, double, libMesh::VectorSetAction<double> >::SortAndCopy::operator()(libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*> const&) (generic_projector.h:1441) ==27566== by 0x87E34B2: void* libMesh::Threads::run_body<libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*>, libMesh::GenericProjector<libMesh::OldSolutionValue<double, &(void libMesh::FEMContext::point_value<double>(unsigned int, libMesh::Point const&, double&, double) const)>, libMesh::OldSolutionValue<libMesh::VectorValue<double>, &(void libMesh::FEMContext::point_gradient<libMesh::VectorValue<double> >(unsigned int, libMesh::Point const&, double&, double) const)>, double, libMesh::VectorSetAction<double> >::SortAndCopy>(void*) (threads_pthread.h:236) ==27566== by 0x4C34DB6: ??? 
(in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so) ==27566== by 0xCB7C6B9: start_thread (pthread_create.c:333) ==27566== by 0xD94341C: clone (clone.S:109) ==27566== Address 0x12d10430 is 32 bytes inside a block of size 176 alloc'd ==27566== at 0x4C2F50F: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so) ==27566== by 0x85ADAF1: make_unique<libMesh::PetscVector<double>, const libMesh::Parallel::Communicator&, libMesh::ParallelType> (unique_ptr.h:825) ==27566== by 0x85ADAF1: libMesh::NumericVector<double>::build(libMesh::Parallel::Communicator const&, libMesh::SolverPackage) (numeric_vector.C:62) ==27566== by 0x87BE4EB: libMesh::System::project_vector(libMesh::NumericVector<double> const&, libMesh::NumericVector<double>&, int) const (system_projection.C:342) ==27566== by 0x87BF561: libMesh::System::project_vector(libMesh::NumericVector<double>&, int) const (system_projection.C:257) ==27566== by 0x87991DF: libMesh::System::restrict_vectors() (system.C:334) ==27566== by 0x8753C77: libMesh::EquationSystems::reinit_solutions() (equation_systems.C:210) ==27566== by 0x87540A0: libMesh::EquationSystems::reinit() (equation_systems.C:123) ==27566== by 0x6F38FFD: FEProblemBase::reinitBecauseOfGhostingOrNewGeomObjects() (FEProblemBase.C:3284) ==27566== by 0x6F3A292: FEProblemBase::initialSetup() (FEProblemBase.C:770) ==27566== by 0x71FC4D1: MooseApp::executeExecutioner() (MooseApp.C:849) ==27566== by 0x59DF420: OutputApp::executeExecutioner() (OutputApp.C:144) ==27566== by 0x71FC56C: MooseApp::run() (MooseApp.C:961) ==27566== Block was alloc'd by thread #1 |
From: Povolotskyi, M. <mpo...@pu...> - 2019-05-08 15:02:30
|
On 5/8/2019 9:20 AM, Stogner, Roy H wrote:
> On Mon, 6 May 2019, Povolotskyi, Mykhailo wrote:
>
>> can the PointLocatorTree find an element if the mesh is distributed ?
>>
>> Or it will only find elements that belong to the same MPI rank as the
>> point?
>
> If you're using DistributedMesh, a point locator will only be able to
> find "semilocal" elements: elements that either belong to the MPI rank
> making the call or are ghosted on the MPI rank making the call. The
> locator will return a null pointer on ranks for which those elements
> are remote.
> ---
> Roy

Thank you. Now it is clear.

In order to filter out the "semilocal" elements, I'm checking their processor_id.

Michael.
From: Stogner, R. H <roy...@ic...> - 2019-05-08 13:20:15
|
On Mon, 6 May 2019, Povolotskyi, Mykhailo wrote:

> can the PointLocatorTree find an element if the mesh is distributed ?
>
> Or it will only find elements that belong to the same MPI rank as the
> point?

If you're using DistributedMesh, a point locator will only be able to
find "semilocal" elements: elements that either belong to the MPI rank
making the call or are ghosted on the MPI rank making the call. The
locator will return a null pointer on ranks for which those elements
are remote.
---
Roy
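A small sketch of the pattern described above, under the assumption that the caller only wants elements owned by the current rank; the function name and the ownership filter are illustrative, not a fixed libMesh recipe.

    #include "libmesh/mesh_base.h"
    #include "libmesh/point_locator_base.h"
    #include "libmesh/elem.h"
    #include "libmesh/point.h"
    #include <memory>

    using namespace libMesh;

    // Locate the element containing 'pt', as seen from this rank.  On a
    // DistributedMesh only semilocal (local or ghosted) elements can be
    // found; for remote elements the locator returns nullptr.
    const Elem * find_owned_elem (const MeshBase & mesh, const Point & pt)
    {
      std::unique_ptr<PointLocatorBase> locator = mesh.sub_point_locator();

      // Without this, the locator errors out when the point is not inside
      // any semilocal element; with it, it simply returns nullptr.
      locator->enable_out_of_mesh_mode();

      const Elem * elem = (*locator)(pt);

      // Keep only elements actually owned by this rank; ghosted matches
      // are discarded so each point is claimed by at most one rank.
      if (elem && elem->processor_id() != mesh.processor_id())
        elem = nullptr;

      return elem;   // nullptr when the containing element is remote
    }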
From: Stogner, R. H <roy...@ic...> - 2019-05-08 13:18:48
|
On Mon, 6 May 2019, Povolotskyi, Mykhailo wrote:

> I'm having difficulties reading a mesh in parallel and using it afterwards.
>
> The documentation says:
>
> void libMesh::GmshIO::read_mesh ( std::istream & in )
> {
>   // This is a serial-only process for now;
>   // the Mesh should be read on processor 0 and
>   // broadcast later
> }
>
> How should the mesh be broadcast? After prepare_for_use() or before?

During, automatically. Just prepare_for_use() should handle it; you
shouldn't have to do anything else. If you are, it's a bug, and let us
know details!

> If I want to solve and output the system mesh with VTKIO, do I have to
> serialize the mesh?

I forget whether our VTKIO class handles parallel VTK output properly
automatically or not. I do know that we handle DistributedMesh
serialization automatically for those MeshOutput classes which require
it, but we do spit out a warning which shouldn't be ignored: generally
for any problem big enough to prompt use of a DistributedMesh, being
forced to serialize it is a performance disaster.
---
Roy
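A minimal sketch of the read-then-output flow discussed above, assuming a placeholder Gmsh file name and a libMesh build configured with VTK support; read() on an UnstructuredMesh normally finishes with prepare_for_use(), which is where the broadcast happens.

    #include "libmesh/libmesh.h"
    #include "libmesh/distributed_mesh.h"
    #include "libmesh/vtk_io.h"

    using namespace libMesh;

    int main (int argc, char ** argv)
    {
      LibMeshInit init (argc, argv);

      // A DistributedMesh keeps only local + ghosted elements on each rank.
      DistributedMesh mesh (init.comm());

      // "input.msh" is a placeholder Gmsh file.  For Gmsh the file is
      // parsed on processor 0; prepare_for_use(), run at the end of
      // read(), broadcasts and partitions the mesh.
      mesh.read("input.msh");

      // Write output; for formats that require a serialized mesh,
      // libMesh temporarily serializes (and warns) automatically.
      VTKIO(mesh).write("output.pvtu");   // placeholder output name

      return 0;
    }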
From: Povolotskyi, M. <mpo...@pu...> - 2019-05-06 21:49:18
|
Dear developers,

Can the PointLocatorTree find an element if the mesh is distributed?
Or will it only find elements that belong to the same MPI rank as the point?

Thank you,
Michael.
From: Povolotskyi, M. <mpo...@pu...> - 2019-05-06 18:22:20
|
Dear libMesh developers,

I'm having difficulties reading a mesh in parallel and using it afterwards.

The documentation says:

void libMesh::GmshIO::read_mesh ( std::istream & in )
{
  // This is a serial-only process for now;
  // the Mesh should be read on processor 0 and
  // broadcast later
}

How should the mesh be broadcast? After prepare_for_use() or before?

If I want to solve and output the system mesh with VTKIO, do I have to serialize the mesh?

Thank you,
Michael.
From: Stogner, R. H <roy...@ic...> - 2019-05-06 16:47:20
|
On Sat, 4 May 2019, Manav Bhatia wrote:

> I am working on immersed boundary problems where I cut a FE based on a
> level-set function. On each element obtained by the cut operation, I
> ask the FE to be initialized, either using a quadrature rule, or by
> specifying the QP locations.
>
> While this forward operation of QP -> compute shape functions,
> derivatives and normals (on sides) should work out fine,

I'd be scared of the derivatives. If dxi/dx is going off to infinity
(in some eigenvalues anyway) then our standard attempt to compute
dphi/dx as dphi/dxi*dxi/dx sounds perilous.

> the issue is showing up in compute_face_map()
> ( https://github.com/libMesh/libmesh/blob/c4c9fd5450489486fd5f7baf2ec9a8ac0d47fc99/src/fe/fe_boundary.C#L224 ).
>
> It appears that compute_face_map() is using the xyz locations of the
> quadrature points to figure out the non-dimensional location of the
> quadrature points here:
> https://github.com/libMesh/libmesh/blob/c4c9fd5450489486fd5f7baf2ec9a8ac0d47fc99/src/fe/fe_boundary.C#L754
>
> So, every once in a while the level-set based intersection leads to
> sliver cut-cells, which causes this inverse-map method to fail.
>
> I am guessing there is a reason to do things this way, even though the
> user may have explicitly provided the quadrature points to begin with.
> Do you know if there is a way to bypass the inverse_map for FE reinit?

I'm actually not aware of any reason to do things this way. It seems
like a bad idea in general, not just in your case!

There are cases where we do inverse_map() in AMR problems and for edge
and face reinits, because we're too lazy to do direct translation
function calculations for every single combination of parent<->child
element configuration, and it's easy to just do inverse_map() on one of
the xyz coordinates calculated from the other. And that's embarrassing,
but at least we have an excuse.

But here I don't actually see an excuse. Maybe we wrote the
compute_face_map API first, then we added the ability to get proper
tangents of 2D elements in 3D space later, and someone wanted to do
that without changing the API?

I'd say you should just change the APIs for compute_face_map and
compute_edge_map. I'll bet nobody is directly using them; they're
technically public but they're basically internal.

Hmm.. that fix will work for face reinits, and for edge reinits with
manually supplied quadrature points, but for edge reinits based on a
quadrature rule we're *also* using inverse_map(xyz) for some reason,
rather than using the refspace_nodes calculation we do for face
reinits. That should probably be fixed one of these days.
---
Roy
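For context, a minimal sketch of the forward operation Manav describes, a volume reinit at caller-supplied reference points, with placeholder points and FE type; the side/face variant is the one that additionally goes through compute_face_map() and hence the inverse_map() call discussed above.

    #include "libmesh/libmesh.h"
    #include "libmesh/fe_base.h"
    #include "libmesh/fe_type.h"
    #include "libmesh/elem.h"
    #include "libmesh/point.h"
    #include <memory>
    #include <vector>

    using namespace libMesh;

    // Evaluate a first-order Lagrange FE at caller-chosen reference-space
    // points on a single element, with no quadrature rule involved.
    void reinit_at_points (const Elem & elem)
    {
      FEType fe_type (FIRST, LAGRANGE);
      std::unique_ptr<FEBase> fe = FEBase::build(elem.dim(), fe_type);

      // Requesting phi/dphi before reinit() tells the FE to compute them.
      const std::vector<std::vector<Real>> & phi = fe->get_phi();
      const std::vector<std::vector<RealGradient>> & dphi = fe->get_dphi();

      // Placeholder reference (master-element) points; a cut-cell code
      // would supply the points produced by its level-set intersection.
      std::vector<Point> ref_points = { Point(-0.5, -0.5), Point(0.25, 0.1) };

      // reinit() with an explicit point list bypasses the quadrature rule.
      fe->reinit(&elem, &ref_points);

      // phi[i][p] / dphi[i][p]: shape function i at user-supplied point p.
      libMesh::out << phi.size() << " shape functions at "
                   << phi[0].size() << " points, dphi[0][0] = "
                   << dphi[0][0] << std::endl;
    }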
From: Stogner, R. H <roy...@ic...> - 2019-05-06 16:08:48
|
On Mon, 6 May 2019, Povolotskyi, Mykhailo wrote:

> let me clarify.
>
> Do I have to create a partitioner and attach it to the mesh?

Any Mesh gets a default partitioner depending on what the underlying
type (DistributedMesh or ReplicatedMesh) is and on what code you have
configured on vs off. Most ReplicatedMesh users will end up with METIS
by default.

You shouldn't need to create your own partitioner unless you don't like
the default option.
---
Roy

> Michael.
>
> On 05/06/2019 09:09 AM, Paul T. Bauman wrote:
>
> On Fri, May 3, 2019 at 9:14 PM Povolotskyi, Mykhailo <mpo...@pu...> wrote:
> Hello,
>
> is it possible to read mesh from gmesh and then partition it over MPI
> ranks with libmesh?
>
> As long as you can parse the Mesh, then partitioning it should just work
> as partitioning is entirely separate from mesh parsing. Is this not
> working for you?
>
> If yes, do you have an example?
>
> Thank you,
>
> Michael.
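A short sketch of what explicitly attaching a partitioner could look like, assuming a METIS-enabled build and a placeholder mesh file name; as noted above, this is normally unnecessary because a suitable default partitioner is already in place.

    #include "libmesh/libmesh.h"
    #include "libmesh/replicated_mesh.h"
    #include "libmesh/metis_partitioner.h"

    using namespace libMesh;

    int main (int argc, char ** argv)
    {
      LibMeshInit init (argc, argv);

      ReplicatedMesh mesh (init.comm());
      mesh.read("input.msh");   // placeholder Gmsh file

      // Swap in an explicit partitioner (the default for a ReplicatedMesh
      // is usually METIS already) and repartition across all ranks.
      mesh.partitioner().reset(new MetisPartitioner);
      mesh.partition(mesh.n_processors());

      mesh.print_info();
      return 0;
    }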
From: Povolotskyi, M. <mpo...@pu...> - 2019-05-06 15:48:27
|
Thank you, let me clarify.

Do I have to create a partitioner and attach it to the mesh?

Michael.

On 05/06/2019 09:09 AM, Paul T. Bauman wrote:

On Fri, May 3, 2019 at 9:14 PM Povolotskyi, Mykhailo <mpo...@pu...> wrote:

Hello,

is it possible to read mesh from gmesh and then partition it over MPI ranks with libmesh?

As long as you can parse the Mesh, then partitioning it should just work as partitioning is entirely separate from mesh parsing. Is this not working for you?

If yes, do you have an example?

Thank you,

Michael.
From: Paul T. B. <ptb...@gm...> - 2019-05-06 13:09:30
|
On Fri, May 3, 2019 at 9:14 PM Povolotskyi, Mykhailo <mpo...@pu...> wrote:

> Hello,
>
> is it possible to read mesh from gmesh and then partition it over MPI
> ranks with libmesh?

As long as you can parse the Mesh, then partitioning it should just
work as partitioning is entirely separate from mesh parsing. Is this
not working for you?

> If yes, do you have an example?
>
> Thank you,
>
> Michael.
From: Renato P. <re...@gm...> - 2019-05-05 00:41:26
|
Just to close the conversation: My objective here is to map values from one mesh to another. Does libMesh offer convenient tools for that? I am manually L2 mapping on element basis. Any smarter suggestion? On Sat, May 4, 2019 at 9:36 PM Alexander Lindsay <ale...@gm...> wrote: > Whether you’re reading in from an exodus file doesn’t matter. After > reading, the mesh will delete remote non-ghosted elements if you’re using a > DistributedMesh. However, you’re using the Mesh class which inherits from > ReplicatedMesh unless you passed the —enable-parmesh flag during configure. > > Presuming you did not pass that flag, then the active elements should be > the same across processes to the best of my knowledge. > > On May 4, 2019, at 2:17 PM, Renato Poli <re...@gm...> wrote: > > I'm just reading in from a Exodus file. > That would be a replicated mesh, right? > > ==== CODE > libMesh::Mesh lm_mesh(init.comm()); > lm_mesh.read( exofn ); > > > On Sat, May 4, 2019 at 5:10 PM Alexander Lindsay <ale...@gm...> > wrote: > >> Are you using a replicated or a distributed mesh? A distributed mesh will >> not have the same active elements. >> >> > On May 4, 2019, at 1:52 PM, Renato Poli <re...@gm...> wrote: >> > >> > Hi Roy >> > >> > I found what is breaking the flow. >> > Please consider the code below. >> > >> > I thought the "active_elements" were the same across the processors. It >> > seems that is not the case? >> > Then the point_locator (which is a collective task - right?) breaks sync >> > across processors. >> > >> > My code intends to map values from one mesh to the other. >> > What is the best construction for that? >> > Should I use "elements_begin/end" iterator instead? >> > >> > === CODE >> > MBcit el = mesh.active_elements_begin(); >> > MBcit end_el = mesh.active_elements_end(); >> > for ( ; el != end_el; ++el) { >> > ... >> > UniquePtr<PointLocatorBase> plocator = mesh.sub_point_locator(); >> > elem = (*plocator)( pt ); >> > ... >> > } >> > >> > >> >> On Fri, May 3, 2019 at 9:30 PM Renato Poli <re...@gm...> wrote: >> >> >> >> Thanks. >> >> >> >> Should I call "parallel_object_only()" throughout the code to check >> when >> >> it lost sync? >> >> Any smarter way to do that? >> >> What GDB can do for me? >> >> Parallel debugging is really something new to me... >> >> >> >> On Fri, May 3, 2019 at 7:34 PM Stogner, Roy H < >> roy...@ic...> >> >> wrote: >> >> >> >>> >> >>>> On Fri, 3 May 2019, Renato Poli wrote: >> >>>> >> >>>> I see a number of error messages, as below. >> >>>> I am struggling to understand what they mean and how to move forward. >> >>>> It is related to manually setting a system solution and closing the >> >>>> "solution" vector afterwards. >> >>>> Any idea? >> >>>> >> >>>> Assertion >> >>>> >> >>> >> `(this->comm()).verify(std::string("./include/libmesh/petsc_vector.h").size())' >> >>>> failed. >> >>>> Assertion >> >>>> `(this->comm()).verify(std::string("src/mesh/mesh_base.C").size())' >> >>> failed. >> >>>> [Assertion >> >>>> >> >>> >> `(this->comm()).verify(std::string("./include/libmesh/petsc_vector.h").size())' >> >>>> failed. 
>> >>>> [2] Assertion >> >>>> >> >>> >> `(this->comm()).verify(std::string("./include/libmesh/petsc_vector.h").size())' >> >>>> failed.[0] ./include/libmesh/petsc_vector.h, line 812, compiled Feb >> 22 >> >>> 2019 >> >>>> at 17:56:59 >> >>>> 1] src/mesh/mesh_base.C, line 511, compiled Feb 22 2019 at 17:55:09 >> >>>> ./include/libmesh/petsc_vector.h, line 812, compiled Feb 22 2019 at >> >>> 17:56:59 >> >>>> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0 >> >>>> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 1 >> >>>> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 2 >> >>>> >> >>>> Thanks, >> >>>> Renato >> >>> >> >>> You're running in parallel, but your different processors have gotten >> >>> out of sync. At least 1 is at mesh_base.C line 511, and at least 2 or >> >>> 3 are at petsc_vector.h 812. Are you not calling PetscVector::close() >> >>> on every processor, perhaps? Then the missing processor would >> >>> continue to whatever the next parallel-only operation is and the >> >>> dbg/devel mode check for synchronization would fail. >> >>> --- >> >>> Roy >> >>> >> >> >> > >> > _______________________________________________ >> > Libmesh-users mailing list >> > Lib...@li... >> > https://lists.sourceforge.net/lists/listinfo/libmesh-users >> > |
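One convenient tool for the mesh-to-mesh mapping question above is MeshFunction, which evaluates a serialized solution vector at arbitrary points, for example the quadrature points of the target mesh. A minimal sketch, assuming a source system named "sys" with a variable "u" that has already been solved; the names are placeholders.

    #include "libmesh/mesh_function.h"
    #include "libmesh/equation_systems.h"
    #include "libmesh/system.h"
    #include "libmesh/numeric_vector.h"
    #include "libmesh/point.h"
    #include <memory>
    #include <vector>

    using namespace libMesh;

    // Build a MeshFunction for variable "u" of system "sys" on the source
    // EquationSystems and evaluate it at a point of the target mesh.
    Number sample_source_solution (EquationSystems & source_es, const Point & p)
    {
      System & sys = source_es.get_system("sys");

      // MeshFunction needs a serialized copy of the solution vector.
      std::unique_ptr<NumericVector<Number>> serialized_solution =
        NumericVector<Number>::build(source_es.comm());
      serialized_solution->init(sys.n_dofs(), false, SERIAL);
      sys.solution->localize(*serialized_solution);

      std::vector<unsigned int> vars (1, sys.variable_number("u"));

      MeshFunction mesh_fn (source_es, *serialized_solution,
                            sys.get_dof_map(), vars);
      mesh_fn.init();

      // Return 0 instead of erroring if p lies outside the source mesh.
      mesh_fn.enable_out_of_mesh_mode(0.);

      return mesh_fn(p);
    }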
From: Alexander L. <ale...@gm...> - 2019-05-05 00:37:01
|
Whether you’re reading in from an exodus file doesn’t matter. After reading, the mesh will delete remote non-ghosted elements if you’re using a DistributedMesh. However, you’re using the Mesh class which inherits from ReplicatedMesh unless you passed the —enable-parmesh flag during configure. Presuming you did not pass that flag, then the active elements should be the same across processes to the best of my knowledge. > On May 4, 2019, at 2:17 PM, Renato Poli <re...@gm...> wrote: > > I'm just reading in from a Exodus file. > That would be a replicated mesh, right? > > ==== CODE > libMesh::Mesh lm_mesh(init.comm()); > lm_mesh.read( exofn ); > > >> On Sat, May 4, 2019 at 5:10 PM Alexander Lindsay <ale...@gm...> wrote: >> Are you using a replicated or a distributed mesh? A distributed mesh will not have the same active elements. >> >> > On May 4, 2019, at 1:52 PM, Renato Poli <re...@gm...> wrote: >> > >> > Hi Roy >> > >> > I found what is breaking the flow. >> > Please consider the code below. >> > >> > I thought the "active_elements" were the same across the processors. It >> > seems that is not the case? >> > Then the point_locator (which is a collective task - right?) breaks sync >> > across processors. >> > >> > My code intends to map values from one mesh to the other. >> > What is the best construction for that? >> > Should I use "elements_begin/end" iterator instead? >> > >> > === CODE >> > MBcit el = mesh.active_elements_begin(); >> > MBcit end_el = mesh.active_elements_end(); >> > for ( ; el != end_el; ++el) { >> > ... >> > UniquePtr<PointLocatorBase> plocator = mesh.sub_point_locator(); >> > elem = (*plocator)( pt ); >> > ... >> > } >> > >> > >> >> On Fri, May 3, 2019 at 9:30 PM Renato Poli <re...@gm...> wrote: >> >> >> >> Thanks. >> >> >> >> Should I call "parallel_object_only()" throughout the code to check when >> >> it lost sync? >> >> Any smarter way to do that? >> >> What GDB can do for me? >> >> Parallel debugging is really something new to me... >> >> >> >> On Fri, May 3, 2019 at 7:34 PM Stogner, Roy H <roy...@ic...> >> >> wrote: >> >> >> >>> >> >>>> On Fri, 3 May 2019, Renato Poli wrote: >> >>>> >> >>>> I see a number of error messages, as below. >> >>>> I am struggling to understand what they mean and how to move forward. >> >>>> It is related to manually setting a system solution and closing the >> >>>> "solution" vector afterwards. >> >>>> Any idea? >> >>>> >> >>>> Assertion >> >>>> >> >>> `(this->comm()).verify(std::string("./include/libmesh/petsc_vector.h").size())' >> >>>> failed. >> >>>> Assertion >> >>>> `(this->comm()).verify(std::string("src/mesh/mesh_base.C").size())' >> >>> failed. >> >>>> [Assertion >> >>>> >> >>> `(this->comm()).verify(std::string("./include/libmesh/petsc_vector.h").size())' >> >>>> failed. >> >>>> [2] Assertion >> >>>> >> >>> `(this->comm()).verify(std::string("./include/libmesh/petsc_vector.h").size())' >> >>>> failed.[0] ./include/libmesh/petsc_vector.h, line 812, compiled Feb 22 >> >>> 2019 >> >>>> at 17:56:59 >> >>>> 1] src/mesh/mesh_base.C, line 511, compiled Feb 22 2019 at 17:55:09 >> >>>> ./include/libmesh/petsc_vector.h, line 812, compiled Feb 22 2019 at >> >>> 17:56:59 >> >>>> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0 >> >>>> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 1 >> >>>> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 2 >> >>>> >> >>>> Thanks, >> >>>> Renato >> >>> >> >>> You're running in parallel, but your different processors have gotten >> >>> out of sync. 
At least 1 is at mesh_base.C line 511, and at least 2 or >> >>> 3 are at petsc_vector.h 812. Are you not calling PetscVector::close() >> >>> on every processor, perhaps? Then the missing processor would >> >>> continue to whatever the next parallel-only operation is and the >> >>> dbg/devel mode check for synchronization would fail. >> >>> --- >> >>> Roy >> >>> >> >> >> > >> > _______________________________________________ >> > Libmesh-users mailing list >> > Lib...@li... >> > https://lists.sourceforge.net/lists/listinfo/libmesh-users |
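If the mesh behavior should not depend on configure flags, the mesh type can be requested explicitly instead of relying on the Mesh typedef; a small sketch with a placeholder file name.

    #include "libmesh/libmesh.h"
    #include "libmesh/replicated_mesh.h"
    #include "libmesh/distributed_mesh.h"

    using namespace libMesh;

    int main (int argc, char ** argv)
    {
      LibMeshInit init (argc, argv);

      // Every rank holds the whole mesh; active elements are identical
      // across processes.
      ReplicatedMesh rmesh (init.comm());

      // Each rank holds only its local + ghosted elements; remote
      // elements are deleted after reading.
      DistributedMesh dmesh (init.comm());

      rmesh.read("input.exo");   // placeholder Exodus file
      dmesh.read("input.exo");

      rmesh.print_info();
      dmesh.print_info();

      return 0;
    }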
From: Renato P. <re...@gm...> - 2019-05-05 00:36:46
|
I got it. It was a flag initialization issue. Nothing related to libmesh. Sorry to bother. Renato On Sat, May 4, 2019 at 5:17 PM Renato Poli <re...@gm...> wrote: > I'm just reading in from a Exodus file. > That would be a replicated mesh, right? > > ==== CODE > libMesh::Mesh lm_mesh(init.comm()); > lm_mesh.read( exofn ); > > > On Sat, May 4, 2019 at 5:10 PM Alexander Lindsay <ale...@gm...> > wrote: > >> Are you using a replicated or a distributed mesh? A distributed mesh will >> not have the same active elements. >> >> > On May 4, 2019, at 1:52 PM, Renato Poli <re...@gm...> wrote: >> > >> > Hi Roy >> > >> > I found what is breaking the flow. >> > Please consider the code below. >> > >> > I thought the "active_elements" were the same across the processors. It >> > seems that is not the case? >> > Then the point_locator (which is a collective task - right?) breaks sync >> > across processors. >> > >> > My code intends to map values from one mesh to the other. >> > What is the best construction for that? >> > Should I use "elements_begin/end" iterator instead? >> > >> > === CODE >> > MBcit el = mesh.active_elements_begin(); >> > MBcit end_el = mesh.active_elements_end(); >> > for ( ; el != end_el; ++el) { >> > ... >> > UniquePtr<PointLocatorBase> plocator = mesh.sub_point_locator(); >> > elem = (*plocator)( pt ); >> > ... >> > } >> > >> > >> >> On Fri, May 3, 2019 at 9:30 PM Renato Poli <re...@gm...> wrote: >> >> >> >> Thanks. >> >> >> >> Should I call "parallel_object_only()" throughout the code to check >> when >> >> it lost sync? >> >> Any smarter way to do that? >> >> What GDB can do for me? >> >> Parallel debugging is really something new to me... >> >> >> >> On Fri, May 3, 2019 at 7:34 PM Stogner, Roy H < >> roy...@ic...> >> >> wrote: >> >> >> >>> >> >>>> On Fri, 3 May 2019, Renato Poli wrote: >> >>>> >> >>>> I see a number of error messages, as below. >> >>>> I am struggling to understand what they mean and how to move forward. >> >>>> It is related to manually setting a system solution and closing the >> >>>> "solution" vector afterwards. >> >>>> Any idea? >> >>>> >> >>>> Assertion >> >>>> >> >>> >> `(this->comm()).verify(std::string("./include/libmesh/petsc_vector.h").size())' >> >>>> failed. >> >>>> Assertion >> >>>> `(this->comm()).verify(std::string("src/mesh/mesh_base.C").size())' >> >>> failed. >> >>>> [Assertion >> >>>> >> >>> >> `(this->comm()).verify(std::string("./include/libmesh/petsc_vector.h").size())' >> >>>> failed. >> >>>> [2] Assertion >> >>>> >> >>> >> `(this->comm()).verify(std::string("./include/libmesh/petsc_vector.h").size())' >> >>>> failed.[0] ./include/libmesh/petsc_vector.h, line 812, compiled Feb >> 22 >> >>> 2019 >> >>>> at 17:56:59 >> >>>> 1] src/mesh/mesh_base.C, line 511, compiled Feb 22 2019 at 17:55:09 >> >>>> ./include/libmesh/petsc_vector.h, line 812, compiled Feb 22 2019 at >> >>> 17:56:59 >> >>>> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0 >> >>>> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 1 >> >>>> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 2 >> >>>> >> >>>> Thanks, >> >>>> Renato >> >>> >> >>> You're running in parallel, but your different processors have gotten >> >>> out of sync. At least 1 is at mesh_base.C line 511, and at least 2 or >> >>> 3 are at petsc_vector.h 812. Are you not calling PetscVector::close() >> >>> on every processor, perhaps? 
Then the missing processor would >> >>> continue to whatever the next parallel-only operation is and the >> >>> dbg/devel mode check for synchronization would fail. >> >>> --- >> >>> Roy >> >>> >> >> >> > >> > _______________________________________________ >> > Libmesh-users mailing list >> > Lib...@li... >> > https://lists.sourceforge.net/lists/listinfo/libmesh-users >> > |
From: Renato P. <re...@gm...> - 2019-05-04 20:18:07
|
I'm just reading in from a Exodus file. That would be a replicated mesh, right? ==== CODE libMesh::Mesh lm_mesh(init.comm()); lm_mesh.read( exofn ); On Sat, May 4, 2019 at 5:10 PM Alexander Lindsay <ale...@gm...> wrote: > Are you using a replicated or a distributed mesh? A distributed mesh will > not have the same active elements. > > > On May 4, 2019, at 1:52 PM, Renato Poli <re...@gm...> wrote: > > > > Hi Roy > > > > I found what is breaking the flow. > > Please consider the code below. > > > > I thought the "active_elements" were the same across the processors. It > > seems that is not the case? > > Then the point_locator (which is a collective task - right?) breaks sync > > across processors. > > > > My code intends to map values from one mesh to the other. > > What is the best construction for that? > > Should I use "elements_begin/end" iterator instead? > > > > === CODE > > MBcit el = mesh.active_elements_begin(); > > MBcit end_el = mesh.active_elements_end(); > > for ( ; el != end_el; ++el) { > > ... > > UniquePtr<PointLocatorBase> plocator = mesh.sub_point_locator(); > > elem = (*plocator)( pt ); > > ... > > } > > > > > >> On Fri, May 3, 2019 at 9:30 PM Renato Poli <re...@gm...> wrote: > >> > >> Thanks. > >> > >> Should I call "parallel_object_only()" throughout the code to check when > >> it lost sync? > >> Any smarter way to do that? > >> What GDB can do for me? > >> Parallel debugging is really something new to me... > >> > >> On Fri, May 3, 2019 at 7:34 PM Stogner, Roy H <roy...@ic... > > > >> wrote: > >> > >>> > >>>> On Fri, 3 May 2019, Renato Poli wrote: > >>>> > >>>> I see a number of error messages, as below. > >>>> I am struggling to understand what they mean and how to move forward. > >>>> It is related to manually setting a system solution and closing the > >>>> "solution" vector afterwards. > >>>> Any idea? > >>>> > >>>> Assertion > >>>> > >>> > `(this->comm()).verify(std::string("./include/libmesh/petsc_vector.h").size())' > >>>> failed. > >>>> Assertion > >>>> `(this->comm()).verify(std::string("src/mesh/mesh_base.C").size())' > >>> failed. > >>>> [Assertion > >>>> > >>> > `(this->comm()).verify(std::string("./include/libmesh/petsc_vector.h").size())' > >>>> failed. > >>>> [2] Assertion > >>>> > >>> > `(this->comm()).verify(std::string("./include/libmesh/petsc_vector.h").size())' > >>>> failed.[0] ./include/libmesh/petsc_vector.h, line 812, compiled Feb 22 > >>> 2019 > >>>> at 17:56:59 > >>>> 1] src/mesh/mesh_base.C, line 511, compiled Feb 22 2019 at 17:55:09 > >>>> ./include/libmesh/petsc_vector.h, line 812, compiled Feb 22 2019 at > >>> 17:56:59 > >>>> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0 > >>>> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 1 > >>>> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 2 > >>>> > >>>> Thanks, > >>>> Renato > >>> > >>> You're running in parallel, but your different processors have gotten > >>> out of sync. At least 1 is at mesh_base.C line 511, and at least 2 or > >>> 3 are at petsc_vector.h 812. Are you not calling PetscVector::close() > >>> on every processor, perhaps? Then the missing processor would > >>> continue to whatever the next parallel-only operation is and the > >>> dbg/devel mode check for synchronization would fail. > >>> --- > >>> Roy > >>> > >> > > > > _______________________________________________ > > Libmesh-users mailing list > > Lib...@li... > > https://lists.sourceforge.net/lists/listinfo/libmesh-users > |
From: Alexander L. <ale...@gm...> - 2019-05-04 20:10:49
|
Are you using a replicated or a distributed mesh? A distributed mesh will not have the same active elements. > On May 4, 2019, at 1:52 PM, Renato Poli <re...@gm...> wrote: > > Hi Roy > > I found what is breaking the flow. > Please consider the code below. > > I thought the "active_elements" were the same across the processors. It > seems that is not the case? > Then the point_locator (which is a collective task - right?) breaks sync > across processors. > > My code intends to map values from one mesh to the other. > What is the best construction for that? > Should I use "elements_begin/end" iterator instead? > > === CODE > MBcit el = mesh.active_elements_begin(); > MBcit end_el = mesh.active_elements_end(); > for ( ; el != end_el; ++el) { > ... > UniquePtr<PointLocatorBase> plocator = mesh.sub_point_locator(); > elem = (*plocator)( pt ); > ... > } > > >> On Fri, May 3, 2019 at 9:30 PM Renato Poli <re...@gm...> wrote: >> >> Thanks. >> >> Should I call "parallel_object_only()" throughout the code to check when >> it lost sync? >> Any smarter way to do that? >> What GDB can do for me? >> Parallel debugging is really something new to me... >> >> On Fri, May 3, 2019 at 7:34 PM Stogner, Roy H <roy...@ic...> >> wrote: >> >>> >>>> On Fri, 3 May 2019, Renato Poli wrote: >>>> >>>> I see a number of error messages, as below. >>>> I am struggling to understand what they mean and how to move forward. >>>> It is related to manually setting a system solution and closing the >>>> "solution" vector afterwards. >>>> Any idea? >>>> >>>> Assertion >>>> >>> `(this->comm()).verify(std::string("./include/libmesh/petsc_vector.h").size())' >>>> failed. >>>> Assertion >>>> `(this->comm()).verify(std::string("src/mesh/mesh_base.C").size())' >>> failed. >>>> [Assertion >>>> >>> `(this->comm()).verify(std::string("./include/libmesh/petsc_vector.h").size())' >>>> failed. >>>> [2] Assertion >>>> >>> `(this->comm()).verify(std::string("./include/libmesh/petsc_vector.h").size())' >>>> failed.[0] ./include/libmesh/petsc_vector.h, line 812, compiled Feb 22 >>> 2019 >>>> at 17:56:59 >>>> 1] src/mesh/mesh_base.C, line 511, compiled Feb 22 2019 at 17:55:09 >>>> ./include/libmesh/petsc_vector.h, line 812, compiled Feb 22 2019 at >>> 17:56:59 >>>> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0 >>>> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 1 >>>> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 2 >>>> >>>> Thanks, >>>> Renato >>> >>> You're running in parallel, but your different processors have gotten >>> out of sync. At least 1 is at mesh_base.C line 511, and at least 2 or >>> 3 are at petsc_vector.h 812. Are you not calling PetscVector::close() >>> on every processor, perhaps? Then the missing processor would >>> continue to whatever the next parallel-only operation is and the >>> dbg/devel mode check for synchronization would fail. >>> --- >>> Roy >>> >> > > _______________________________________________ > Libmesh-users mailing list > Lib...@li... > https://lists.sourceforge.net/lists/listinfo/libmesh-users |
From: Renato P. <re...@gm...> - 2019-05-04 19:53:00
|
Hi Roy I found what is breaking the flow. Please consider the code below. I thought the "active_elements" were the same across the processors. It seems that is not the case? Then the point_locator (which is a collective task - right?) breaks sync across processors. My code intends to map values from one mesh to the other. What is the best construction for that? Should I use "elements_begin/end" iterator instead? === CODE MBcit el = mesh.active_elements_begin(); MBcit end_el = mesh.active_elements_end(); for ( ; el != end_el; ++el) { ... UniquePtr<PointLocatorBase> plocator = mesh.sub_point_locator(); elem = (*plocator)( pt ); ... } On Fri, May 3, 2019 at 9:30 PM Renato Poli <re...@gm...> wrote: > Thanks. > > Should I call "parallel_object_only()" throughout the code to check when > it lost sync? > Any smarter way to do that? > What GDB can do for me? > Parallel debugging is really something new to me... > > On Fri, May 3, 2019 at 7:34 PM Stogner, Roy H <roy...@ic...> > wrote: > >> >> On Fri, 3 May 2019, Renato Poli wrote: >> >> > I see a number of error messages, as below. >> > I am struggling to understand what they mean and how to move forward. >> > It is related to manually setting a system solution and closing the >> > "solution" vector afterwards. >> > Any idea? >> > >> > Assertion >> > >> `(this->comm()).verify(std::string("./include/libmesh/petsc_vector.h").size())' >> > failed. >> > Assertion >> > `(this->comm()).verify(std::string("src/mesh/mesh_base.C").size())' >> failed. >> > [Assertion >> > >> `(this->comm()).verify(std::string("./include/libmesh/petsc_vector.h").size())' >> > failed. >> > [2] Assertion >> > >> `(this->comm()).verify(std::string("./include/libmesh/petsc_vector.h").size())' >> > failed.[0] ./include/libmesh/petsc_vector.h, line 812, compiled Feb 22 >> 2019 >> > at 17:56:59 >> > 1] src/mesh/mesh_base.C, line 511, compiled Feb 22 2019 at 17:55:09 >> > ./include/libmesh/petsc_vector.h, line 812, compiled Feb 22 2019 at >> 17:56:59 >> > application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0 >> > application called MPI_Abort(MPI_COMM_WORLD, 1) - process 1 >> > application called MPI_Abort(MPI_COMM_WORLD, 1) - process 2 >> > >> > Thanks, >> > Renato >> >> You're running in parallel, but your different processors have gotten >> out of sync. At least 1 is at mesh_base.C line 511, and at least 2 or >> 3 are at petsc_vector.h 812. Are you not calling PetscVector::close() >> on every processor, perhaps? Then the missing processor would >> continue to whatever the next parallel-only operation is and the >> dbg/devel mode check for synchronization would fail. >> --- >> Roy >> > |
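Two likely improvements to the loop above, sketched below under the assumption that one point per element is being queried: build the point locator once outside the loop, and iterate over active *local* elements so that no rank-dependent work hides a collective call inside the loop body. The query point is a placeholder.

    #include "libmesh/mesh_base.h"
    #include "libmesh/point_locator_base.h"
    #include "libmesh/elem.h"
    #include "libmesh/point.h"
    #include <memory>

    using namespace libMesh;

    void map_points (const MeshBase & mesh)
    {
      // Build the locator once, outside the loop.
      std::unique_ptr<PointLocatorBase> locator = mesh.sub_point_locator();
      locator->enable_out_of_mesh_mode();   // return nullptr instead of erroring

      // active_local_* iteration visits each element on exactly one rank.
      for (const auto & elem : mesh.active_local_element_ptr_range())
        {
          const Point pt = elem->centroid();   // placeholder query point

          const Elem * found = (*locator)(pt);
          if (!found)
            continue;   // point not inside any semilocal element on this rank

          // ... use 'found' ...
        }
    }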
From: Manav B. <bha...@gm...> - 2019-05-04 03:40:12
|
Roy, This turned out to be unrelated to adaptivity. I am working on immersed boundary problems where I cut a FE based on a level-set function. On each element obtained by the cut operation, I ask the FE to be initialized, either using a quadrature rule, or by specific in the QP locations. While this forward operation of QP -> compute shape functions, derivatives and normals (on sides) should work out fine, the issue is showing up in compute_face_map() ( https://github.com/libMesh/libmesh/blob/c4c9fd5450489486fd5f7baf2ec9a8ac0d47fc99/src/fe/fe_boundary.C#L224 <https://github.com/libMesh/libmesh/blob/c4c9fd5450489486fd5f7baf2ec9a8ac0d47fc99/src/fe/fe_boundary.C#L224> ). It appears that compute_face_map() is using the xyz locations of the quadrature points to figure out non-dimensional location of the quadrature points here: https://github.com/libMesh/libmesh/blob/c4c9fd5450489486fd5f7baf2ec9a8ac0d47fc99/src/fe/fe_boundary.C#L754 <https://github.com/libMesh/libmesh/blob/c4c9fd5450489486fd5f7baf2ec9a8ac0d47fc99/src/fe/fe_boundary.C#L754> So, every once in a while the level-set based intersection leads to sliver cut-cells, which case this inverse-map method to fail. I am guessing there is a reason to do things this way, even though the user may have explicitly provided the quadrature points to begin with. Do you know if there is a way to bypass the inverse_map for FE reinit? Thanks, Manav > On May 2, 2019, at 8:10 AM, Stogner, Roy H <roy...@ic...> wrote: > > > On Thu, 2 May 2019, Manav Bhatia wrote: > >> I ran the debug version and this is where it crashed. > > Well, I'm afraid that's no more help. Thanks anyway. > > Could you loop through elem 16514's parent (672) and parent's parent > and so on up to the top level and print them? I'm not sure where to > begin looking for the problem, unless you can set up a test case we > can reproduce. 
> --- > Roy > >> [2] src/fe/fe_map.C, line 1866, compiled Apr 29 2019 at 13:59:12 >> WARNING: Newton scheme has not converged in 11 iterations: >> physical_point=(x,y,z)=( 0.27, 0.2775, 0) physical_guess=(x,y,z)=( 0.27, 0.2775, 0) dp=(x,y,z)=(-1.21944e-13, >> -1.12163e-06, 0) p=(x,y,z)=( -1, -0.57735, 0) error=1.12163e-06 in element 16514 >> Elem Information >> id()=16514, processor_id()=2 >> type()=QUAD4 >> dim()=2 >> n_nodes()=4 >> 0 Node id()=13234, processor_id()=4294967295, Point=(x,y,z)=( 0.27, 0.2775, 0) >> DoFs= >> 1 Node id()=604, processor_id()=2, Point=(x,y,z)=(0.271875, 0.2775, 0) >> DoFs=(0/0/13470) (0/1/13471) (0/2/13472) (0/3/13473) (0/4/13474) (0/5/13475) (2/0/2245) (3/0/2245) >> 2 Node id()=13235, processor_id()=4294967295, Point=(x,y,z)=(0.271875, 0.2775, 0) >> DoFs= >> 3 Node id()=13233, processor_id()=4294967295, Point=(x,y,z)=( 0.27, 0.2775, 0) >> DoFs= >> n_sides()=4 >> neighbor(0)=nullptr >> neighbor(1)=nullptr >> neighbor(2)=nullptr >> neighbor(3)=nullptr >> hmin()=9.89829e-11, hmax()=0.001875 >> volume()=1.85593e-13 >> active()=1, ancestor()=0, subactive()=0, has_children()=0 >> parent()=672 >> level()=4, p_level()=0 >> refinement_flag()=DO_NOTHING >> p_refinement_flag()=DO_NOTHING >> DoFs= >> [2] src/fe/fe_map.C, line 1866, compiled Apr 29 2019 at 13:59:12 >> WARNING: Newton scheme has not converged in 12 iterations: >> physical_point=(x,y,z)=( 0.27, 0.2775, 0) physical_guess=(x,y,z)=( 0.27, 0.2775, 0) dp=(x,y,z)=(-1.21944e-13, >> -1.12163e-06, 0) p=(x,y,z)=( -1, -0.577351, 0) error=1.12163e-06 in element 16514 >> Elem Information >> id()=16514, processor_id()=2 >> type()=QUAD4 >> dim()=2 >> n_nodes()=4 >> 0 Node id()=13234, processor_id()=4294967295, Point=(x,y,z)=( 0.27, 0.2775, 0) >> DoFs= >> 1 Node id()=604, processor_id()=2, Point=(x,y,z)=(0.271875, 0.2775, 0) >> DoFs=(0/0/13470) (0/1/13471) (0/2/13472) (0/3/13473) (0/4/13474) (0/5/13475) (2/0/2245) (3/0/2245) >> 2 Node id()=13235, processor_id()=4294967295, Point=(x,y,z)=(0.271875, 0.2775, 0) >> DoFs= >> 3 Node id()=13233, processor_id()=4294967295, Point=(x,y,z)=( 0.27, 0.2775, 0) >> DoFs= >> n_sides()=4 >> neighbor(0)=nullptr >> neighbor(1)=nullptr >> neighbor(2)=nullptr >> neighbor(3)=nullptr >> hmin()=9.89829e-11, hmax()=0.001875 >> volume()=1.85593e-13 >> active()=1, ancestor()=0, subactive()=0, has_children()=0 >> parent()=672 >> level()=4, p_level()=0 >> refinement_flag()=DO_NOTHING >> p_refinement_flag()=DO_NOTHING >> DoFs= >> [2] src/fe/fe_map.C, line 1866, compiled Apr 29 2019 at 13:59:12 >> WARNING: Newton scheme has not converged in 13 iterations: >> physical_point=(x,y,z)=( 0.27, 0.2775, 0) physical_guess=(x,y,z)=( 0.27, 0.2775, 0) dp=(x,y,z)=(2.43888e-13, >> 2.24326e-06, 0) p=(x,y,z)=( -1, -0.577349, 0) error=2.24326e-06 in element 16514 >> Elem Information >> id()=16514, processor_id()=2 >> type()=QUAD4 >> dim()=2 >> n_nodes()=4 >> 0 Node id()=13234, processor_id()=4294967295, Point=(x,y,z)=( 0.27, 0.2775, 0) >> DoFs= >> 1 Node id()=604, processor_id()=2, Point=(x,y,z)=(0.271875, 0.2775, 0) >> DoFs=(0/0/13470) (0/1/13471) (0/2/13472) (0/3/13473) (0/4/13474) (0/5/13475) (2/0/2245) (3/0/2245) >> 2 Node id()=13235, processor_id()=4294967295, Point=(x,y,z)=(0.271875, 0.2775, 0) >> DoFs= >> 3 Node id()=13233, processor_id()=4294967295, Point=(x,y,z)=( 0.27, 0.2775, 0) >> DoFs= >> n_sides()=4 >> neighbor(0)=nullptr >> neighbor(1)=nullptr >> neighbor(2)=nullptr >> neighbor(3)=nullptr >> hmin()=9.89829e-11, hmax()=0.001875 >> volume()=1.85593e-13 >> active()=1, ancestor()=0, subactive()=0, 
has_children()=0 >> parent()=672 >> level()=4, p_level()=0 >> refinement_flag()=DO_NOTHING >> p_refinement_flag()=DO_NOTHING >> DoFs= >> [2] src/fe/fe_map.C, line 1866, compiled Apr 29 2019 at 13:59:12 >> WARNING: Newton scheme has not converged in 14 iterations: >> physical_point=(x,y,z)=( 0.27, 0.2775, 0) physical_guess=(x,y,z)=( 0.27, 0.2775, 0) dp=(x,y,z)=(-1.21944e-13, >> -1.12163e-06, 0) p=(x,y,z)=( -1, -0.57735, 0) error=1.12163e-06 in element 16514 >> Elem Information >> id()=16514, processor_id()=2 >> type()=QUAD4 >> dim()=2 >> n_nodes()=4 >> 0 Node id()=13234, processor_id()=4294967295, Point=(x,y,z)=( 0.27, 0.2775, 0) >> DoFs= >> 1 Node id()=604, processor_id()=2, Point=(x,y,z)=(0.271875, 0.2775, 0) >> DoFs=(0/0/13470) (0/1/13471) (0/2/13472) (0/3/13473) (0/4/13474) (0/5/13475) (2/0/2245) (3/0/2245) >> 2 Node id()=13235, processor_id()=4294967295, Point=(x,y,z)=(0.271875, 0.2775, 0) >> DoFs= >> 3 Node id()=13233, processor_id()=4294967295, Point=(x,y,z)=( 0.27, 0.2775, 0) >> DoFs= >> n_sides()=4 >> neighbor(0)=nullptr >> neighbor(1)=nullptr >> neighbor(2)=nullptr >> neighbor(3)=nullptr >> hmin()=9.89829e-11, hmax()=0.001875 >> volume()=1.85593e-13 >> active()=1, ancestor()=0, subactive()=0, has_children()=0 >> parent()=672 >> level()=4, p_level()=0 >> refinement_flag()=DO_NOTHING >> p_refinement_flag()=DO_NOTHING >> DoFs= >> [2] src/fe/fe_map.C, line 1866, compiled Apr 29 2019 at 13:59:12 >> WARNING: Newton scheme has not converged in 15 iterations: >> physical_point=(x,y,z)=( 0.27, 0.2775, 0) physical_guess=(x,y,z)=( 0.27, 0.2775, 0) dp=(x,y,z)=(-1.21944e-13, >> -1.12163e-06, 0) p=(x,y,z)=( -1, -0.577351, 0) error=1.12163e-06 in element 16514 >> Elem Information >> id()=16514, processor_id()=2 >> type()=QUAD4 >> dim()=2 >> n_nodes()=4 >> 0 Node id()=13234, processor_id()=4294967295, Point=(x,y,z)=( 0.27, 0.2775, 0) >> DoFs= >> 1 Node id()=604, processor_id()=2, Point=(x,y,z)=(0.271875, 0.2775, 0) >> DoFs=(0/0/13470) (0/1/13471) (0/2/13472) (0/3/13473) (0/4/13474) (0/5/13475) (2/0/2245) (3/0/2245) >> 2 Node id()=13235, processor_id()=4294967295, Point=(x,y,z)=(0.271875, 0.2775, 0) >> DoFs= >> 3 Node id()=13233, processor_id()=4294967295, Point=(x,y,z)=( 0.27, 0.2775, 0) >> DoFs= >> n_sides()=4 >> neighbor(0)=nullptr >> neighbor(1)=nullptr >> neighbor(2)=nullptr >> neighbor(3)=nullptr >> hmin()=9.89829e-11, hmax()=0.001875 >> volume()=1.85593e-13 >> active()=1, ancestor()=0, subactive()=0, has_children()=0 >> parent()=672 >> level()=4, p_level()=0 >> refinement_flag()=DO_NOTHING >> p_refinement_flag()=DO_NOTHING >> DoFs= >> [2] src/fe/fe_map.C, line 1866, compiled Apr 29 2019 at 13:59:12 >> WARNING: Newton scheme has not converged in 16 iterations: >> physical_point=(x,y,z)=( 0.27, 0.2775, 0) physical_guess=(x,y,z)=( 0.27, 0.2775, 0) dp=(x,y,z)=(2.43888e-13, >> 2.24326e-06, 0) p=(x,y,z)=( -1, -0.577349, 0) error=2.24326e-06 in element 16514 >> Elem Information >> id()=16514, processor_id()=2 >> type()=QUAD4 >> dim()=2 >> n_nodes()=4 >> 0 Node id()=13234, processor_id()=4294967295, Point=(x,y,z)=( 0.27, 0.2775, 0) >> DoFs= >> 1 Node id()=604, processor_id()=2, Point=(x,y,z)=(0.271875, 0.2775, 0) >> DoFs=(0/0/13470) (0/1/13471) (0/2/13472) (0/3/13473) (0/4/13474) (0/5/13475) (2/0/2245) (3/0/2245) >> 2 Node id()=13235, processor_id()=4294967295, Point=(x,y,z)=(0.271875, 0.2775, 0) >> DoFs= >> 3 Node id()=13233, processor_id()=4294967295, Point=(x,y,z)=( 0.27, 0.2775, 0) >> DoFs= >> n_sides()=4 >> neighbor(0)=nullptr >> neighbor(1)=nullptr >> neighbor(2)=nullptr >> 
neighbor(3)=nullptr >> hmin()=9.89829e-11, hmax()=0.001875 >> volume()=1.85593e-13 >> active()=1, ancestor()=0, subactive()=0, has_children()=0 >> parent()=672 >> level()=4, p_level()=0 >> refinement_flag()=DO_NOTHING >> p_refinement_flag()=DO_NOTHING >> DoFs= >> [2] src/fe/fe_map.C, line 1866, compiled Apr 29 2019 at 13:59:12 >> WARNING: Newton scheme has not converged in 17 iterations: >> physical_point=(x,y,z)=( 0.27, 0.2775, 0) physical_guess=(x,y,z)=( 0.27, 0.2775, 0) dp=(x,y,z)=(-1.21944e-13, >> -1.12163e-06, 0) p=(x,y,z)=( -1, -0.57735, 0) error=1.12163e-06 in element 16514 >> Elem Information >> id()=16514, processor_id()=2 >> type()=QUAD4 >> dim()=2 >> n_nodes()=4 >> 0 Node id()=13234, processor_id()=4294967295, Point=(x,y,z)=( 0.27, 0.2775, 0) >> DoFs= >> 1 Node id()=604, processor_id()=2, Point=(x,y,z)=(0.271875, 0.2775, 0) >> DoFs=(0/0/13470) (0/1/13471) (0/2/13472) (0/3/13473) (0/4/13474) (0/5/13475) (2/0/2245) (3/0/2245) >> 2 Node id()=13235, processor_id()=4294967295, Point=(x,y,z)=(0.271875, 0.2775, 0) >> DoFs= >> 3 Node id()=13233, processor_id()=4294967295, Point=(x,y,z)=( 0.27, 0.2775, 0) >> DoFs= >> n_sides()=4 >> neighbor(0)=nullptr >> neighbor(1)=nullptr >> neighbor(2)=nullptr >> neighbor(3)=nullptr >> hmin()=9.89829e-11, hmax()=0.001875 >> volume()=1.85593e-13 >> active()=1, ancestor()=0, subactive()=0, has_children()=0 >> parent()=672 >> level()=4, p_level()=0 >> refinement_flag()=DO_NOTHING >> p_refinement_flag()=DO_NOTHING >> DoFs= >> WARNING: Newton scheme has not converged in 21 iterations: >> physical_point=(x,y,z)=( 0.27, 0.2775, 0) physical_guess=(x,y,z)=( 0.27, 0.2775, 0) dp=(x,y,z)=(-1.21944e-13, >> -1.12163e-06, 0) p=(x,y,z)=( -1, -0.577351, 0) error=1.12163e-06 in element 16514 >> Elem Information >> id()=16514, processor_id()=2 >> type()=QUAD4 >> dim()=2 >> n_nodes()=4 >> 0 Node id()=13234, processor_id()=4294967295, Point=(x,y,z)=( 0.27, 0.2775, 0) >> DoFs= >> 1 Node id()=604, processor_id()=2, Point=(x,y,z)=(0.271875, 0.2775, 0) >> DoFs=(0/0/13470) (0/1/13471) (0/2/13472) (0/3/13473) (0/4/13474) (0/5/13475) (2/0/2245) (3/0/2245) >> 2 Node id()=13235, processor_id()=4294967295, Point=(x,y,z)=(0.271875, 0.2775, 0) >> DoFs= >> 3 Node id()=13233, processor_id()=4294967295, Point=(x,y,z)=( 0.27, 0.2775, 0) >> DoFs= >> n_sides()=4 >> neighbor(0)=nullptr >> neighbor(1)=nullptr >> neighbor(2)=nullptr >> neighbor(3)=nullptr >> hmin()=9.89829e-11, hmax()=0.001875 >> volume()=1.85593e-13 >> active()=1, ancestor()=0, subactive()=0, has_children()=0 >> parent()=672 >> level()=4, p_level()=0 >> refinement_flag()=DO_NOTHING >> p_refinement_flag()=DO_NOTHING >> DoFs= >> ERROR: Newton scheme FAILED to converge in 21 iterations in element 16514 for physical point = (x,y,z)=( 0.27, 0.2775, 0) >> Elem Information >> id()=16514, processor_id()=2 >> type()=QUAD4 >> dim()=2 >> n_nodes()=4 >> 0 Node id()=13234, processor_id()=4294967295, Point=(x,y,z)=( 0.27, 0.2775, 0) >> DoFs= >> 1 Node id()=604, processor_id()=2, Point=(x,y,z)=(0.271875, 0.2775, 0) >> DoFs=(0/0/13470) (0/1/13471) (0/2/13472) (0/3/13473) (0/4/13474) (0/5/13475) (2/0/2245) (3/0/2245) >> 2 Node id()=13235, processor_id()=4294967295, Point=(x,y,z)=(0.271875, 0.2775, 0) >> DoFs= >> 3 Node id()=13233, processor_id()=4294967295, Point=(x,y,z)=( 0.27, 0.2775, 0) >> DoFs= >> n_sides()=4 >> neighbor(0)=nullptr >> neighbor(1)=nullptr >> neighbor(2)=nullptr >> neighbor(3)=nullptr >> hmin()=9.89829e-11, hmax()=0.001875 >> volume()=1.85593e-13 >> active()=1, ancestor()=0, subactive()=0, has_children()=0 >> parent()=672 
>> level()=4, p_level()=0 >> refinement_flag()=DO_NOTHING >> p_refinement_flag()=DO_NOTHING >> DoFs= >> Exiting... >> [2] src/fe/fe_map.C, line 1905, compiled Apr 29 2019 at 13:59:12 >> >> >> >> >> On May 1, 2019, at 9:29 PM, Stogner, Roy H <roy...@ic...> wrote: >> >> >> On Wed, 1 May 2019, Manav Bhatia wrote: >> >> I am using h-refinement in my analysis, which uses the mesh function routines to compute the value of the function in the >> interior of an element. >> >> All of my elements in the original mesh (before any refinements) are squares (quad4). >> >> For the most part everything works out fine without any issues. >> Occasionally, however, I will get an error in the inverse_map() >> like this. I am particularly perplexed by the hmin() size of >> 10^-11. >> >> The size of my elements before refinement is hmin() = .015 and I >> allow a max of 4 refinements in any element. Would there be any >> reason to expect an hmin of order 10^-11 in this case? >> >> >> Not even close, but there's definitely *something* going seriously >> wrong here. >> >> You have a degenerate element; points 0 and 3 and points 1 and 2 >> coincide. >> >> You have three nodes with invalid processor ids. >> >> You probably ought to >> >> Rerun in devel/dbg mode for more details. >> >> >> (preferably dbg) and see whether it catches any problems earlier. >> --- >> Roy >> >> >> |
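This is not a way to bypass the inverse map itself, but a common workaround for the failure described above is to filter out sliver cut-cells before handing them to FE::reinit(). A minimal sketch, assuming the caller owns the sub-elements produced by the level-set cut; the relative tolerance of 1e-6 is an arbitrary choice, not a libMesh default:

    #include <cmath>
    #include "libmesh/elem.h"

    // Returns true if a cut-cell is too degenerate for the Newton inverse
    // map to be trusted: its shortest edge or its measure is vanishingly
    // small relative to its longest edge, as in the element printed above
    // (hmin ~ 1e-10 and volume ~ 2e-13 for hmax ~ 2e-3).
    inline bool is_sliver (const libMesh::Elem & cut_cell,
                           const libMesh::Real rel_tol = 1e-6)
    {
      const libMesh::Real h = cut_cell.hmax();
      return (cut_cell.hmin()   < rel_tol * h ||
              cut_cell.volume() < rel_tol * std::pow(h, cut_cell.dim()));
    }

Cut-cells that fail this check can be dropped from the cut quadrature or agglomerated with a neighbouring cell instead of being passed to reinit(), which is where the Newton iteration in fe_map.C gives up.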
From: Povolotskyi, M. <mpo...@pu...> - 2019-05-04 01:13:52
|
Hello, is it possible to read a mesh from Gmsh and then partition it over MPI ranks with libMesh? If yes, do you have an example? Thank you, Michael. |
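A minimal sketch of this workflow: libMesh's mesh readers dispatch the .msh extension to the Gmsh reader, and read() finishes by preparing the mesh for use, which partitions it across the ranks of the communicator the mesh was built on (by default with METIS when libMesh was configured with it). The file name input.msh is just a placeholder:

    #include "libmesh/libmesh.h"
    #include "libmesh/mesh.h"

    int main (int argc, char ** argv)
    {
      libMesh::LibMeshInit init (argc, argv);

      // The mesh lives on the MPI communicator held by init.
      libMesh::Mesh mesh (init.comm());

      // The .msh extension selects the Gmsh reader; read() ends with
      // prepare_for_use(), which also partitions the mesh across the ranks.
      mesh.read ("input.msh");

      // Summarizes the mesh, including n_partitions().
      mesh.print_info ();

      return 0;
    }

Running this with, e.g., mpirun -np 4 should report n_partitions()=4; a different Partitioner can be installed on the mesh before reading if the default is not wanted.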
From: Renato P. <re...@gm...> - 2019-05-04 00:30:51
|
Thanks. Should I call "parallel_object_only()" throughout the code to check where it loses sync? Any smarter way to do that? What can GDB do for me? Parallel debugging is really something new to me... On Fri, May 3, 2019 at 7:34 PM Stogner, Roy H <roy...@ic...> wrote: > > On Fri, 3 May 2019, Renato Poli wrote: > > > I see a number of error messages, as below. > > I am struggling to understand what they mean and how to move forward. > > It is related to manually setting a system solution and closing the > > "solution" vector afterwards. > > Any idea? > > > > Assertion > > > `(this->comm()).verify(std::string("./include/libmesh/petsc_vector.h").size())' > > failed. > > Assertion > > `(this->comm()).verify(std::string("src/mesh/mesh_base.C").size())' > failed. > > [Assertion > > > `(this->comm()).verify(std::string("./include/libmesh/petsc_vector.h").size())' > > failed. > > [2] Assertion > > > `(this->comm()).verify(std::string("./include/libmesh/petsc_vector.h").size())' > > failed.[0] ./include/libmesh/petsc_vector.h, line 812, compiled Feb 22 > 2019 > > at 17:56:59 > > 1] src/mesh/mesh_base.C, line 511, compiled Feb 22 2019 at 17:55:09 > > ./include/libmesh/petsc_vector.h, line 812, compiled Feb 22 2019 at > 17:56:59 > > application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0 > > application called MPI_Abort(MPI_COMM_WORLD, 1) - process 1 > > application called MPI_Abort(MPI_COMM_WORLD, 1) - process 2 > > > > Thanks, > > Renato > > You're running in parallel, but your different processors have gotten > out of sync. At least 1 is at mesh_base.C line 511, and at least 2 or > 3 are at petsc_vector.h 812. Are you not calling PetscVector::close() > on every processor, perhaps? Then the missing processor would > continue to whatever the next parallel-only operation is and the > dbg/devel mode check for synchronization would fail. > --- > Roy > |
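On the "any smarter way" question: dbg/devel builds of libMesh already insert checks like the assertions quoted above at every parallel-only call, so rerunning a dbg build is usually enough. For an opt build, a hand-rolled version of the same comm().verify() pattern can be sprinkled around suspect code; this is only a sketch mirroring those assertions, not an official libMesh utility:

    #include <string>
    #include "libmesh/libmesh_common.h"
    #include "libmesh/parallel.h"

    // Every rank hashes its current file/line; verify() checks that all
    // ranks agree.  A mismatch (or a hang) here means the ranks diverged
    // somewhere before this call.
    inline void check_in_sync (const libMesh::Parallel::Communicator & comm,
                               const char * file, int line)
    {
      const std::size_t tag = std::string(file).size() +
                              static_cast<std::size_t>(line);
      if (!comm.verify(tag))
        libmesh_error_msg("Ranks out of sync at " << file << ":" << line);
    }

    #define CHECK_IN_SYNC(comm) check_in_sync((comm), __FILE__, __LINE__)

Bisecting with CHECK_IN_SYNC(es.comm()) around the suspect region narrows down where one rank skips a collective call; the dbg/devel checks Roy mentions do the same thing automatically.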
From: Stogner, R. H <roy...@ic...> - 2019-05-03 22:35:10
|
On Fri, 3 May 2019, Renato Poli wrote: > I see a number of error messages, as below. > I am struggling to understand what they mean and how to move forward. > It is related to manually setting a system solution and closing the > "solution" vector afterwards. > Any idea? > > Assertion > `(this->comm()).verify(std::string("./include/libmesh/petsc_vector.h").size())' > failed. > Assertion > `(this->comm()).verify(std::string("src/mesh/mesh_base.C").size())' failed. > [Assertion > `(this->comm()).verify(std::string("./include/libmesh/petsc_vector.h").size())' > failed. > [2] Assertion > `(this->comm()).verify(std::string("./include/libmesh/petsc_vector.h").size())' > failed.[0] ./include/libmesh/petsc_vector.h, line 812, compiled Feb 22 2019 > at 17:56:59 > 1] src/mesh/mesh_base.C, line 511, compiled Feb 22 2019 at 17:55:09 > ./include/libmesh/petsc_vector.h, line 812, compiled Feb 22 2019 at 17:56:59 > application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0 > application called MPI_Abort(MPI_COMM_WORLD, 1) - process 1 > application called MPI_Abort(MPI_COMM_WORLD, 1) - process 2 > > Thanks, > Renato You're running in parallel, but your different processors have gotten out of sync. At least 1 is at mesh_base.C line 511, and at least 2 or 3 are at petsc_vector.h 812. Are you not calling PetscVector::close() on every processor, perhaps? Then the missing processor would continue to whatever the next parallel-only operation is and the dbg/devel mode check for synchronization would fail. --- Roy |
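To illustrate the pitfall Roy describes: NumericVector::close() (and System::update()) are collective, so every rank must reach them, even a rank that sets no entries. A minimal sketch of the safe pattern, assuming a generic System whose solution is being set by hand (set_all_dofs_to is a made-up name):

    #include "libmesh/numeric_vector.h"
    #include "libmesh/system.h"

    void set_all_dofs_to (libMesh::System & sys, const libMesh::Number value)
    {
      libMesh::NumericVector<libMesh::Number> & soln = *sys.solution;

      // Each rank writes only the entries it owns ...
      for (libMesh::numeric_index_type i = soln.first_local_index();
           i != soln.last_local_index(); ++i)
        soln.set(i, value);

      // ... but close() is collective.  Guarding it with something like
      // "if (processor_id() == 0)" is exactly what leaves the other ranks
      // waiting at the next parallel-only operation, as in the assertions
      // above (one rank in mesh_base.C, the others in petsc_vector.h).
      soln.close();

      // Also collective: refreshes the ghosted current_local_solution.
      sys.update();
    }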
From: Renato P. <re...@gm...> - 2019-05-03 21:07:36
|
Hi I see a number of error messages, as below. I am struggling to understand what they mean and how to move forward. It is related to manually setting a system solution and closing the "solution" vector afterwards. Any idea? Assertion `(this->comm()).verify(std::string("./include/libmesh/petsc_vector.h").size())' failed. Assertion `(this->comm()).verify(std::string("src/mesh/mesh_base.C").size())' failed. [Assertion `(this->comm()).verify(std::string("./include/libmesh/petsc_vector.h").size())' failed. [2] Assertion `(this->comm()).verify(std::string("./include/libmesh/petsc_vector.h").size())' failed.[0] ./include/libmesh/petsc_vector.h, line 812, compiled Feb 22 2019 at 17:56:59 1] src/mesh/mesh_base.C, line 511, compiled Feb 22 2019 at 17:55:09 ./include/libmesh/petsc_vector.h, line 812, compiled Feb 22 2019 at 17:56:59 application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0 application called MPI_Abort(MPI_COMM_WORLD, 1) - process 1 application called MPI_Abort(MPI_COMM_WORLD, 1) - process 2 Thanks, Renato |
From: Yuxiang W. <yw...@vi...> - 2019-05-03 02:23:35
|
John & Jed, Got it! Thank you so much for the help. Best, Shawn On Thu, May 2, 2019, 11:04 Jed Brown <je...@je...> wrote: > John Peterson <jwp...@gm...> writes: > > > On Thu, May 2, 2019 at 12:35 PM Yuxiang Wang <yw...@vi...> wrote: > > > >> Dear all, > >> > >> A quick question - when using libmesh, should I consciously do bandwidth > >> minimization when I number my node IDs? Or, are those taken care of at > >> pre-processing stage of PETSc and other solvers? > >> > > > > I wouldn't try and do anything manually since PETSc has a number of > > renumbering algorithms that I think you can play around with pretty > easily > > to see if they make any difference... > > You can run factorizations in a different ordering, but low-bandwidth > orderings make a difference for cache reuse in MatMult. PETSc can't > reorder that without making a copy. But don't put too much effort into > it. > |