From: Roy S. <roy...@ic...> - 2018-01-12 22:37:19
Copying this to libmesh-users; it looks like an issue that could
affect anyone trying to reuse Elem objects.

On Fri, 12 Jan 2018, Jiang, Wen wrote:

> Thanks. I think your explanation makes a lot sense to me. I followed
> your suggestions that manually call clear_old_dof_object() for the
> reused element, but I still hit the same error.

Hmmm... are you clearing (or never setting) refinement_flag() too?

> Is there any method that we should call to let libmesh to identify
> those elements and skip them for projection?

clear_old_dof_object() plus set_refinement_flag(DO_NOTHING) should do
it! The old_dof_object pointer and the refinement_flag are the only
things that the relevant if statement branches on.

If that combination doesn't work, then we need to figure out what else
is happening. The next step would be to run in gdb, "catch throw" or
"b MPI_Abort" or whatever you usually do to catch assertions, then
step up the stack to BuildProjectionList::operator() and
"p elem->old_dof_object", "p elem->refinement_flag()" to see what
exactly is going on.

Wait, wait, wait. I have another theory about what might be going on.

When we hit DofMap::reinit(), that's where the library automatically
*creates* old_dof_object, for any object with current dof indexing
that needs to be remembered later as old dof indexing, even if the
"current dof indexing" is as trivial as "this object has variables for
this system but doesn't have any dofs for each variable". So maybe
you're hitting that code, it sees the existing DofObject::n_vars(sys)
is true, and so it creates old_dof_object!

In that case what you'd want to run is clear_dofs() - you might not
even need clear_old_dof_object(), since this second theory of mine
suggests that you're doing your XFEM stuff at a stage where
old_dof_object either has already been automatically cleared or soon
will be.

Hmm... there's also a chance you'll need to
set_n_systems(your_number_of_systems) afterwards to avoid a different
assertion. I don't *think* that will happen, I think n_sys gets kept
up to date automatically in EquationSystems::reinit (if only because
otherwise I think even the old XFEM code would be breaking), but it's
a possibility.

Let me know what works or doesn't?

Also, if clear_dofs() does turn out to be the solution, you might want
to add a libMesh unit test that asserts it can be called publicly and
has the desired effect of making has_dofs(sys) return false
afterwards. If I'd been designing the clear_dofs() API, for
modularity's sake I'd have made it private and accessible only via a
DofMap friend declaration in DofObject... and if I ever go on a crazy
refactoring-for-modularity bender someday I'd hate to break your code
because I forgot about your use case!
---
Roy

> Regards,
> Wen
>
> Assertion `node->old_dof_object' failed.
>
> Stack frames: 19
> 0: 0 libmesh_dbg.0.dylib 0x0000000103cf91cf libMesh::print_trace(std::__1::basic_ostream<char, std::__1::char_traits<char> >&) + 2287
> 1: 1 libmesh_dbg.0.dylib 0x0000000103ce7fea libMesh::MacroFunctions::report_error(char const*, int, char const*, char const*) + 634
> 2: 2 libmesh_dbg.0.dylib 0x0000000103b49d7a libMesh::DofMap::old_dof_indices(libMesh::Elem const*, std::__1::vector<unsigned int, std::__1::allocator<unsigned int> >&, unsigned int) const + 10218
> 3: 3 libmesh_dbg.0.dylib 0x00000001054dd3de libMesh::BuildProjectionList::operator()(libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*> const&) + 5726
> 4: 4 libmesh_dbg.0.dylib 0x00000001054d9229 void libMesh::Threads::parallel_reduce<libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*>, libMesh::BuildProjectionList>(libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*> const&, libMesh::BuildProjectionList&) + 153
> 5: 5 libmesh_dbg.0.dylib 0x00000001054d63ad libMesh::System::project_vector(libMesh::NumericVector<double> const&, libMesh::NumericVector<double>&, int) const + 3149
> 6: 6 libmesh_dbg.0.dylib 0x00000001054d557f libMesh::System::project_vector(libMesh::NumericVector<double>&, int) const + 159
> 7: 7 libmesh_dbg.0.dylib 0x0000000105414bbd libMesh::System::restrict_vectors() + 1085
> 8: 8 libmesh_dbg.0.dylib 0x0000000105415b19 libMesh::System::prolong_vectors() + 25
> 9: 9 libmesh_dbg.0.dylib 0x0000000105357a9c libMesh::EquationSystems::reinit() + 9948
> 10: 10 libmoose-dbg.0.dylib 0x00000001017e8703 FEProblemBase::meshChanged() + 211
> 11: 11 libmoose-dbg.0.dylib 0x00000001017e85b1 FEProblemBase::updateMeshXFEM() + 161
> 12: 12 libmoose-dbg.0.dylib 0x0000000101e45216 Transient::solveStep(double) + 1238
> 13: 13 libmoose-dbg.0.dylib 0x0000000101e44cfd Transient::takeStep(double) + 253
> 14: 14 libmoose-dbg.0.dylib 0x0000000101e449a3 Transient::execute() + 211
> 15: 15 libmoose-dbg.0.dylib 0x0000000101bac773 MooseApp::executeExecutioner() + 179
> 16: 16 libmoose-dbg.0.dylib 0x0000000101baeeda MooseApp::run() + 282
> 17: 17 xfem-dbg 0x0000000100002343 main + 307
> 18: 18 libdyld.dylib 0x00007fff9139d235 start + 1
>
> On Fri, Jan 12, 2018 at 8:59 AM, Roy Stogner <roy...@ic...> wrote:
>
> On Thu, 11 Jan 2018, Jiang, Wen wrote:
>
> This is Wen. I have been working with Ben at INL on the XFEM
> development. Recently, I made some changes to the XFEM codes and I
> got a libmesh assertion error, shown below,
>
> The change I made is minor. Originally, we create new child element
> from parent element and then delete parent element. Now, I just
> changed some nodes with the parent element and use it without
> deleting it. I am not sure why it will cause this issue. Could you
> give me some suggestions?
>
> Assertion `node->old_dof_object' failed.
>
> That stack trace is in the first step of a mesh-to-mesh projection,
> where we figure out what data needs to be sent from processor to
> processor: if new element A is owned by processor 1 and its solution
> is going to be a restriction or prolongation involving element B which
> used to be owned by processor 2, then processor 2 needs to send some
> DoF coefficients for element B to processor 1. So we're looping over
> all new elements and checking for the old data which is going to be
> needed.
>
> At nodes on C0 elements, restriction and prolongation are easy:
> they're just copies. If for some variable V, node A used to have DoF
> index 100 and now has DoF index 110, then the processor which used to
> own node A needs to send DoF coefficient 100 to the processor which
> now owns node A, so that it can be used to set coefficient 110.
>
> So, we look at the Node parent class DofObject to figure out the new
> indexing, and at the DofObject::old_dof_object to figure out the old
> indexing, and if the old_dof_object doesn't exist then we have no idea
> what to do.
>
> We hit and fixed this problem already with new element creation, I
> believe... looks like that's at system_projection.C lines 1116-1132,
> in the same method that you're seeing fail now?
>
> I think I understand.
>
> What *used* to happen is that you created elements fresh, so when the
> projection code hit them it saw that they had no old_dof_object and no
> JUST_REFINED or JUST_COARSENED flag, so it recognized them as new
> elements and didn't bother trying to do solution projection on them.
>
> What *now* happens is that you're reusing an old Elem object (which is
> a decent idea for performance reasons alone, don't get me wrong!) but
> you're assigning brand new Node objects to it? So when the projection
> code looks at this Elem it sees the old_dof_object on it, it doesn't
> consider the Elem to be "new", it tries to look up old DoF indices, it
> can't *find* any old_dof_object on the new Nodes, and it screams and
> dies.
>
> So what the fix should be depends on what you want to happen with the
> solution on this object, I guess. If this is a "shrinking" element
> that should keep its old solution values then you'd want to try to
> reuse its old Node objects too; if it's (conceptually) a "new" element
> on which libMesh should leave new DoF values unset then you'd want to
> manually clear_old_dof_object() on it so libMesh doesn't mistake it
> for an old element.
>
> Hope this helps! Let me know if my guesses turn out to be wrong or if
> you have more questions.
> ---
> Roy
>
> Stack frames: 20
> 0: 0 libmesh_dbg.0.dylib 0x0000000103ced1cf libMesh::print_trace(std::__1::basic_ostream<char, std::__1::char_traits<char> >&) + 2287
> 1: 1 libmesh_dbg.0.dylib 0x0000000103cdbfea libMesh::MacroFunctions::report_error(char const*, int, char const*, char const*) + 634
> 2: 2 libmesh_dbg.0.dylib 0x0000000103b3dd7a libMesh::DofMap::old_dof_indices(libMesh::Elem const*, std::__1::vector<unsigned int, std::__1::allocator<unsigned int> >&, unsigned int) const + 10218
> 3: 3 libmesh_dbg.0.dylib 0x00000001054d13de libMesh::BuildProjectionList::operator()(libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*> const&) + 5726
> 4: 4 libmesh_dbg.0.dylib 0x00000001054cd229 void libMesh::Threads::parallel_reduce<libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*>, libMesh::BuildProjectionList>(libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*> const&, libMesh::BuildProjectionList&) + 153
> 5: 5 libmesh_dbg.0.dylib 0x00000001054ca3ad libMesh::System::project_vector(libMesh::NumericVector<double> const&, libMesh::NumericVector<double>&, int) const + 3149
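
For anyone reading this thread in the archive, the two theories above can be played out in a small standalone toy model. This is only a sketch under stated assumptions: ToyDofObject, ToyElem, build_projection_entry() and scenario() are hypothetical stand-ins written for illustration, not libMesh classes or functions, and the reinit() step models only the "second theory" (current dof indexing gets remembered as old indexing), which the thread itself leaves unconfirmed.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Toy stand-ins only; none of these are libMesh types or signatures.
enum class Flag { DoNothing, JustRefined, JustCoarsened };
enum class Outcome { SkippedAsNew, Projected, AssertionFailure };

struct ToyDofObject {
  std::unique_ptr<ToyDofObject> old_dofs;  // models old_dof_object
  bool has_current_dofs = false;           // models "has dof indexing now"

  void clear_old() { old_dofs.reset(); }           // models clear_old_dof_object()
  void clear_dofs() { has_current_dofs = false; }  // models clear_dofs()

  // Assumed behavior (second theory): any current indexing is saved as
  // old indexing, then the object receives fresh current indexing.
  void reinit() {
    if (has_current_dofs) old_dofs.reset(new ToyDofObject());
    has_current_dofs = true;
  }
};

struct ToyElem : ToyDofObject {
  Flag flag = Flag::DoNothing;
  std::vector<ToyDofObject*> nodes;
};

// Models the branch described in the thread: an element with no old dof
// data and no JUST_REFINED / JUST_COARSENED flag is treated as new.
Outcome build_projection_entry(const ToyElem& e) {
  const bool looks_new = !e.old_dofs && e.flag != Flag::JustRefined &&
                         e.flag != Flag::JustCoarsened;
  if (looks_new) return Outcome::SkippedAsNew;
  for (const ToyDofObject* n : e.nodes)
    if (!n->old_dofs) return Outcome::AssertionFailure;  // the reported failure
  return Outcome::Projected;
}

// A reused element (it already has current dofs) is given one brand new
// node, optionally cleaned up, and then passed through reinit + projection.
Outcome scenario(bool call_clear_old, bool call_clear_dofs) {
  ToyDofObject fresh_node;       // new node: no current or old indexing
  ToyElem elem;
  elem.has_current_dofs = true;  // left over from the previous step
  elem.nodes.push_back(&fresh_node);
  if (call_clear_old) elem.clear_old();
  if (call_clear_dofs) elem.clear_dofs();
  elem.reinit();
  fresh_node.reinit();
  return build_projection_entry(elem);
}

// Self-check of the three cases discussed in the thread.
void check_all() {
  assert(scenario(false, false) == Outcome::AssertionFailure);
  assert(scenario(true, false) == Outcome::AssertionFailure);
  assert(scenario(false, true) == Outcome::SkippedAsNew);
}
```

In this toy model, clearing only the old dof data has no lasting effect because reinit() recreates it from the element's current indexing, which matches the report that clear_old_dof_object() alone still hit the assertion. Clearing the current dofs first makes the element look new, so it is skipped. Whether real libMesh behaves this way in the XFEM case is exactly what the gdb session suggested above would confirm.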