From: John P. <jwp...@gm...> - 2013-11-15 06:15:54
Hi,

(Ben and Roy, since you guys wrote most of the Parallel communication stuff I hope you can comment on this...)

Cubit 14 has apparently started doing something new when writing Exodus files: it uses the node_num_map and elem_num_map data fields to define a node and element numbering which is different from the order in which the x,y,z coordinates and element connectivities are actually written to the file.

I patched the Exodus reader to try to respect these element and node numbering mappings in c4a7cd1a71 and e669b64e17. This is important for users who assign nodesets to specific nodes using the Cubit GUI, or who have to interface with different codes using the same mesh file, and would like to maintain a consistent node numbering in both.

Cubit is also more than happy to use non-contiguous numberings in these maps... in one example mesh I have, there are 61,794 nodes, but the node IDs range from 1 to 108,878, meaning there are around 47,000 "unused" node IDs. This is a little annoying, but keeping NULL pointers in the _nodes and _elements vectors of SerialMesh seems to work OK when running in serial.

In parallel, things are working fine until we hit the call to MeshCommunication::broadcast(), as shown in the stack trace below. The issue is that we create a packed range from mesh.nodes_begin()/end(), which automatically skips over all the NULL entries. When you unpack this range, you end up trying to append a Node in the wrong place (by calling add_node()) and get an assert:

Assertion `!n->valid_id() || n->id() == _nodes.size()' failed.
[1] ../src/mesh/serial_mesh.C, line 478, compiled Nov 13 2013 at 09:56:49

The same thing would most likely happen with the elements as well.

Would it be possible to have the mesh_inserter_iterator do something other than call mesh.add_node(n)? I think if it did something with logic more similar to Mesh::add_point(), which inserts a node with a valid ID in the appropriate location of the _nodes vector, MeshCommunication::broadcast() might actually just work...

Thoughts?

0: 0 libmesh_dbg.0.dylib 0x00000001091dabb5 libMesh::print_trace(std::ostream&) + 39
1: 1 libmesh_dbg.0.dylib 0x00000001091dadaf libMesh::write_traceout() + 226
2: 2 libmesh_dbg.0.dylib 0x00000001091d6de2 libMesh::MacroFunctions::report_error(char const*, int, char const*, char const*) + 61
3: 3 libmesh_dbg.0.dylib 0x0000000109644168 libMesh::SerialMesh::add_node(libMesh::Node*) + 294
4: 4 libmesh_dbg.0.dylib 0x000000010953c05e libMesh::mesh_inserter_iterator<libMesh::Node>::operator=(libMesh::Node*) + 54
5: 5 libmesh_dbg.0.dylib 0x00000001095373df void libMesh::Parallel::unpack_range<libMesh::MeshBase, unsigned long long, libMesh::mesh_inserter_iterator<libMesh::Node> >(std::vector<unsigned long long, std::allocator<unsigned long long> > const&, libMesh::MeshBase*, libMesh::mesh_inserter_iterator<libMesh::Node>) + 288
6: 6 libmesh_dbg.0.dylib 0x0000000109532a68 void libMesh::Parallel::Communicator::broadcast_packed_range<libMesh::MeshBase, libMesh::MeshBase, libMesh::MeshBase::node_iterator, libMesh::mesh_inserter_iterator<libMesh::Node> >(libMesh::MeshBase const*, libMesh::MeshBase::node_iterator, libMesh::MeshBase::node_iterator, libMesh::MeshBase*, libMesh::mesh_inserter_iterator<libMesh::Node>, unsigned int) const + 332
7: 7 libmesh_dbg.0.dylib 0x0000000109528674 libMesh::MeshCommunication::broadcast(libMesh::MeshBase&) const + 1134
8: 8 libmesh_dbg.0.dylib 0x000000010967569a libMesh::UnstructuredMesh::read(std::string const&, libMesh::MeshData*, bool) + ...

--
John
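
P.S. To make the failure mode concrete, here is a minimal standalone sketch of the two insertion strategies. It is not libMesh code: ToyNode, NodeStore, pack_ids, append_node, insert_node_at_id and count_rejected_appends are all hypothetical names made up for illustration, the IDs are 0-based slot indices for simplicity, and the append check only models the "valid ID" half of the real assertion.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a mesh node; only the ID matters here.
struct ToyNode {
  std::size_t id;
};

// Hypothetical stand-in for a sparse node vector: slot i holds the node
// with ID i, or a null pointer if that ID is unused.
using NodeStore = std::vector<std::unique_ptr<ToyNode>>;

// Mimics packing a range from a null-skipping iterator: only the IDs of
// the nodes that actually exist are sent.
std::vector<std::size_t> pack_ids(const NodeStore& nodes) {
  std::vector<std::size_t> ids;
  for (std::size_t i = 0; i < nodes.size(); ++i) {
    if (!nodes[i]) continue;      // unused ID, skipped
    assert(nodes[i]->id == i);    // slot index and ID must agree
    ids.push_back(nodes[i]->id);
  }
  return ids;
}

// Append-only insertion: succeeds only when the ID equals the current
// size. Returns false where an assert-based version would abort.
bool append_node(NodeStore& nodes, std::size_t id) {
  if (id != nodes.size()) return false;
  nodes.push_back(std::make_unique<ToyNode>(ToyNode{id}));
  return true;
}

// ID-aware insertion: grow the store with null slots as needed, then
// place the node in the slot matching its ID.
void insert_node_at_id(NodeStore& nodes, std::size_t id) {
  if (id >= nodes.size()) nodes.resize(id + 1);
  nodes[id] = std::make_unique<ToyNode>(ToyNode{id});
}

// Replays packed IDs through append_node on an empty receiver and
// counts how many insertions are rejected.
std::size_t count_rejected_appends(const std::vector<std::size_t>& ids) {
  NodeStore receiver;
  std::size_t rejected = 0;
  for (std::size_t id : ids)
    if (!append_node(receiver, id)) ++rejected;
  return rejected;
}
```

With a sender holding IDs 0, 1 and 4, the packed range contains three IDs; replaying them through append_node rejects ID 4 because the receiver only has two entries at that point, whereas insert_node_at_id pads slots 2 and 3 with nulls and reproduces the sender's layout.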