From: Derek G. <fri...@gm...> - 2018-05-02 20:49:37

I'll add one more reason to hate std::vector<bool>: threading. std::vector<bool> is not at _all_ thread-safe: because neighboring elements are packed into the same underlying word, multiple processors working on the same vector will clobber each other's values even when they write to different elements. This leads to hair-pulling race conditions that take forever to track down. I personally wish they had made std::vector<bool> take 8 times more memory... and then let you use something else if you needed the optimization... but that's just me.

Derek

On Wed, May 2, 2018 at 2:24 PM Roy Stogner <roy...@ic...> wrote:
>
> I hope you don't mind, but I'm going to Cc: this to libmesh-devel. These are good questions, and my opinions do not match those of all the other developers, so it might be unfair not to let them chime in.
>
> On Wed, 2 May 2018, Tuzun, Robert wrote:
>
> > Greetings. I'm still in the middle of my hiatus, but there was one question I've been meaning to ask. I notice, in several places, the use of vector<bool> in the code. Way back in 2001, Scott Meyers (Effective STL, Item 18) recommended against the use of vector<bool> and recommended using deque<bool> or bitset instead.
>
> "As an STL container, there are really only two things wrong with vector<bool>. First, it's not an STL container. Second, it doesn't hold bools. Other than that, there's not much to object to." - Scott Meyers. I seriously love his writing. It takes something special to make programming language minutiae so entertaining.
>
> This is indeed often annoying, because the inability to take real references to vector entries or to get a pointer to an underlying array means we often can't handle vector<bool> with the same generic code as we use to handle vector<literally_anything_else>. I assume you first encountered vector<bool> in our parallel_implementation.h overloads, written to handle just that problem?
>
> However, the recommended alternatives are even worse in many cases. If a code is always fine using deque<bool>, if using EIGHT TIMES as much memory never fazes the author, then Meyers' hundreds of C++ recommendations should be ignored in favor of "Item 1: Use Java Instead" (or maybe Python or Go these days?). Conversely, if a code is always happy with bitset, and never needs an array that can be resized, then what are you writing, autotranslated Fortran 77?
>
> Personally, I'm mostly with Howard Hinnant:
> https://isocpp.org/blog/2012/11/on-vectorbool
>
> What's abominable about vector<bool> is the name, because that's basically a lie about supporting the whole std::vector interface, but the data structure itself is worthwhile and often useful. And I'm even sympathetic to the name! Once you create a few well-chosen overloads (like the parallel_implementation.h stuff) then the rest of your generic code can *mostly* support vector<bool> as if it were a real std::vector. And when you miss a needed specialization, the resulting breakage is a compile-time failure, so it's a hassle rather than a disaster.
>
> > I don't know if this advice still holds, and my search on the Web yields conflicting answers (just do a Google search on vector<bool> and you'll see what I mean). Is there reason for concern?
>
> Well, despite everything I said above: yes, there is.
>
> First, to Meyers' objections I'd also have added "Third, it requires bit-fiddling for every access", which means if you have a vector small enough to *not* worry about memory savings, maybe you ought to be using vector<unsigned char> instead.
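
(As a minimal standalone illustration of the problems mentioned above -- a sketch, not code from libMesh: the packed representation is what makes concurrent writes to distinct elements a data race and what costs bit-fiddling on every access, while the proxy reference plus the missing data() member are what force the special-case overloads. vector<unsigned char> has none of these quirks.)

  #include <type_traits>
  #include <vector>

  int main()
  {
    std::vector<bool> vb(100, false);
    std::vector<unsigned char> vc(100, 0);

    // vector<bool> packs ~8 flags per byte, so vb[0] and vb[1] live in the
    // same underlying word; concurrent writes from two threads race even
    // though they target "different" elements.  vc stores one flag per
    // byte, so concurrent writes to different elements are safe.

    // operator[] on vector<bool> returns a proxy object, not a bool&,
    // which is why generic code expecting real references needs overloads:
    static_assert(!std::is_same<decltype(vb[0]), bool &>::value,
                  "vector<bool>::operator[] does not return bool&");
    static_assert(std::is_same<decltype(vc[0]), unsigned char &>::value,
                  "vector<unsigned char>::operator[] does");

    // And there's no data() member to hand a contiguous buffer to MPI,
    // memcpy, etc.:
    // bool * pb = vb.data();          // does not compile
    unsigned char * pc = vc.data();    // fine
    (void) pc;
    return 0;
  }
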
> Grepping through the code for short logically-boolean vectors, I see that we're doing the unsigned-char-really-holding-bool trick in dof_map.C and mesh_communication_global_indices.C... but we're using small vector<bool> in about a dozen places?
>
> I'm not going to change those without benchmarking first, because "unsigned char is faster due to less bit fiddling" vs "bool is faster due to fitting more stuff in cache" is the sort of system-dependent question I can't answer a priori, and in the absence of data I think vector<bool> is more self-documenting for something that's logically boolean. If anyone wants to try to prove me wrong in the libMesh context, I'd say try System::boundary_project_solution for the benchmark. That has a few vector<bool>s inside loops over elements, so it looks like the easiest place to expose performance differences.
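
(A rough standalone sketch of that kind of micro-benchmark -- a generic fill-and-count comparison, not the System::boundary_project_solution benchmark itself; the winner is exactly the system-dependent question raised above.)

  #include <chrono>
  #include <cstddef>
  #include <cstdio>
  #include <vector>

  // Mark every third flag, then count the marks, and report the elapsed time.
  template <typename Vec>
  double time_flags(std::size_t n)
  {
    Vec flags(n, 0);
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < n; i += 3)
      flags[i] = true;
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i)
      count += flags[i] ? 1 : 0;
    const auto stop = std::chrono::steady_clock::now();
    std::printf("  (marked %zu flags)\n", count);  // keep the work observable
    return std::chrono::duration<double>(stop - start).count();
  }

  int main()
  {
    const std::size_t n = 50000000;  // 50M flags: ~6 MB packed vs ~50 MB of bytes
    std::printf("vector<bool>:          %g s\n", time_flags<std::vector<bool>>(n));
    std::printf("vector<unsigned char>: %g s\n", time_flags<std::vector<unsigned char>>(n));
    return 0;
  }
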
> Second and more seriously, in libMesh whenever we find ourselves in a situation where tactically vector<bool> is definitely the right choice, it may mean that strategically we've made a huge mistake!
>
> Our classic use cases for vector<bool> are vectors indexed over nodes or over elements, where we need to support millions or billions of entries and so using 8 times as much space is clearly a bad idea.
>
> *However*, some of that code was a reasonable design only because it dates back to ReplicatedMesh-only days, when we couldn't get away from having data stored for every element on every processor. These days, a DistributedMesh user should be able to avoid ever requiring O(N_elements) memory on a single MPI rank, which means that applications should never allocate such a long vector! So the library should never be using one, at least on a non-debug-mode, non-ReplicatedMesh-only code path.
>
> So let's see...
>
> Some of these uses are OK. E.g. the MeshTools assertion code is all debug-mode-only, where readability is much more important than efficiency.
>
> Some are more dubious. E.g. UnstructuredMesh::all_first_order() only gets used in practice for writing serialized viz output of an error estimate, so making it perfectly scalable would require improving multiple other parts of that code path first, meaning it's suboptimal but not high priority.
>
> Some of these uses are probably outweighed by other distributed mesh incompatibility issues. At first glance I'd guess MeshTools::Modification::Smooth may not even work on a distributed mesh, but we've just never tried to use it that way. It looks like LaplaceMeshSmoother is distributed-mesh-aware, so maybe everyone who needs smoothing post-distribution has just been using that? I don't think we have test coverage of it, though, so I wouldn't trust it.
>
> But a couple of these uses are just way past due to be fixed! We're still relying on big serialized vectors (not just of bool; ErrorVector is a straight-up vector<float> under the hood!) in our error estimation and our mesh refinement flagging code.
>
> > A separate question: in looking at parallel_implementation.h, I note several references to Communicator::get() (e.g., line 298). Can you tell me where this is defined? I could not find it anywhere. For example, the command grep Communicator::get -r * from the main directory did not yield the answer.
>
> And it probably doesn't help that our online Doxygen isn't generated from master/HEAD!
>
> http://libmesh.github.io/doxygen/classlibMesh_1_1Parallel_1_1Communicator.html#a4b6bd36c38eb7de350fcd56510306ced
>
> is still useful, though: even though refactoring has moved that method to include/parallel/communicator.h, it's still the same shim accessor method. We often leave any definition that short in the class declaration itself, where it doesn't need the ClassName:: prefix.
> ---
> Roy
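
(On that last point, a minimal generic sketch -- a hypothetical Wrapper class, not the actual libMesh Parallel::Communicator code -- of why grep for "ClassName::method" comes up empty when the method is defined inside the class declaration.)

  // shim_accessor.h -- hypothetical example
  class Wrapper
  {
  public:
    // Defined in-class, so the token "Wrapper::get" never appears in the
    // source; "grep Wrapper::get -r *" finds only call sites, if anything.
    int & get()             { return _value; }
    const int & get() const { return _value; }

  private:
    int _value = 0;
  };

  // Had it been defined out-of-line instead, grep would have found:
  //
  //   int & Wrapper::get() { return _value; }

Per Roy's note above, the real get() is the same kind of one-line shim, so the place to read it is the class declaration in include/parallel/communicator.h.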