We build a std::map<> from Elem* to a unique sorted contiguous ID, as Metis only considers the active elements and needs some contiguous numbering. I expect that gets quite big, and maybe should be refactored to use a sorted std::vector<std::pair<Elem*, dof_id_type> > instead?

You are quite correct about this.  I ran our memory logger to compare the size of memory of a map<int,int> and a vector of pair<int,int>, both with 8M elements.  The results were pretty striking:


The map's peak memory usage is almost exactly 6X that of the vector's...  

The extra memory presumably comes from the "color", "parent", "left", and "right" data members stored at each node of the RB tree.  The last 3 of those are each 8-byte pointers on 64bit machines...