From: Benjamin K. <ben...@na...> - 2008-11-18 19:26:45
>> So I am looking into the size of Nodes and Elems, with an eye to adding some
>> state information while hopefully not increasing the size of the objects by
>> much (or at all).
>
> Interesting idea, but I wonder how much overhead the inefficiencies in
> the default "new" allocator add.  We do quite a bit of dynamic
> allocation for DofObject indices and Elem neighbor/node/family links.
> We even dynamically allocate the Nodes and Elems themselves, which I'd
> expect incurs a few bytes of overhead.

Really good question.  I'd like to look into it, if for no other reason than
that we use it inside the DofObjects to create 2D arrays.  John brought up the
point that packing into a single vector may use the same memory or even less.

Similarly, when we allocate the neighbors array, somewhere the size of that
array is hidden so that delete [] can do its thing.  Since we always know its
size I wonder if there is any way around that...

>> What has motivated this is that I would like an 'is_shared()' method in a
>> Node to efficiently identify nodes which are shared between processors.
>> (Note that testing the processor id of the node is insufficient when you own
>> the node.)
>
> And getting a patch around the node to look for active ghost elements
> isn't efficient.

Right...

>> 'on_boundary()' and 'has_dirichlet_bcs()' also come to mind as
>> something which could be useful,
>
> Shouldn't BoundaryInfo (plus any Dirichlet constraint system we
> eventually hack in) take care of those two?
>

It should probably manage them for sure, but the flags could still be useful,
I'd think, to provide a 'boundary_node_iterator' for example.  (Not hard over
on this.)

>> In looking at the Elem, for example, we have two unsigned chars in there
>> right now for h&p refinement flags...  These are used to hold one of 7
>> possible values for RefinementState.
>> I think this could be replaced with 6
>> bit states (the 'DO_NOTHING' inferred from no bits set) in the case of the
>> h-refinement flag, and maybe 4 for the p-refinement flag.
>
> We can always interpret bits in tandem; 5 * 7 boolean states with 3
> bits each is simple enough.  (using 5 bits total might be excessive)
>
>> (does INACTIVE or COARSEN_INACTIVE mean anything for the
>> p-refinement flag?)
>
> No, they don't.
>
>> We might actually also be able to do away with the _p_level as well
>> if we are willing to impose some maximum p-level, say 16 or
>> whatever...
>
> I have a fantasy where we some day support spectral elements with
> FFT-based techniques to handle p in the 100s... but for now, 15 (4
> bits) would be more than adequate.

So I dusted off an old C book and found bit fields, which seem perfect...
Something like

  struct PackedState {
    bool is_shared:1;
    bool is_remote:1;
    unsigned char h_flag:3;
    unsigned char p_flag:3;
  };

On my machine this produces sizeof(PackedState)=1

Gotta admit, this is the first time I've encountered these...  The only thing
that would have to change is the few places where we do this kind of thing:

  elem->refinement_flag() = COARSEN;

which we'd have to change to

  elem->refinement_flag(COARSEN);

since you can't readily get a reference to the members of a bit field.

-Ben
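
For illustration, here is a rough sketch of how the getter/setter pair might
look once the flags live in bit fields.  This is not the actual Elem code:
PackedElem, set_shared() and the p_refinement_flag() overloads are hypothetical
names, and the enumerator list is an assumption (the thread only says that
RefinementState has 7 values).

```cpp
// Assumed enumerator list; only some of these names appear in the thread.
enum RefinementState {
  COARSEN = 0, DO_NOTHING, REFINE, JUST_REFINED,
  JUST_COARSENED, INACTIVE, COARSEN_INACTIVE
};

// Hypothetical stand-in for Elem that holds only the packed flags.
class PackedElem {
public:
  PackedElem() : _is_shared(false), _is_remote(false),
                 _h_flag(DO_NOTHING), _p_flag(DO_NOTHING) {}

  // Getters return by value: a non-const reference cannot bind to a bit field.
  RefinementState refinement_flag() const
  { return static_cast<RefinementState>(_h_flag); }

  RefinementState p_refinement_flag() const
  { return static_cast<RefinementState>(_p_flag); }

  // Setters replace the old "elem->refinement_flag() = COARSEN;" idiom.
  void refinement_flag(RefinementState s)   { _h_flag = s; }
  void p_refinement_flag(RefinementState s) { _p_flag = s; }

  bool is_shared() const { return _is_shared; }
  bool is_remote() const { return _is_remote; }
  void set_shared(bool b) { _is_shared = b; }

private:
  bool _is_shared : 1;          // 1 bit
  bool _is_remote : 1;          // 1 bit
  unsigned char _h_flag : 3;    // 3 bits hold values 0..7, enough for 7 states
  unsigned char _p_flag : 3;    // 3 bits, same encoding as the h flag
};
```

Usage would then read elem.refinement_flag(COARSEN); for a write and
elem.refinement_flag() for a read.  How bit fields are laid out is
implementation-defined, so the sizeof result should be checked on each
compiler we care about rather than assumed.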