From: Siddharth N. <sn...@dr...> - 2014-06-17 05:57:26
On 17 June 2014 00:51, Philippe Waroquiers <phi...@sk...> wrote:

> On Mon, 2014-06-16 at 22:27 -0400, Siddharth Nilakantan wrote:
> > Thanks Philippe. After some investigation, I believe what you
> > mentioned is happening exactly. My secondary maps in the shadow memory
> > are sized at 2.5MB in the particular example we were discussing. This
> > would result in 4MB block sizes being allocated by the VG_arena_malloc
> > function.
>
> > No matter what the payload size, is it always going to alloc at 4MB
> > chunks?
> It will always mmap superblocks of 4MB or more (in case a
> malloc > 4Mb is requested).
>
> > Does the pszB_to_bszB always return 4MB?
> IIRC, pszB_to_bszB is for calculating the usable size of a malloc-ed
> block, not used for a mmap-ed superblock.
>
> > I assume not, because the numbers from profile-heap don't add up
> > otherwise. Below is an example just before the tool died:
> I think you are confusing malloc-ed blocks with mmap-ed superblocks.
>

Ah yes, sorry, I got confused when reading the code. Every malloc does
not necessarily lead to a new superblock. In my case, just the requests
for a Secondary map entry necessarily resulted in a new superblock. A
quick calculation, 7512 sec map entries * (4MB - 2.5MB) = 11GB is very
close to the difference I see between VMSpace usage and allocated bytes
(29G - 18G).

> > --26681-- tool : 31516000256/31516000256 max/curr mmap'd, 0/0
> > unsplit/split sb unmmap'd, 19706049824/19706049824 max/curr,
> > 20929/19706065264 totalloc-blocks/bytes, 28217271 searches
> totalloc is the sum of all blocks that have been allocated, some of
> these blocks have been freed (very few in your case).
> >
> >
> > -------- Arena "tool": 31516000256/31516000256 max/curr mmap'd, 0/0
> > unsplit/split sb unmmap'd, 19706049824/19706049824 max/curr on_loan
> > --------
> >             16 in     1: sched_lock
> >             64 in     1: cl.threads.nes.1
> >             80 in     1: cl.init_funcarray.sys.3
> >             80 in     1: cl.clo.nf.1
> >             96 in     1: cl.dump.init_dumps.1
> >             96 in     2: hashtable.Hc.1
> >            112 in     1: options.efn.1
> >            128 in     1: cl.dump.init_dumps.2
> >            144 in     5: cl.events.group.1
> >            160 in     2: commandline.sua.3
> >            160 in     1: cl.events.geMapping.1
> >            208 in     1: cl.init_funcarray.sys.4
> >            208 in     2: commandline.sua.2
> >            208 in     1: cl.threads.nt.1
> >            256 in     1: initimg-linux.sce.5
> >            400 in     7: cl.fn.non.2
> >            448 in     7: cl.events.eventset.1
> >            592 in     2: cl.clo.nc.1
> >            640 in    13: cl.fn.nfn.2
> >          3,840 in     8: cl.fn.non.1
> >          4,000 in     1: cl.context.ifs.1
> >          5,728 in   223: cl.fn.nfnnd.2
> >          9,568 in    13: cl.fn.nfn.1
> >         10,368 in   162: cl.funcinfo.gc.1
> >         12,320 in     2: hashtable.Hc.2
> >         17,840 in   223: cl.fn.nfnnd.1
> >         20,304 in     1: cl.context.ict.1
> >         28,000 in     1: cl.callstack.ics.1
> >         29,824 in   466: cl.jumps.nj.1
> >         35,504 in     1: cl.jumps.ijh.1
> >         40,288 in     1: cl.fn.gfe.1
> >         58,704 in 3,667: cl.bbcc.nr.1
> >         67,504 in     1: cl.bb.ibh.1
> >         80,080 in     1: cl.drwinit_thread.gc.1
> >         80,080 in     1: cl.init_funcarray.sys.2
> >         83,504 in     1: cl.bbcc.ibh.1
> >        240,000 in     1: cl.init_funcarray.sys.1
> >        259,200 in   162: cl.funccontext.gc.1
> >        467,376 in 2,053: cl.bb.nb.1
> >        475,072 in   571: cl.context.nc.1
> >        521,712 in 3,752: cl.bbcc.nb.1
> >      1,581,056 in     3: cl.sim.cs_ic.1
> >      1,600,064 in     2: cl.costs.gc.1
> >      2,621,440 in     1: cl.init_funcarray.sm.1
> >      5,435,072 in 2,016: cl.funcinst.gc.1
> > 19,692,257,280 in 7,512: cl.copy_for_writing.sm.1 -----> Secondary Map allocations
> >
> >
> > As we can see, for the tool the "totalloc-blocks" are 20929, with its
> > bytes at about 18.3G.
> totalloc-blocks are malloc-ed blocks, not mmap-ed superblocks.
>
> > 20929 * 4MB is a very large number and much greater than even 29G
> > which is the reported size of the entire "tool" arena.
> > Looking at the details for the "tool" arena and counting the total
> > number of allocations, I see 20897 allocations, almost the same as
> > 20929 shown at the top. The allocations are dominated by my Secondary
> > Map. I'm guessing that fixing the size of the Secondary Maps to
> > something that 'aligns' well with 4MB (2.5MB does pretty terribly)
> > will reduce the mismatch between VMSpace and counted bytes allocated.
> > Do you concur?
> Yes these 2.5 MB blocks are very probably the problem.
> So, either you choose a better size, or alternatively rather than using
> VG_(malloc), you use mmap. See pub_tool_aspacemgr.h about how to get
> memory with mmap, for example you could call VG_(am_shadow_alloc)
>

Thanks very much. Any special considerations when using Valgrind's mmap?
I'll try a few things and report if this solves the problem.

> Philippe
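
For reference, the back-of-the-envelope estimate in this thread can be checked with a short sketch. It assumes, as a simplification, that each Secondary Map request of 2,621,440 bytes (19,692,257,280 / 7,512 in the profile-heap output) ends up in its own 4 MiB superblock, and it ignores allocator metadata. The function name `superblock_waste` and the constants are illustrative only and are not part of Valgrind.

```python
MIB = 1024 * 1024

# Assumption: minimum superblock size of 4 MiB, as described in the thread.
SUPERBLOCK_BYTES = 4 * MIB

# Figures copied from the profile-heap output quoted in the thread.
SEC_MAP_COUNT = 7512
SEC_MAP_TOTAL_BYTES = 19_692_257_280
MMAPPED_BYTES = 31_516_000_256
ON_LOAN_BYTES = 19_706_049_824


def superblock_waste(n_blocks, payload_bytes, superblock_bytes=SUPERBLOCK_BYTES):
    """Estimate bytes mmap'd but not on loan when every payload gets its
    own superblock (hypothetical model, allocator overhead ignored)."""
    per_block = max(superblock_bytes, payload_bytes)
    return n_blocks * (per_block - payload_bytes)


# Size of one Secondary Map entry, derived from the totals above.
sec_map_bytes = SEC_MAP_TOTAL_BYTES // SEC_MAP_COUNT

waste = superblock_waste(SEC_MAP_COUNT, sec_map_bytes)
observed_gap = MMAPPED_BYTES - ON_LOAN_BYTES

print(f"per-entry size : {sec_map_bytes / MIB:.2f} MiB")
print(f"estimated waste: {waste / 1024**3:.2f} GiB")
print(f"observed gap   : {observed_gap / 1024**3:.2f} GiB")
```

Under these assumptions the estimate comes to roughly 11 GiB and lands within a few MiB of the gap between mmap'd and on_loan bytes, which is consistent with the 2.5MB blocks being the main source of the mismatch.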