|
From: 王阳 <412...@qq...> - 2015-06-16 10:53:54
|
Hi Philippe,
>The best way to understand where valgrind/helgrind spends
>memory is to use --stats=yes and post the result here.
>Really, it is really the best way :).
>If it takes too long to reach the end (or it crashes
>before producing the stats),
>you can from the command line use
> vgdb v.info stats
>to get the needed info while helgrind is running.
I cannot post the log directly, so I copied it and am writing it here. It is as follows (run with --stats=yes, valgrind 3.10.1):
......
--8146-- libhb: EvM GC: delete generations 129 and below, retaining 505211 entries
--8146-- libhb: EvM GC: delete generations 139 and below, retaining 526605 entries
--8146-- libhb: EvM GC: delete generations 149 and below, retaining 520572 entries
--8146-- libhb: EvM GC: delete generations 159 and below, retaining 504065 entries
--8146-- univ_laog_do_GC enter cardinality 31
--8146-- univ_laog_do_GC exit seen 24 next gc at cardinality 32
--8146-- univ_laog_do_GC enter cardinality 33
--8146-- univ_laog_do_GC exit seen 30 next gc at cardinality 33
....
--8146-- univ_laog_do_GC enter cardinality 61779
--8146-- univ_laog_do_GC exit seen 41326 next gc at cardinality 61780
--8146-- univ_laog_do_GC enter cardinality 61780
--8146-- VALGRIND INTERNAL ERROR: Valgrind received a signal 11 (SIGSEGV) - exiting
--8146-- si_code = 1; Faulting address: 0x8031AD000; sp: 0x80317db70
valgrind: the 'impossible' happened:
Killed by fatal signal
host stacktrace:
==8146== at 0x3802D180: reclaimSuperblock (m_mallocfree.c:918)
==8146== by 0x3802F2D0: deferred_reclaimSuperblock (m_mallocfree.c:1939)
==8146== by 0x3800CA22: delete_WV (hg_wordset.c:204)
==8146== by 0x3800EC54: vgHelgrind_dieWS (hg_wordset.c:487)
==8146== by 0x38005816: univ_laog_do_GC (hg_main.c:3454)

Hi Philippe,
>memory is to use --stats=yes and post the result here.
>Really, it is really the best way :).
I cannot export the log from my PC, because the PC is isolated from the internet. Can I send you a photo of the log, or pick some key information out of the log for you? By the way, the size of an email is limited to 40KB, am I right? So posting a photo may not work out.

> Hi Philippe,
> I read the valgrind user manual, and there are some hints which are related
> to my problem, I guess. As follows:
> Myprog uses tons of mmap and a memory pool, and does not use free/delete
> to give memory back to the pool.
> My question is: if myprog uses a memory pool without free/delete and
> does not use VALGRIND_HG_CLEAN_MEMORY, will that lead to myprog doing mmap
> endlessly until it uses 64G of memory?
No, HG_CLEAN_MEMORY is not useful to use less memory.
It is only useful if you recycle memory, and you get false
positive race errors due to this recycling.
The best way to understand where valgrind/helgrind spends
memory is to use --stats=yes and post the result here.
Really, it is really the best way :).
If it takes too long to reach the end (or it crashes
before producing the stats),
you can from the command line use
  vgdb v.info stats
to get the needed info while helgrind is running.
Philippe |
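As background on the VALGRIND_HG_CLEAN_MEMORY point above: a minimal sketch of where the client request fits in a recycling pool, assuming the valgrind development headers are installed; block_t and pool_recycle are hypothetical names, not code from this thread. As Philippe says, the request only clears Helgrind's access history for a range (to avoid false races on recycled memory); it does not reduce memory use.

/* Minimal sketch of a hypothetical pool helper, not the program discussed here. */
#include <stdlib.h>
#include <valgrind/helgrind.h>   /* VALGRIND_HG_CLEAN_MEMORY */

typedef struct { void *mem; size_t size; } block_t;   /* hypothetical */

static void pool_recycle(block_t *b)
{
    /* Forget the access history of this range before the pool hands the
       block to an unrelated thread again. */
    VALGRIND_HG_CLEAN_MEMORY(b->mem, b->size);
    /* ... push b back onto the pool's free list (not shown) ... */
}

int main(void)
{
    block_t b = { malloc(4096), 4096 };
    pool_recycle(&b);
    free(b.mem);
    return 0;
}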
|
From: 王阳 <412...@qq...> - 2015-06-18 06:32:01
|
Hi Philippe,
>If --track-lockorders=no does not solve the problem,
Yes, it does not work; the rate of memory consumption is even higher, and it breaks the 64G limit quickly.
>can you then re-run with
>and post the result ?
I have run it twice with two different sets of options. I am sending the results in two emails because of the email size limit.
The result is as follows:
1.--tool=helgrind --gen-suppressions=all --stats=yes --profile-heap=yes --track-lockorders=no --read-var-info=yes --trace-children=yes --error-limit=no --log-file=xxx.log myprog
sending command v.info stats to pid 19749
--19749-- translate: fast SP updates identified: 0 ( 0.0%)
--19749-- translate: generic_known SP updates identified: 4,510 ( 93.6%)
--19749-- translate: generic_unknown SP updates identified: 306 ( 6.3%)
--19749-- tt/tc: 2,095,227 tt lookups requiring 2,141,230 probes
--19749-- tt/tc: 2,072,199 fast-cache updates, 6 flushes
--19749-- transtab: new 30,619 (860,271 -> 7,678,917; ratio 89:10) [0 scs]
--19749-- transtab: dumped 0 (0 -> ??)
--19749-- transtab: discarded 18 (594 -> ??)
--19749-- scheduler: 265,075,965 event checks.
--19749-- scheduler: 164,155,368 indir transfers, 1,638,407 misses (1 in 100)
--19749-- scheduler: 2,649/21,359,755 major/minor sched events.
--19749-- sanity: 2655 cheap, 54 expensive checks.
--19749-- exectx: 12,289 lists, 10,469 contexts (avg 0 per list)
--19749-- exectx: 9,270,498 searches, 9,276,450 full compares (1,000 per 1000)
--19749-- exectx: 0 cmp2, 8,581 cmp4, 0 cmpAll
--19749-- errormgr: 3 supplist searches, 55 comparisons during search
--19749-- errormgr: 4,586 errlist searches, 8,581 comparisons during search
WordSet "univ_lsets":
addTo 2057968 (113892 uncached)
delFrom 1831092 (228 uncached)
union 0
intersect 1 (0 uncached) [nb. incl isSubsetOf]
minus 0 (0 uncached)
elem 0
doubleton 0
isEmpty 0
isSingleton 0
anyElementOf 0
isSubsetOf 1
dieWS 0
locksets: 113,832 unique lock sets
LockN-to-P map: 1 queries (1 map size)
string table map: 0 queries (0 map size)
locks: 1,029,288 acquires, 915,546 releases
sanity checks: 1
<<< BEGIN libhb stats >>>
secmaps: 71,826 allocd ( 588,398,592 g-a-range)
linesZ: 9,193,728 allocd ( 441,298,944 bytes occupied)
linesF: 422,865 allocd ( 219,889,800 bytes occupied)
secmaps: 0 iterator steppings
secmaps: 34,955,464 searches ( 11,249,487 slow)
cache: 1,317,086,773 totrefs (14,233,749 misses)
cache: 13,278,898 Z-fetch, 954,851 F-fetch
cache: 12,856,393 Z-wback, 1,311,820 F-wback
cache: 19 invals, 18 flushes
cache: 6,553,527,149 arange_New 1,782,704,320 direct-to-Zreps
cline: 14,233,749 normalises
cline: c rds 8/4/2/1: 110,690,124 16,459,300 4,482,262 56,304,543
cline: c wrs 8/4/2/1: 272,628,312 29,239,745 17,673,480 39,104,868
cline: s wrs 8/4/2/1: 765,271,067 1,532,438 2,226,925 1,414,393
cline: s rd1s 63,510, s copy1s 63,510
cline: splits: 8to4 6,452,847 4to2 7,537,034 2to1 7,914,706
cline: pulldowns: 8to4 25,768,693 4to2 17,244,662 2to1 24,124,885
libhb: 183,405,289 msmcread (123,158,943 dragovers)
libhb: 341,272,758 msmcwrite (47,503,666 dragovers)
libhb: 160,362,776 cmpLEQ queries (12,643,035 misses)
libhb: 124,995,684 join2 queries (6,250,924 misses)
libhb: VTSops: tick 1,831,132, join 6,250,924, cmpLEQ 12,643,035
libhb: VTSops: cmp_structural 191,246,876 (176,949,227 slow)
libhb: VTSset: find__or__clone_and_add 8,082,057 (925,673 allocd)
libhb: VTSops: indexAt_SLOW 6
libhb: 673314 entries in vts_table (approximately 16159536 bytes)
libhb: 673314 entries in vts_set
libhb: ctxt__rcdec: 1=160967522(38314009 eq), 2=9, 3=8793505
libhb: ctxt__rcdec: calls 169761036, discards 0
libhb: contextTab: 196613 slots, 3466 max ents
libhb: contextTab: 170662609 queries, 174681088 cmps
<<< END libhb stats >>>
sending command v.info memory aspacemgr to pid 19749
58646061056 bytes have already been allocated.
--19749-- core : 57180975104/57180975104 max/curr mmap'd, 22/23 unsplit/split sb unmmap'd, 52561781768/52561765192 max/curr, 7522704/55692746552 totalloc-blocks/bytes, 305297631 searches 8 rzB
--19749-- dinfo : 580026368/499212288 max/curr mmap'd, 51/31 unsplit/split sb unmmap'd, 424546576/167730720 max/curr, 12146404/1396066272 totalloc-blocks/bytes, 12735394 searches 8 rzB
--19749-- client : 402145280/402145280 max/curr mmap'd, 2/2 unsplit/split sb unmmap'd, 189697952/188416512 max/curr, 7974243/ 558505808 totalloc-blocks/bytes, 7974428 searches 24 rzB
--19749-- demangle: 0/ 0 max/curr mmap'd, 0/0 unsplit/split sb unmmap'd, 0/ 0 max/curr, 0/ 0 totalloc-blocks/bytes, 0 searches 8 rzB
--19749-- ttaux : 1425408/ 987136 max/curr mmap'd, 5/0 unsplit/split sb unmmap'd, 1303280/ 900224 max/curr, 3370/ 2522400 totalloc-blocks/bytes, 3360 searches 8 rzB
-------- Arena "core": 57180975104/57180975104 max/curr mmap'd, 22/23 unsplit/split sb unmmap'd, 52561782008/52561765192 max/curr on_loan 8 rzB --------
51,760,182,272 in 341,498: hg.ids.4
219,921,328 in 32,858: libhb.aFfw.1 (LineF storage)
120,972,208 in 57,714: libhb.event_map_init.4 (oldref tree)
104,788,896 in 2,622: hg_malloc_metadata_pool
96,104,640 in 1,004: libhb.event_map_init.3 (OldRef pools)
56,755,872 in 341,226: hg.lNaw.1
55,645,808 in 538: libhb.event_map_init.1 (RCEC pools)
32,779,504 in 673,315: libhb.vts_tab__do_GC.new_set
25,169,840 in 1: hashtable.resize.1
20,995,456 in 262,371: hg.mk_Lock.1
18,970,400 in 576,604: libhb.vts_set_focaa.1
16,801,792 in 2: libhb.vts_tab__do_GC.new_tab
12,603,904 in 262,372: hg.ids.2
8,412,208 in 262,398: libhb.SO__Alloc.1
3,470,672 in 71,827: libhb.zsm_init.1 (map_shmem)
3,216,640 in 96,710: libhb.vts_tab__do_GC.new_vts
2,097,168 in 1: libhb.libhb_init.1
1,572,912 in 1: libhb.event_map_init.2 (context table)
1,080,408 in 10,469: perm_malloc
98,320 in 1: execontext.reh1
65,536 in 4: libhb.Thr__new.2
36,896 in 22: gdbsrv
6,160 in 1: hashtable.Hc.2
4,000 in 1: hg.ids.1
2,240 in 28: errormgr.losf.1
1,472 in 22: hg.mctCI.1
1,072 in 28: errormgr.losf.2
992 in 39: errormgr.sLTy.1
832 in 28: errormgr.losf.4
768 in 3: errormgr.mre.3
704 in 21: hg.mctCloa.1
608 in 4: hg.mstSs.1
576 in 4: hg.mpttT.1
480 in 2: hg.mLPfLN.1
288 in 6: hg.pSfs.1
256 in 4: hg.mk_Thread.1
240 in 3: errormgr.mre.1
192 in 6: errormgr.sLTy.2
192 in 4: libhb.Thr__new.1
160 in 2: commandline.sua.2
144 in 1: initimg-linux.sce.5
144 in 1: m_cache
128 in 2: libhb.Thr__new.4
128 in 4: stacks.rs.1
128 in 2: commandline.sua.3
128 in 2: hashtable.Hc.1
96 in 6: hg.eWSiLPa
80 in 1: hg.mLPfLN.2
80 in 1: gdbserved_watches
80 in 1: libhb.verydead_thread_table_init.1
64 in 1: main.mpclo.3
64 in 1: options.efn.4
16 in 1: sched_lock
-------- Arena "dinfo": 580026368/499212288 max/curr mmap'd, 51/31 unsplit/split sb unmmap'd, 424546576/167730720 max/curr on_loan 8 rzB --------
51,720,032 in 609,243: di.readdwarf3.mgGX.2
41,726,624 in 98,768: di.storage.avta.2
17,505,424 in 5: di.storage.addInl.1
12,078,432 in 6: di.storage.addLoc.1
7,079,824 in 12: di.readdwarf3.ndrw.4
6,725,712 in 12: di.readdwarf3.ndrw.7 (TyEnt to-keep array)
5,241,120 in 95,776: di.storage.addVar.2
4,713,904 in 50,423: di.storage.avta.1
4,579,296 in 131: di.storage.addStr.1
3,326,080 in 13,086: di.readdwarf3.ptD.struct_type.2
3,121,120 in 96,105: di.readdwarf3.msGX.1
2,512,000 in 16: di.storage.addSym.1
2,186,032 in 29,895: di.readdwarf3.pTD.struct_type.1
1,608,768 in 16: di.storage.finCfSI.1
1,355,216 in 6: di.storage.addLoc.2
485,504 in 25,397: di.readdwarf3.ptD.member.1
348,864 in 4: di.ccCt.2
333,488 in 16: di.storage.finCfSI.2
220,016 in 8,539: di.readdwarf3.pTD.enumerator.1
148,816 in 1,306: di.readdwarf3.ptD.enum_type.1
129,152 in 2,430: di.readdwarf3.ptD.array_type.1
105,808 in 64: di.storage.DiCfSI_m_pool
103,968 in 4,922: di.readdwarf3.ptD.typedef.1
86,224 in 4,062: di.storage.aDntf.1
83,344 in 5,209: di.readdwarf3.ptD.member.2
72,320 in 63: di.storage.addVar.3
51,456 in 24: di.storage.addFnDn.1
19,776 in 209: redir.rnnD.1
14,336 in 16: di.debuginfo.aDI.1
13,760 in 140: redir.ri.1
12,816 in 400: di.readdwarf3.pTD.enum_type.3
5,344 in 253: di.readdwarf3.pTD.enum_type.2
4,608 in 209: redir.rnnD.3
3,440 in 209: redir.rnnD.2
2,784 in 32: di.debuginfo.aDI.3
1,776 in 86: di.readdwarf3.ptD.base_type.1
1,232 in 12: di.storage.addVar.1
640 in 16: di.debuginfo.aDI.2
512 in 16: redir.rnnD.4
448 in 2: di.ccCt.1
416 in 17: di.readdwarf3.ptD.member.3
224 in 12: di.redi.1
64 in 4: di.redi.2
-------- Arena "client": 402145280/402145280 max/curr mmap'd, 2/2 unsplit/split sb unmmap'd, 189697952/188416512 max/curr on_loan 24 rzB --------
188,416,512 in 2,618,098: replacemalloc.cm.1
-------- Arena "demangle": 0/0 max/curr mmap'd, 0/0 unsplit/split sb unmmap'd, 0/0 max/curr on_loan 8 rzB --------
-------- Arena "ttaux": 1425408/987136 max/curr mmap'd, 5/0 unsplit/split sb unmmap'd, 1303280/900224 max/curr on_loan 8 rzB --------
659,456 in 2: transtab.initialiseSector(host_extents)
116,144 in 656: transtab.IEA__add
106,864 in 257: transtab.aECN.1
17,760 in 230: transtab.OEA__add |
|
From: Philippe W. <phi...@sk...> - 2015-06-16 19:35:17
|
On Tue, 2015-06-16 at 18:53 +0800, 王阳 wrote:
> --8146-- univ_laog_do_GC exit seen 41326 next gc at cardinality 61780
> --8146-- univ_laog_do_GC enter cardinality 61780
> --8146-- VALGRIND INTERNAL ERROR: Valgrind received a signal 11
> (SIGSEGV) - exiting
> --8146-- si_code = 1; Faulting address: 0x8031AD000; sp: 0x80317db70
> valgrind: the 'impossible' happened:
> Killed by fatal signal
The above means Valgrind crashed, for an undetermined reason
(either an internal bug in Valgrind, or alternatively, a bug
in your application that corrupts the memory).
Is your application 'memcheck-clean'?
Before fixing the above, what you can do then is to capture
the statistics before it crashes, using from a shell:
  vgdb v.info stats
and post the resulting output
(do the above when your application has already consumed a significant
memory, but before it has crashed :).
Philippe |
|
From: 王阳 <412...@qq...> - 2015-06-17 12:41:08
|
Hi Philippe,
>Is your application 'memcheck-clean' ?
Yes, it is. By the way, myprog uses 5GB of memory without Valgrind, 10GB with memcheck, and over 64GB with helgrind. Under memcheck myprog runs correctly, but under helgrind it crashes as you described.
>Before fixing the above, what you can do then is to capture
>the statistics before it crashes, using from a shell:
> vgdb v.info stats
>and post the resulting output
>(do the above when your application has already consumed a significant
>memory, but before it has crashed :).
I took it when myprog under valgrind had used 29.2GB of memory; I guess that is significant enough.
sending command v.info stats to pid 28712
--28712-- translate: fast SP updates identified: 0 (0.0%)
--28712-- translate: generic_known SP updates identified: 4,510 (93.6%)
--28712-- translate: generic_unknown SP updates identified: 306 (6.3%)
--28712-- tt/tc: 2,095,337 tt lookups requiring 2,141,437 probes
--28712-- tt/tc: 2,072,309 fast-cache updates, 6 flushes
--28712-- transtab: new 30,620 (860,393 -> 7,679,672; ratio 89:10) [0 scs]
--28712-- transtab: dumped 0 (0 -> ??)
--28712-- transtab: discarded 18 (594 -> ??)
--28712-- scheduler: 264,186,172 event checks.
--28712-- scheduler: 163,765,532 indir transfers, 1,638,538 misses (1 in 99)
--28712-- scheduler: 2,641/21,065,894 major/minor sched events.
--28712-- sanity: 2644 cheap, 53 expensive checks.
--28712-- exectx: 12,289 lists, 10,468 contexts (avg 0 per list)
--28712-- exectx: 9,172,551 searches, 9,179,201 full compares (1,000 per 1000)
--28712-- exectx: 0 cmp2, 8,734 cmp4, 0 cmpAll
--28712-- errormgr: 3 supplist searches, 55 comparisons during search
--28712-- errormgr: 4,705 errlist searches, 8,734 comparisons during search
Wordset " univ_lsets ":
addTo 1861656(15667 uncached)
delForm 1831214(228 uncached)
union 0
intersect 1(0 uncached) [nb. incl isSubsetOf]
minus 0 (0 uncached)
elem 947507
doubleton 0
isEmpty 931132
isSingleton 0
anyElemtentOf 0
ifsubsetof 1
dieWS 0
Wordset "univ_laog":
addTo 240996259 (240994769 uncached)
delFrom 10 (10 uncached)
union 0
intersect 0 (0 uncached) [nb. incl isSubsetOf]
minus 0
elem 0
doubleton 15559
isEmpty 0
isSingleton 0
anyElementOf 0
isSubsetOf 0
dieWS 240966779
locksets: 15,615 unique lock sets
univ_laog: 46,445 unique lock sets
LockN-to-P map: 1 queries (1 map size)
string table map: 0 queries (0 map size)
LAOG: 15,557 map size
LAOG exposition: 120,505,100 map size
locks: 931,132 acquires, 915,607 releases
sanity checks:
<<< BEGIN libhb stats >>>
secmaps: 66,706 allocd (546,455,552 g-a-range)
linesZ: 8,538,368 allocd (409,841,664 bytes occupied)
linesF: 531,515 allocd (276,387,800 bytes occupied)
secmaps: 0 iterator steppings
secmaps: 34,659,884 searches (10,913,136 slow)
cache: 1,312,567,566 totrefs (14,120,124 misses)
cache: 13,092,998 Z-fetch, 1,027,126 F-fetch
cache: 12,669,322 Z-wback, 1,385,266 F-wback
cache: 19 invals, 18 flushes
cache: 6,522,063,696 arange_New 1,770,286,400 direct-to-Zreps
cline: 14,120,124 normalises
cline: c rds 8/4/2/1: 110,939,011 16,127,798 4,417,962 56,259,749
cline: c wrs 8/4/2/1: 271,897,522 29,052,560 17,694,752 39,060,366
cline: s wrs 8/4/2/1: 761,884,776 1,532,437 2,226,925 1,414,392
cline: s rd1s 63,510, s copy1s 63,510
cline: splits: 8to4 6,458,760 4to2 7,524,761 2to1 7,814,224
cline: pulldowns: 8to4 25,582,382 4to2 17,336,120 2to1 24,085,196
libhb: 183,323,453 msmcread (122,877,511 dragovers)
libhb: 340,328,522 msmcwrite (47,347,878 dragovers)
libhb: 159,883,155 cmpLEQ queries (12,644,655 misses)
libhb: 124,714,476 join2 queries (6,251,276 misses)
libhb: VTSops: tick 1,831,254, join 6,251,276, cmpLEQ 12,644,655
libhb: VTSops: cmp_structural 195,342,726 (176,646,626 slow)
libhb: VTSset: find__or__clone_and_add 8,082,531 (925,969 allocd)
libhb: VTSops: indexAt_SLOW 6
libhb: 679272 entries in vts_table (approximately 16302528 bytes)
libhb: 679272 entries in vts_set
libhb: ctxt__rcdec: 1=161037815(38320474 eq), 2=9, 3=8292043
libhb: ctxt__rcdec: calls 169329867, discards 0
libhb: contextTab: 196613 slots, 258461 max ents
libhb: contextTab: 170225389 queries, 173948778 cmps
<<< END libhb stats >>>
|
From: Philippe W. <phi...@sk...> - 2015-06-17 20:49:06
|
On Wed, 2015-06-17 at 20:25 +0800, 王阳 wrote:
> >Before fixing the above, what you can do then is to capture
> >the statistics before it crashes, using from a shell:
> > vgdb v.info stats
> >and post the resulting output
> >(do the above when your application has already consumed a significant
> >memory, but before it has crashed :).
> I take it when myprog under valgrind used 29.2GB memory, I guess it is
> significant enough.
Yes, this is for sure ok.

> Wordset "univ_laog":
> addTo 240996259 (240994769 uncached)
> delFrom 10 (10 uncached)
This (and the LAOG exposition size below) is a possible candidate
for the big memory use.
Can you try to run with --track-lockorders=no so as to disable
the LAOG algorithm?

> union 0
> intersect 0 (0 uncached) [nb. incl isSubsetOf]
> minus 0
> elem 0
> doubleton 15559
> isEmpty 0
> isSingleton 0
> anyElementOf 0
> isSubsetOf 0
> dieWS 240966779
>
> locksets: 15,615 unique lock sets
> univ_laog: 46,445 unique lock sets
> LockN-to-P map: 1 queries (1 map size)
> string table map: 0 queries (0 map size)
> LAOG: 15,557 map size
> LAOG exposition: 120,505,100 map size
This LAOG exposition seems huge.

> locks: 931,132 acquires, 915,607 releases
> sanity checks:
>
> <<< BEGIN libhb stats >>>
> secmaps: 66,706 allocd (546,455,552 g-a-range)
This means that your own memory allocation is not huge
(slightly more than 0.5GB).

If --track-lockorders=no does not solve the problem, can you then re-run with
  --stats=yes --profile-heap=yes
and while it runs (and has consumed already significant memory), do
  vgdb -c v.info stats -c v.info memory aspacemgr
and post the result?

Thanks
Philippe |
|
From: Philippe W. <phi...@sk...> - 2015-06-18 19:52:08
|
On Thu, 2015-06-18 at 14:20 +0800, 王阳 wrote:
> I have done it two times with different options series. The result is
> as follows:
> 1.--tool=helgrind --gen-suppressions=all --stats=yes
> --profile-heap=yes --track-lockorders=no --read-var-info=yes
> --trace-children=yes --error-limit=no --log-file=xxx.log myprog
Ok, the origin of the memory is related to the locks/nr of locks
your application creates and/or locks/keeps locked.
About 50Gb of memory is consumed for hg.ids.4.
> 51,760,182,272 in 341,498: hg.ids.4
This is the 'universe of locksets' : ie. all the locksets that
helgrind has ever seen.
It looks like your application uses a lot of locks
and keeps a lot of locks in a status 'locked' shown by:
> locks: 1,029,288 acquires, 915,546 releases
From the above, I guess your application has thousands
of locks, and that each thread has sometimes hundreds
of locks held.
Well, if that is the case, I think helgrind data structure
is not done for such a behaviour :(
There is no garbage collection of the lock set universe.
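A rough back-of-the-envelope check of that figure, assuming each stored lockset is a vector of roughly 8-byte lock IDs plus allocator overhead: 51,760,182,272 bytes over 341,498 locksets is about 151 KB per lockset, i.e. on the order of 19,000 lock IDs per set on average (somewhat fewer once overhead is subtracted). Since a new lockset is stored every time the set of locks a thread holds changes, and nothing is ever freed, holding tens of thousands of locks at once multiplies out to tens of gigabytes.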
At this point, I think there are 2 possible approaches:
* implement in helgrind a garbage collection of the univ_lsets
(unclear if this is implementable, and for sure not trivial)
* you change the limits of Valgrind max memory, to e.g. go
to 128G or 256G.
You might refer for that to the revision r13278, which increased
from 32G to 64G. That gives the various constants to multiply by 2
or 4 to go to 128G or 256G.
So, checkout the SVN version, then do
svn diff -r13277:13278
and follow that pattern to increase to 128 or 256G
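For orientation, a rough sketch of the kind of change involved. It assumes the 64G ceiling comes from aspacem_maxAddr in coregrind/m_aspacemgr/aspacemgr-linux.c; that location and the values below are assumptions to verify against the actual diff, which may touch several constants.

/* Sketch only, not the actual r13278 diff; check `svn diff -r13277:13278`.
   Assumed location: coregrind/m_aspacemgr/aspacemgr-linux.c, VG_(am_startup). */
#if VG_WORDSIZE == 8
   /* 3.10.1 is assumed to use 0x1000000000-1 (64G), raised by r13278 from
      0x800000000-1 (32G).  Following the same pattern, 256G would be: */
   aspacem_maxAddr = (Addr)0x4000000000ULL - 1;
#endif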
Philippe
|
|
From: 王阳 <412...@qq...> - 2015-06-19 07:55:50
|
Hi Philippe,
Thank you for your reply.
>Ok, the origin of the memory is related to the locks/nr of locks
>your application creates and/or locks/keeps locked.
>About 50Gb of memory is consumed for hg.ids.4.
What is hg.ids.4? Is it the memory helgrind uses for analysing locks?
I know that a single lock cannot cost much memory; how much memory does helgrind need to analyse one lock?
It is unbelievable that helgrind would use 50GB for 1,029,288 locks.
>From the above, I guess your application has thousands
>of locks, and that each thread has sometimes hundreds
>of locks held.
Myprog creates about 4096*64 locks; sometimes myprog will lock almost all of them at one time, and release them later.
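To make that pattern concrete, here is a purely illustrative pthread sketch (hypothetical code, loosely based on the 4096*64 figure above) of the behaviour that inflates the lockset universe: every change to the set of locks a thread currently holds creates one more lockset that helgrind stores and, as noted below, never garbage collects.

#include <pthread.h>

#define N_LOCKS (4096 * 64)   /* order of magnitude mentioned above */
static pthread_mutex_t locks[N_LOCKS];

void lock_everything(void)
{
    /* Each acquire changes the set of locks this thread holds, so Helgrind
       records a brand-new lockset of size 1, 2, 3, ...  All of them stay
       in univ_lsets (the hg.ids.4 allocations). */
    for (long i = 0; i < N_LOCKS; i++)
        pthread_mutex_lock(&locks[i]);

    /* ... work while holding (almost) all locks ... */

    for (long i = N_LOCKS - 1; i >= 0; i--)
        pthread_mutex_unlock(&locks[i]);
}

int main(void)
{
    for (long i = 0; i < N_LOCKS; i++)
        pthread_mutex_init(&locks[i], NULL);
    lock_everything();
    return 0;
}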
>There is no garbage collection of the lock set universe.
You mean the big memory allocated for analysing locks cannot be reused/recycled by valgrind?
>At this point, I think there are 2 possible approaches:
> * implement in helgrind a garbage collection of the univ_lsets
> (unclear if this is implementable, and for sure not trivial)
> * you change the limits of Valgrind max memory, to e.g. go
> to 128G or 256G.
Both ways are quite difficult for me. I want to try the second way, but I need your help.
Can you give me a patch for 3.10.1 to change the limit of Valgrind max memory to 256G? Thanks.
|