|
From: Christoph B. <bar...@or...> - 2007-07-27 12:55:13
|
Hi,

I was quite confident with my version of valgrind. But when I run another big program I often see that more than 30% of runtime goes to vgPlain_search_transtab.

Is there a document that describes the purpose and machinery behind this function?

Christoph
|
|
From: Konstantin S. <kon...@gm...> - 2011-01-13 08:47:37
|
Hi,
I am running one large test (chrome browser on a heavy JS page) under
Memcheck and ThreadSanitizer.
The profile for ThreadSanitizer process looks like this:
151192 56.6740 tsan-amd64-linux tsan-amd64-linux vgPlain_search_transtab
10702 4.0116 tsan-amd64-linux tsan-amd64-linux vgPlain_run_innerloop__dispatch_unprofiled
9741 3.6514 tsan-amd64-linux tsan-amd64-linux vgPlain_discard_translations
Most of the time is spent in the inner loop in vgPlain_search_transtab
6 0.0024 : 3807fd75: cltq
33 0.0133 : 3807fd77: mov 0x4241ba(%rip),%rcx # 384a3f38 <n_lookup_probes>
8 0.0032 : 3807fd7e: imul $0x1030,%rax,%rax
57 0.0230 : 3807fd85: lea 0xfff1(%rcx),%rbp
: 3807fd8c: mov 0x384a3fa8(%rax),%rbx
175 0.0706 : 3807fd93: mov %edx,%eax
10 0.0040 : 3807fd95: jmp 3807fdb7 <vgPlain_search_transtab+0xa7>
: 3807fd97: nopw 0x0(%rax,%rax,1)
3992 1.6106 : 3807fda0: cmp %rsi,0x18(%r10)
27213 10.9791 : 3807fda4: je 3807fe00 <vgPlain_search_transtab+0xf0>
7304 2.9468 : 3807fda6: add $0x1,%eax
641 0.2586 : 3807fda9: cmp $0xfff1,%eax
1485 0.5991 : 3807fdae: cmove %r12d,%eax
7334 2.9589 : 3807fdb2: cmp %rbp,%rcx
420 0.1694 : 3807fdb5: je 3807fde0 <vgPlain_search_transtab+0xd0>
2269 0.9154 : 3807fdb7: movslq %eax,%r10
414 0.1670 : 3807fdba: add $0x1,%rcx
5084 2.0511 : 3807fdbe: lea (%r10,%r10,4),%r11
409 0.1650 : 3807fdc2: mov %rcx,0x42416f(%rip) # 384a3f38 <n_lookup_probes>
3501 1.4125 : 3807fdc9: lea (%r10,%r11,2),%r10
697 0.2812 : 3807fdcd: lea (%rbx,%r10,8),%r10
5691 2.2960 : 3807fdd1: mov 0x8(%r10),%r11d
65972 26.6165 : 3807fdd5: test %r11d,%r11d
6220 2.5095 : 3807fdd8: je 3807fda0 <vgPlain_search_transtab+0x90>
3302 1.3322 : 3807fdda: cmp $0x2,%r11d
7211 2.9093 : 3807fdde: jne 3807fda6 <vgPlain_search_transtab+0x96>
11 0.0044 : 3807fde0: add $0x1,%r13d
40 0.0161 : 3807fde4: add $0x4,%r14
11 0.0044 : 3807fde8: cmp $0x8,%r13d
8 0.0032 : 3807fdec: jne 3807fd6d <vgPlain_search_transtab+0x5d>
Memcheck profile looks a bit less scary, but still most of the time is spent
in transtab.
34472 12.4832 memcheck-amd64-linux memcheck-amd64-linux delete_translations_in_sector_eclass
31870 11.5409 memcheck-amd64-linux memcheck-amd64-linux vgMemCheck_helperc_MAKE_STACK_UNINIT
26495 9.5945 memcheck-amd64-linux memcheck-amd64-linux vgPlain_search_transtab
26203 9.4888 memcheck-amd64-linux memcheck-amd64-linux vgPlain_discard_translations
Is there any known performance trouble in transtab when running jitted
code?
Are there any knobs one could tweak to boost transtab?
Thanks!
--kcc
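Reading the annotated loop above, it looks like a linear probe over a table of 0xfff1 (65521) slots: the index is incremented and wrapped at 0xfff1, a status word of 0 leads to a key comparison, a status of 2 ends the probe, and any other status just moves on to the next slot. The following is a toy model of that reading, not Valgrind's code. The slot state names, the `ProbeTable` class and its hash function are hypothetical and chosen only for illustration. It shows why slots that were used and later deleted still cost probes on every lookup that passes over them.

```python
# Toy open-addressed hash table with linear probing and tombstones.
# Slot states and the table size are guesses read off the disassembly,
# not taken from Valgrind's source.
IN_USE, DELETED, EMPTY = 0, 1, 2
N_SLOTS = 0xFFF1  # 65521, the wrap-around constant seen in the loop


class ProbeTable:
    def __init__(self, n_slots=N_SLOTS):
        self.n_slots = n_slots
        self.status = [EMPTY] * n_slots
        self.keys = [None] * n_slots
        self.n_probes = 0  # plays the role of the n_lookup_probes counter

    def _home(self, key):
        return key % self.n_slots  # hypothetical hash function

    def insert(self, key):
        i = self._home(key)
        for _ in range(self.n_slots):
            if self.status[i] != IN_USE:  # reuse empty or deleted slots
                self.status[i] = IN_USE
                self.keys[i] = key
                return
            i = (i + 1) % self.n_slots
        raise RuntimeError("table full")

    def delete(self, key):
        i = self._home(key)
        for _ in range(self.n_slots):
            if self.status[i] == EMPTY:
                return False
            if self.status[i] == IN_USE and self.keys[i] == key:
                self.status[i] = DELETED  # tombstone keeps probe chains intact
                return True
            i = (i + 1) % self.n_slots
        return False

    def lookup(self, key):
        i = self._home(key)
        for _ in range(self.n_slots):  # give up after one full sweep
            self.n_probes += 1
            if self.status[i] == IN_USE and self.keys[i] == key:
                return True
            if self.status[i] == EMPTY:
                return False
            i = (i + 1) % self.n_slots
        return False


# Four keys that share a home slot in a 16-slot table, then delete three.
table = ProbeTable(16)
for k in (0, 16, 32, 48):
    table.insert(k)
for k in (0, 16, 32):
    table.delete(k)

table.n_probes = 0
table.lookup(48)
print("probes to find 48 behind three tombstones:", table.n_probes)
```

In this model a table that sees many insertions and deletions accumulates tombstones, so the probe count per lookup can grow even when few live entries remain. Whether that is what happens in the runs discussed here is not established by the thread.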
|
|
From: Julian S. <js...@ac...> - 2011-01-21 15:00:26
|
I am really surprised to see this. I know that vgPlain_search_transtab
does take some time, but it's not much more than 2 or 3%. Especially
after I put in some hacks to make it cheaper, some time around 3.6.0
(not sure when).

The guest->host mapping is cached in a direct-mapped cache,
VG_(tt_fast), and VG_(search_transtab) is only used when
the cache misses. But the cache typically has a 99% hit
rate, so VG_(search_transtab) should not see much action.

The only way I can see is that you are jumping between two
pieces of code which are exactly 2^N bytes apart (for N=17,
or something like that), in the address space. Then the
cache will miss on each reference because both addresses map
to the same line and there is no associativity and no
victim cache.

Can you send the results from --stats=yes ? From that we can
see the miss rate on VG_(tt_fast) and perhaps some other
significant numbers.

J
|
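The aliasing case Julian describes can be illustrated with a small model of a direct-mapped cache. This is a sketch under stated assumptions, not the real VG_(tt_fast): the cache size, the `DirectMappedCache` class and the modulo index function are hypothetical. It only shows that two addresses mapping to the same line evict each other on every access, while two addresses on different lines do not.

```python
# Toy direct-mapped cache keyed by guest address. The number of lines
# and the index function are illustrative assumptions, not the real
# VG_(tt_fast) parameters.
N_LINES = 1 << 15


class DirectMappedCache:
    def __init__(self, n_lines=N_LINES):
        self.n_lines = n_lines
        self.tags = [None] * n_lines
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        line = addr % self.n_lines  # hypothetical index function
        if self.tags[line] == addr:
            self.hits += 1
        else:
            self.misses += 1        # the slow table search would run here
            self.tags[line] = addr  # evict whatever occupied the line


def miss_rate(addrs, n_lines=N_LINES):
    cache = DirectMappedCache(n_lines)
    for a in addrs:
        cache.access(a)
    return cache.misses / (cache.hits + cache.misses)


a = 0x400000
conflicting = [a, a + N_LINES] * 1000  # both map to the same line
friendly = [a, a + 64] * 1000          # map to different lines

print("alternating between aliased addresses:", miss_rate(conflicting))
print("alternating between distinct lines:   ", miss_rate(friendly))
```

With one entry per line and no victim cache, the aliased pair misses on every access in this model, which is the pathological pattern suggested above.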
|
From: Konstantin S. <kon...@gm...> - 2011-01-21 16:12:57
|
2011/1/21 Julian Seward <js...@ac...>
> I am really surprised to see this. I know that vgPlain_search_transtab
> does take some time, but it's not much more than 2 or 3 %. Especially
> after I put in some hacks to make it cheaper, some time around 3.6.0
> (not sure when).
>
> The guest->host mapping is cached in a direct-mapped cache,
> VG_(tt_fast), and VG_(search_transtab) is only used when
> the cache misses. But the cache typically has a 99% hit
> rate, so VG_(search_transtab) should not see much action.
>
> The only way I can see is that you are jumping between two
> pieces of code which are exactly 2^N bytes apart (for N=17,
> or something like that), in the address space.
> Then the
> cache will miss on each reference because both addresses map
> to the same line and there is no associativity and no
> victim cache.
>
> Can you send the results from --stats=yes ? From that we can
> see the miss rate on VG_(tt_fast) and perhaps some other
> significant numbers.

This?

--17273-- translate: fast SP updates identified: 0 ( --%)
--17273-- translate: generic_known SP updates identified: 0 ( --%)
--17273-- translate: generic_unknown SP updates identified: 0 ( --%)
--17273-- tt/tc: 14,941,381 tt lookups requiring 97,504,544 probes
--17273-- tt/tc: 14,941,381 fast-cache updates, 13 flushes
--17273-- transtab: new 292,328 (4,802,899 -> 45,933,795; ratio 95:10) [0 scs]
--17273-- transtab: dumped 0 (0 -> ??)
--17273-- transtab: discarded 151 (1,798 -> ??)
--17273-- scheduler: 736,878,221 jumps (bb entries).
--17273-- scheduler: 9,067/37,285,698 major/minor sched events.
--17273-- sanity: 9068 cheap, 115 expensive checks.
--17273-- exectx: 769 lists, 0 contexts (avg 0 per list)
--17273-- exectx: 0 searches, 0 full compares (0 per 1000)
--17273-- exectx: 0 cmp2, 0 cmp4, 0 cmpAll
--17273-- errormgr: 0 supplist searches, 0 comparisons during search
--17273-- errormgr: 0 errlist searches, 0 comparisons during search
|
|
From: Julian S. <js...@ac...> - 2011-01-21 17:28:25
|
> > Can you send the results from --stats=yes ? From that we can
> > see the miss rate on VG_(tt_fast) and perhaps some other
> > significant numbers.
>
> This?
> --17273-- translate: fast SP updates identified: 0 ( --%)
> --17273-- translate: generic_known SP updates identified: 0 ( --%)
> --17273-- translate: generic_unknown SP updates identified: 0 ( --%)
> --17273-- tt/tc: 14,941,381 tt lookups requiring 97,504,544 probes
> --17273-- tt/tc: 14,941,381 fast-cache updates, 13 flushes
> --17273-- transtab: new 292,328 (4,802,899 -> 45,933,795; ratio 95:10) [0 scs]
> --17273-- transtab: dumped 0 (0 -> ??)
> --17273-- transtab: discarded 151 (1,798 -> ??)
> --17273-- scheduler: 736,878,221 jumps (bb entries).
> --17273-- scheduler: 9,067/37,285,698 major/minor sched events.
> --17273-- sanity: 9068 cheap, 115 expensive checks.
> --17273-- exectx: 769 lists, 0 contexts (avg 0 per list)
> --17273-- exectx: 0 searches, 0 full compares (0 per 1000)
> --17273-- exectx: 0 cmp2, 0 cmp4, 0 cmpAll
> --17273-- errormgr: 0 supplist searches, 0 comparisons during search
> --17273-- errormgr: 0 errlist searches, 0 comparisons during search

Yes, this. But .. are these the numbers from the run that had the
unexpectedly high costs? These numbers look normal to me: 736 million
queries in the fast cache, 14.941 million misses and lookups in the
main table (which is what VG_(search_transtab) does), and about 6
hash probes per lookup (97,504,544 / 14,941,381).

J
|
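The arithmetic behind that reading of the stats can be written out as a short calculation. It follows the interpretation given in the message, namely that every scheduler jump is one fast-cache query and every tt lookup is one fast-cache miss. The variable names are illustrative only.

```python
# Figures copied from the --stats=yes output quoted above (pid 17273).
jumps = 736_878_221      # scheduler: jumps (bb entries)
tt_lookups = 14_941_381  # tt/tc: tt lookups, i.e. fast-cache misses
probes = 97_504_544      # tt/tc: probes needed by those lookups

# Assumption taken from the message: each jump queries the fast cache once.
miss_rate = tt_lookups / jumps
probes_per_lookup = probes / tt_lookups

print(f"fast-cache miss rate: {miss_rate:.2%}")
print(f"probes per lookup:    {probes_per_lookup:.2f}")
```

Both values come out in the range described as normal: a miss rate of roughly 2% and between 6 and 7 probes per lookup.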
|
From: Konstantin S. <kon...@gm...> - 2011-01-21 17:40:29
|
2011/1/21 Julian Seward <js...@ac...>
> Yes, this. But .. are these the numbers from the run that had the
> unexpectedly high costs?

Yes (unless I am very much mistaken)
The profile was like this:

143136 51.6351 tsan-amd64-linux tsan-amd64-linux vgPlain_search_transtab
14517 5.2369 tsan-amd64-linux tsan-amd64-linux vgPlain_discard_translations
13297 4.7968 tsan-amd64-linux tsan-amd64-linux delete_translations_in_sector_eclass
12440 4.4876 tsan-amd64-linux tsan-amd64-linux ThreadSanitizerHandleTrace(int, TraceInfo*, unsigned long*)
11463 4.1352 tsan-amd64-linux tsan-amd64-linux vgPlain_run_innerloop__dispatch_unprofiled
10679 3.8524 anon (tgid:17291 range:0x404797a000-0x40525d2000) tsan-amd64-linux anon (tgid:17291 range:0x404797a000-0x40525d2000)
6528 2.3549 tsan-amd64-linux tsan-amd64-linux vgPlain_run_innerloop
6259 2.2579 tsan-amd64-linux tsan-amd64-linux invalidateFastCache

> These numbers look normal to me: 736 million
> queries in the fast cache, 14.941 million misses and lookups in the
> main table (which is what VG_(search_transtab) does), and about 6
> hash probes per lookup (97,504,544 / 14,941,381).
>
> J
|
|
From: Julian S. <js...@ac...> - 2011-01-22 10:59:34
|
> > Yes, this. But .. are these the numbers from the run that had the
> > unexpectedly high costs?
>
> Yes (unless I am very much mistaken)
> The profile was like this: [...]

Well, that's very strange. Can you reproduce the same thing running
on standard trunk valgrind with (eg) --tool=memcheck?

J
|
|
From: Konstantin S. <kon...@gm...> - 2011-01-26 15:29:59
|
2011/1/22 Julian Seward <js...@ac...>
> Well, that's very strange. Can you reproduce the same thing running
> on standard trunk valgrind with (eg) --tool=memcheck?

Not with memcheck (it is too slow; I am testing a heavy interactive thing).
Under tool=none I get this:

--20320-- translate: fast SP updates identified: 0 ( --%)
--20320-- translate: generic_known SP updates identified: 0 ( --%)
--20320-- translate: generic_unknown SP updates identified: 0 ( --%)
--20320-- tt/tc: 154,277,025 tt lookups requiring 12,970,652,710 probes
--20320-- tt/tc: 154,277,025 fast-cache updates, 67,995 flushes
--20320-- transtab: new 285,190 (10,103,997 -> 50,034,757; ratio 49:10) [0 scs]
--20320-- transtab: dumped 0 (0 -> ??)
--20320-- transtab: discarded 98,158 (4,880,670 -> ??)
--20320-- scheduler: 1,561,493,047 jumps (bb entries).
--20320-- scheduler: 17,733/154,518,633 major/minor sched events.
--20320-- sanity: 17735 cheap, 168 expensive checks.
--20320-- exectx: 769 lists, 0 contexts (avg 0 per list)
--20320-- exectx: 0 searches, 0 full compares (0 per 1000)
--20320-- exectx: 0 cmp2, 0 cmp4, 0 cmpAll
--20320-- errormgr: 0 supplist searches, 0 comparisons during search
--20320-- errormgr: 0 errlist searches, 0 comparisons during search
==20302==
--20302-- translate: fast SP updates identified: 0 ( --%)
--20302-- translate: generic_known SP updates identified: 0 ( --%)
--20302-- translate: generic_unknown SP updates identified: 0 ( --%)
--20302-- tt/tc: 6,612,289 tt lookups requiring 34,869,528 probes
--20302-- tt/tc: 6,612,289 fast-cache updates, 11 flushes
--20302-- transtab: new 251,524 (6,866,739 -> 42,129,034; ratio 61:10) [0 scs]
--20302-- transtab: dumped 0 (0 -> ??)
--20302-- transtab: discarded 92 (1,687 -> ??)
--20302-- scheduler: 653,064,538 jumps (bb entries).
--20302-- scheduler: 6,640/6,526,732 major/minor sched events.
--20302-- sanity: 6641 cheap, 96 expensive checks.
--20302-- exectx: 769 lists, 0 contexts (avg 0 per list)
--20302-- exectx: 0 searches, 0 full compares (0 per 1000)
--20302-- exectx: 0 cmp2, 0 cmp4, 0 cmpAll
--20302-- errormgr: 0 supplist searches, 0 comparisons during search
--20302-- errormgr: 0 errlist searches, 0 comparisons during search

245903 43.2118 none-amd64-linux none-amd64-linux vgPlain_search_transtab
132410 23.2680 none-amd64-linux none-amd64-linux delete_translations_in_sector_eclass
52773 9.2736 none-amd64-linux none-amd64-linux vgPlain_discard_translations
34777 6.1113 anon (tgid:20320 range:0x418db0000-0x41c1d9000) none-amd64-linux anon (tgid:20320 range:0x418db0000-0x41c1d9000)
22547 3.9621 none-amd64-linux none-amd64-linux vgPlain_run_innerloop__dispatch_unprofiled
14557 2.5581 no-vmlinux no-vmlinux /no-vmlinux
11741 2.0632 none-amd64-linux none-amd64-linux vgPlain_run_innerloop
6645 1.1677 none-amd64-linux none-amd64-linux invalidateFastCache

--kcc
|
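The two stats dumps in this message come from different processes (pids 20320 and 20302), and they differ a great deal once the per-lookup ratios are worked out. The following sketch parses the relevant lines and compares them. The regular expressions and the `summarise` helper are hypothetical and match only the line shapes shown in this thread; as in the earlier reading of these stats, each scheduler jump is assumed to be one fast-cache query.

```python
import re

# Patterns for the three stats lines of interest, as they appear above.
TT_LINE = re.compile(r"tt/tc:\s+([\d,]+) tt lookups requiring ([\d,]+) probes")
JUMPS_LINE = re.compile(r"scheduler:\s+([\d,]+) jumps")
DISCARD_LINE = re.compile(r"transtab:\s+discarded ([\d,]+)")


def to_int(text):
    return int(text.replace(",", ""))


def summarise(stats_text):
    """Derive ratios from one process's --stats=yes output."""
    lookups, probes = map(to_int, TT_LINE.search(stats_text).groups())
    jumps = to_int(JUMPS_LINE.search(stats_text).group(1))
    discarded = to_int(DISCARD_LINE.search(stats_text).group(1))
    return {
        "miss_rate": lookups / jumps,           # assumes one query per jump
        "probes_per_lookup": probes / lookups,
        "discarded": discarded,
    }


run_20320 = """
--20320-- tt/tc: 154,277,025 tt lookups requiring 12,970,652,710 probes
--20320-- transtab: discarded 98,158 (4,880,670 -> ??)
--20320-- scheduler: 1,561,493,047 jumps (bb entries).
"""
run_20302 = """
--20302-- tt/tc: 6,612,289 tt lookups requiring 34,869,528 probes
--20302-- transtab: discarded 92 (1,687 -> ??)
--20302-- scheduler: 653,064,538 jumps (bb entries).
"""

for pid, text in (("20320", run_20320), ("20302", run_20302)):
    s = summarise(text)
    print(pid, f"miss rate {s['miss_rate']:.2%},",
          f"{s['probes_per_lookup']:.1f} probes/lookup,",
          f"{s['discarded']} translations discarded")
```

On these figures pid 20320 needs roughly 84 probes per lookup against roughly 5 for pid 20302, and it is also the process that discards tens of thousands of translations.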
|
From: Julian S. <js...@ac...> - 2011-01-28 08:28:25
|
One other thing that's unusual about this is: there's obviously
a lot of code discarding going on:

> 132410 23.2680 none-amd64-linux none-amd64-linux delete_translations_in_sector_eclass
> 52773 9.2736 none-amd64-linux none-amd64-linux vgPlain_discard_translations

Those costs would be incurred if the app frequently asks, via
client request, to discard translations. The strange thing is
that these stats

--20302-- transtab: new 251,524 (6,866,739 -> 42,129,034; ratio 61:10) [0 scs]
--20302-- transtab: dumped 0 (0 -> ??)
--20302-- transtab: discarded 92 (1,687 -> ??)

show that almost no code actually did get discarded, though.

Possibly means the app is making lots of code-discard requests
for parts of the address space where in fact there is no code?

Just guessing.

J
|
|
From: Konstantin S. <kon...@gm...> - 2011-01-26 22:37:37
|
2011/1/26 Julian Seward <js...@ac...>
> One other thing that's unusual about this is: there's obviously
> a lot of code discarding going on:
>
> > 132410 23.2680 none-amd64-linux none-amd64-linux delete_translations_in_sector_eclass
> > 52773 9.2736 none-amd64-linux none-amd64-linux vgPlain_discard_translations
>
> Those costs would be incurred if the app frequently asks, via
> client request, to discard translations. The strange thing is
> that these stats
>
> --20302-- transtab: new 251,524 (6,866,739 -> 42,129,034; ratio 61:10) [0 scs]
> --20302-- transtab: dumped 0 (0 -> ??)
> --20302-- transtab: discarded 92 (1,687 -> ??)
>
> show that almost no code actually did get discarded, though.
>
> Possibly means the app is making lots of code-discard requests
> for parts of the address space where in fact there is no code?

I am running chrome which runs a heavy JavaScript program.
v8, the JS engine, obviously discards code via a client request, but afaict it does it only when it needs to.

The first log does discard translations, right?
--20320-- transtab: discarded 98,158 (4,880,670 -> ??)

--kcc

> Just guessing.
>
> J
|
|
From: Nicholas N. <nj...@cs...> - 2007-07-28 02:47:30
|
On Fri, 27 Jul 2007, Christoph Bartoschek wrote:
> I was quite confident with my version of valgrind. But when I run another big
> program I often see that more than 30% of runtime goes to
> vgPlain_search_transtab.
>
> Is there a document that describes the purpose and machinery behind this
> function?

Not really, although the academic publications on valgrind.org (eg. the 2007 PLDI paper) have a little.

The translation table is where the instrumented blocks of code are held. That function is called when the fast cache lookups fail.

Nick
|