|
From: Bob F. <bfr...@si...> - 2004-07-06 22:55:09
|
On Tue, 6 Jul 2004, Nicholas Nethercote wrote:

> On Tue, 6 Jul 2004, Bob Friesenhahn wrote:
>
>> Valgrind should consider using one Makefile.am to build everything in the
>> package and avoid recursive Makefile.am's. There are definite advantages
>> to this since then 'make' is aware of the full dependency tree. As a
>> developer it is nice to type 'make' when there is nothing to do and have
>> it immediately say that there is nothing to do rather than to watch 'make'
>> go off and do a lot of unnecessary work.
>
> Indeed, that would be wonderful!
>
> So, how do you do it? Can you have a single Makefile for multiple
> directories? A patch for this would be most welcome :)

I would be happy to supply a patch but unfortunately I don't have
available time to work on it. :-(

Recent/modern versions of Automake support non-recursive builds quite
well. The rules are exactly the same as with a normal Makefile.am except
that you reference sources with full relative paths (from the top), and
targets need to reflect the full relative path to where they will be
placed. There is one annoying twist in that target paths need to have
the '/' replaced with '_' using the same rules as used for the special
characters that Automake doesn't otherwise support in target names.

If the existing Makefile.am's are written cleanly, it should not be a
difficult task to roll up into a single Makefile.am. It is mostly a
matter of adding prefix paths and ensuring that script code does not
assume that files are always in the "current" directory since the
"current" directory will always be the top directory in a non-recursive
build.

A side effect of using one Makefile.am is that you would always type
'make' from the top directory. So you would do

  make path/to/tool

if you wanted to build only 'tool' rather than

  cd path/to
  make tool

or

  make -C path/to tool

Unless the software takes a long time to build, using a non-recursive
build significantly reduces the desire/need to build just one component
since no time is wasted needlessly rebuilding components or checking to
see if components need to be built. Valgrind is very small so this
should not be a problem.

Bob
======================================
Bob Friesenhahn
bfr...@si...
http://www.simplesystems.org/users/bfriesen
|
|
From: Nicholas N. <nj...@ca...> - 2004-07-06 22:46:49
|
CVS commit by nethercote:
minor
M +2 -0 cg_main.c 1.70
--- valgrind/cachegrind/cg_main.c #1.69:1.70
@@ -472,4 +472,6 @@ void end_of_x86_instr(UCodeBlock* cb, in
}
#undef IS_
+#undef INV
+
// Setup 1st arg: CC addr
do_details( i_node, bb_seen_before, instr_addr, instr_size, data_size );
|
|
From: Nicholas N. <nj...@ca...> - 2004-07-06 22:40:32
|
On Tue, 6 Jul 2004, Bob Friesenhahn wrote:

> Valgrind should consider using one Makefile.am to build everything in the
> package and avoid recursive Makefile.am's. There are definite advantages to
> this since then 'make' is aware of the full dependency tree. As a developer
> it is nice to type 'make' when there is nothing to do and have it immediately
> say that there is nothing to do rather than to watch 'make' go off and do a
> lot of unnecessary work.

Indeed, that would be wonderful!

So, how do you do it? Can you have a single Makefile for multiple
directories? A patch for this would be most welcome :)

N
|
|
From: Bob F. <bfr...@si...> - 2004-07-06 22:23:07
|
On Tue, 6 Jul 2004, Jeremy Fitzhardinge wrote:

> automake thing, or just plain Makefile includes. I think. It's tricky
> because Makefile.am isn't really a makefile, and gets interpreted in
> magic ways, so you can't do anything clever with make variable expansion
> without confusing automake.

Makefile.am is both a makefile, and not a makefile (binomial personality
disorder). Normal (non-GNU-specific) makefile rules will be passed to
the output Makefile. In order to interface with what Automake generates,
you need to read its output Makefile and become familiar with what it
generates. Then you can use the funny variable names and targets that it
generates in your own rules. Of course future versions of Automake may
change the way the generated Makefile works so it is risky to depend on
internals.

Bob
======================================
Bob Friesenhahn
bfr...@si...
http://www.simplesystems.org/users/bfriesen
|
|
From: Bob F. <bfr...@si...> - 2004-07-06 22:17:04
|
On Tue, 6 Jul 2004, Nicholas Nethercote wrote:

> Can you do #includes in Makefile.am files? That's what is needed here for
> the common parts.

No, you can not.

Valgrind should consider using one Makefile.am to build everything in
the package and avoid recursive Makefile.am's. There are definite
advantages to this since then 'make' is aware of the full dependency
tree. As a developer it is nice to type 'make' when there is nothing to
do and have it immediately say that there is nothing to do rather than
to watch 'make' go off and do a lot of unnecessary work.

Bob
======================================
Bob Friesenhahn
bfr...@si...
http://www.simplesystems.org/users/bfriesen
|
|
From: Jeremy F. <je...@go...> - 2004-07-06 22:11:25
|
On Tue, 2004-07-06 at 23:02 +0100, Nicholas Nethercote wrote:

> ie. no "addprefix". Does the difference matter? Is one preferred?

The addprefix is necessary if $(val_PROGRAMS) expands to more than one
word. If you don't have it you end up with something like
"my/valgrind/tool/foo1 foo2" when you wanted
"my/valgrind/tool/foo1 my/valgrind/tool/foo2". They should probably all
have addprefix for consistency's sake.

> Can you do #includes in Makefile.am files? That's what is needed here for
> the common parts.

I've never understood what the best level to do this is at. There's at
least 3 places you could do it, I think: as autoconf macros, some
automake thing, or just plain Makefile includes. I think. It's tricky
because Makefile.am isn't really a makefile, and gets interpreted in
magic ways, so you can't do anything clever with make variable expansion
without confusing automake.

J
|
|
From: Nicholas N. <nj...@ca...> - 2004-07-06 22:02:36
|
Hi,
Looking at some of the tool Makefile.am files, some have this:

all-local:
	mkdir -p $(inplacedir)
	-rm -f $(addprefix $(inplacedir)/,$(val_PROGRAMS))
	ln -f -s $(addprefix $(top_builddir)/$(subdir)/,$(val_PROGRAMS)) $(inplacedir)

and some have this:

all-local:
	mkdir -p $(inplacedir)
	-rm -f $(inplacedir)/$(val_PROGRAMS)
	ln -f -s $(top_builddir)/$(subdir)/$(val_PROGRAMS) $(inplacedir)/$(val_PROGRAMS)

i.e. no "addprefix". Does the difference matter? Is one preferred?

Can you do #includes in Makefile.am files? That's what is needed here for
the common parts.
N
|
|
From: Nicholas N. <nj...@ca...> - 2004-07-06 21:57:54
|
CVS commit by nethercote:
Completely overhauled Cachegrind's data structures. With the new
scheme, there are two main structures:
1. The CC table holds a cost centre (CC) for every distinct source code
line, as found using debug/symbol info. It's arranged by files, then
functions, then lines.
2. The instr-info-table holds certain important pieces of info about
each instruction -- instr_addr, instr_size, data_size, its line-CC.
A pointer to the instr's info is passed to the simulation functions,
which is shorter and quicker than passing the pieces individually.
This is nice and simple. Previously, there was a single data structure
(the BBCC table) which mingled the two purposes (maintaining CCs and
caching instruction info). The CC stuff was done at the level of
instructions, and there were different CC types for different kinds of
instructions, and it was pretty yucky. The two simple data structures
together are much less complex than the original single data structure.
As a result, we have the following general improvements:
- Previously, when code was unloaded all its hit/miss counts were stuck
in a single "discard" CC, and so that code would not be annotated. Now
this code is profiled and annotatable just like all other code.
- Source code size is 27% smaller. cg_main.c is now 1472 lines, down
from 2174. Some (1/3?) of this is from removing the special handling
of JIFZ and general compaction, but most is from the data structure
changes. Happily, a lot of the removed code was nasty.
- Object code size (vgskin_cachegrind.so) is 15% smaller.
- cachegrind.out.pid size is about 90+% smaller(!) Annotation time is
accordingly *much* faster. Doing cost-centres at the level of source
code lines rather than instructions makes a big difference, since
there's typically 2--3 instructions per source line. Even better,
when debug info is not present, entire functions (and even files) get
collapsed into a single "???" CC. (This behaviour is no different
to what happened before, it's just the collapsing used to occur in the
annotation script, rather than within Cachegrind.) This is a huge win
for stripped libraries.
- Memory consumption is about 10--20% less, due to fewer CCs.
- Speed is not much changed -- the changes were not in the intensive
parts, so the only likely change is a cache improvement due to using
less memory. SPEC experiments go -3 -- 10% faster, with the "average"
being unchanged or perhaps a tiny bit faster.
I've tested it reasonably thoroughly, and it seems to give extremely
similar results to the old version, which is highly encouraging. (The
results aren't quite the same, because they are so sensitive to memory
layout; even tiny changes to Cachegrind affect the results slightly.)
Some particularly nice changes that happened:
- No longer need an instrumentation prepass; this is because CCs are not
stored grouped by BB, and they're all the same size now. (This makes
various bits of code much simpler than before).
- The actions to take when a BB translation is discarded (due to the
translation table getting full) are much easier -- just chuck all the
instr-info nodes for the BB, without touching the CCs.
- Dumping the cachegrind.out.pid file at the end is much simpler, just
because the CC data structure is much neater.
Some other, specific changes:
- Removed the JIFZ special handling, which never did what it was
intended to do and just complicated things. This changes the results
for REP-prefixed instructions very slightly, but it's not important.
- Abbreviated the FP/MMX/SSE crap by being slightly laxer with size
checking -- not an issue, since this checking was just a pale
imitation of the stricter checking done in codegen anyway.
- Removed "fi" and "fe" handling from cg_annotate, no longer needed due
to neatening of the CC-table.
- Factorised out some code a bit, so fewer monolithic slabs,
particularly in SK_(instrument)().
- Just improved formatting and compacted code in general in various
places.
- Removed the long-commented-out sanity checking code at the bottom.
Phew.
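
The two-table design described in this commit message can be sketched roughly as follows. This is a simplified, self-contained illustration and not the committed code: it keeps only the line level of the CC table (no file or function levels), uses plain calloc() rather than Valgrind's allocator, tracks instruction reads only, and the names cc_t, line_cc, get_line_cc and simulate_instr are made up for the example.

```c
#include <assert.h>
#include <stdlib.h>

#define N_LINE_ENTRIES 37   /* same bucket count as the commit uses */

/* Cost centre: accesses plus level-1 and level-2 misses. */
typedef struct { unsigned long long a, m1, m2; } cc_t;

/* One cost centre per source line, chained within its hash bucket. */
typedef struct line_cc {
    int line;
    cc_t Ir;                     /* instruction reads only, for brevity */
    struct line_cc *next;
} line_cc;

/* Cached per-instruction info; points at the line CC it contributes to. */
typedef struct {
    unsigned long instr_addr;
    unsigned char instr_size;
    line_cc *parent;
} instr_info;

static line_cc *line_table[N_LINE_ENTRIES];   /* auto-zeroed */
static int distinct_lines = 0;

/* Find the CC for a line, creating and prepending a new one if needed. */
line_cc *get_line_cc(int line)
{
    unsigned h = (unsigned)line % N_LINE_ENTRIES;
    line_cc *cc = line_table[h];
    while (cc != NULL && cc->line != line)
        cc = cc->next;
    if (cc == NULL) {
        cc = calloc(1, sizeof *cc);           /* zeroes the counters */
        assert(cc != NULL);
        cc->line = line;
        cc->next = line_table[h];
        line_table[h] = cc;
        distinct_lines++;
    }
    return cc;
}

/* Stand-in for a simulation call: a single pointer carries everything. */
void simulate_instr(instr_info *ii, int miss1, int miss2)
{
    ii->parent->Ir.a++;
    if (miss1) ii->parent->Ir.m1++;
    if (miss2) ii->parent->Ir.m2++;
}
```

In this sketch, several instr_info nodes that map to the same source line share one line_cc, which is the reason the commit message gives for the much smaller output file. Discarding a translated block would only free its instr_info nodes and leave the line CCs untouched.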
M +0 -12 cg_annotate.in 1.20
M +547 -1249 cg_main.c 1.69
--- valgrind/cachegrind/cg_annotate.in #1.19:1.20
@@ -409,16 +409,4 @@
$curr_file_ind_CCs = {} unless (defined $curr_file_ind_CCs);
- } elsif (s/^(fi|fe)=(.*)$//) {
- (defined $curr_name) or die("Line $.: Unexpected fi/fe line\n");
- $fn_totals{$curr_name} = $curr_fn_CC;
- $all_ind_CCs{$curr_file} = $curr_file_ind_CCs;
-
- $curr_file = $2;
- $curr_name = "$curr_file:$curr_fn";
- $curr_file_ind_CCs = $all_ind_CCs{$curr_file};
- $curr_file_ind_CCs = {} unless (defined $curr_file_ind_CCs);
- $curr_fn_CC = $fn_totals{$curr_name};
- $curr_fn_CC = [] unless (defined $curr_fn_CC);
-
} elsif (s/^\s*$//) {
# blank, do nothing
--- valgrind/cachegrind/cg_main.c #1.68:1.69
@@ -1,6 +1,5 @@
/*--------------------------------------------------------------------*/
-/*--- Cachegrind: cache detection; instrumentation, recording and ---*/
-/*--- results printing. ---*/
+/*--- Cachegrind: everything but the simulation itself. ---*/
/*--- cg_main.c ---*/
/*--------------------------------------------------------------------*/
@@ -47,16 +46,8 @@ typedef struct {
/*------------------------------------------------------------*/
-/* According to IA-32 Intel Architecture Software Developer's Manual: Vol 2 */
-#define MAX_x86_INSTR_SIZE 16
-
+#define MAX_x86_INSTR_SIZE 16 // According to ia32 sw dev manual vol 2
#define MIN_LINE_SIZE 16
-
-/* Size of various buffers used for storing strings */
-#define FILENAME_LEN 256
-#define FN_NAME_LEN 256
-#define BUF_LEN 512
-#define COMMIFY_BUF_LEN 128
-#define RESULTS_BUF_LEN 128
-#define LINE_BUF_LEN 64
+#define FILE_LEN 256
+#define FN_LEN 256
/*------------------------------------------------------------*/
@@ -66,5 +57,5 @@ typedef struct {
typedef
enum {
- VgpGetBBCC = VgpFini+1,
+ VgpGetLineCC = VgpFini+1,
VgpCacheSimulate,
VgpCacheResults
@@ -73,20 +64,5 @@ typedef
/*------------------------------------------------------------*/
-/*--- Output file related stuff ---*/
-/*------------------------------------------------------------*/
-
-static Char* cachegrind_out_file;
-
-static void file_err ( void )
-{
- VG_(message)(Vg_UserMsg,
- "error: can't open cache simulation output file `%s'",
- cachegrind_out_file );
- VG_(message)(Vg_UserMsg,
- " ... so simulation results will be missing.");
-}
-
-/*------------------------------------------------------------*/
-/*--- Cost center types, operations ---*/
+/*--- Types and Data Structures ---*/
/*------------------------------------------------------------*/
@@ -98,206 +74,78 @@ struct _CC {
};
-static __inline__ void initCC(CC* cc) {
- cc->a = 0;
- cc->m1 = 0;
- cc->m2 = 0;
-}
+//------------------------------------------------------------
+// Primary data structure #1: CC table
+// - Holds the per-source-line hit/miss stats, grouped by file/function/line.
+// - hash(file, hash(fn, hash(line+CC)))
+// - Each hash table is separately chained.
+// - The array sizes below work fairly well for Konqueror.
+// - Lookups done by instr_addr, which is converted immediately to a source
+// location.
+// - Traversed for dumping stats at end in file/func/line hierarchy.
-typedef
- enum {
- InstrCC, /* eg. mov %eax, %ebx */
- ReadCC, /* eg. mov (%ecx), %esi */
- WriteCC, /* eg. mov %eax, (%edx) */
- ModCC, /* eg. incl (%eax) (read+write one addr) */
- ReadWriteCC, /* eg. call*l (%esi), pushl 0x4(%ebx), movsw
- (read+write two different addrs) */
- } CC_type;
+#define N_FILE_ENTRIES 251
+#define N_FN_ENTRIES 53
+#define N_LINE_ENTRIES 37
-/* Instruction-level cost-centres.
- *
- * WARNING: the 'tag' field *must* be the first byte of both CC types.
- *
- * This is because we use it to work out what kind of CC we're dealing with.
- */
-typedef
- struct {
- /* word 1 */
- UChar tag;
- UChar instr_size;
- /* 2 bytes padding */
+typedef struct _lineCC lineCC;
+struct _lineCC {
+ Int line;
+ CC Ir;
+ CC Dr;
+ CC Dw;
+ lineCC* next;
+};
- /* words 2+ */
- Addr instr_addr;
- CC I;
- }
- iCC;
+typedef struct _fnCC fnCC;
+struct _fnCC {
+ Char* fn;
+ fnCC* next;
+ lineCC* lines[N_LINE_ENTRIES];
+};
-typedef
- struct _idCC {
- /* word 1 */
- UChar tag;
- UChar instr_size;
- UChar data_size;
- /* 1 byte padding */
+typedef struct _fileCC fileCC;
+struct _fileCC {
+ Char* file;
+ fileCC* next;
+ fnCC* fns[N_FN_ENTRIES];
+};
- /* words 2+ */
- Addr instr_addr;
- CC I;
- CC D;
- }
- idCC;
+// Top level of CC table. Auto-zeroed.
+static fileCC *CC_table[N_FILE_ENTRIES];
-typedef
- struct _iddCC {
- /* word 1 */
- UChar tag;
- UChar instr_size;
- UChar data_size;
- /* 1 byte padding */
+//------------------------------------------------------------
+// Primary data structure #2: Instr-info table
+// - Holds the cached info about each instr that is used for simulation.
+// - table(BB_start_addr, list(instr_info))
+// - For each BB, each instr_info in the list holds info about the
+// instruction (instr_size, instr_addr, etc), plus a pointer to its line
+// CC. This node is what's passed to the simulation function.
+// - When BBs are discarded the relevant list(instr_details) is freed.
- /* words 2+ */
+typedef struct _instr_info instr_info;
+struct _instr_info {
Addr instr_addr;
- CC I;
- CC Da;
- CC Db;
- }
- iddCC;
-
-static void init_iCC(iCC* cc, Addr instr_addr, UInt instr_size)
-{
- cc->tag = InstrCC;
- cc->instr_size = instr_size;
- cc->instr_addr = instr_addr;
- initCC(&cc->I);
-}
-
-static void init_idCC(CC_type X_CC, idCC* cc, Addr instr_addr,
- UInt instr_size, UInt data_size)
-{
- cc->tag = X_CC;
- cc->instr_size = instr_size;
- cc->data_size = data_size;
- cc->instr_addr = instr_addr;
- initCC(&cc->I);
- initCC(&cc->D);
-}
-
-static void init_iddCC(iddCC* cc, Addr instr_addr,
- UInt instr_size, UInt data_size)
-{
- cc->tag = ReadWriteCC;
- cc->instr_size = instr_size;
- cc->data_size = data_size;
- cc->instr_addr = instr_addr;
- initCC(&cc->I);
- initCC(&cc->Da);
- initCC(&cc->Db);
-}
-
-#define ADD_CC_TO(CC_type, cc, total) \
- total.a += ((CC_type*)BBCC_ptr)->cc.a; \
- total.m1 += ((CC_type*)BBCC_ptr)->cc.m1; \
- total.m2 += ((CC_type*)BBCC_ptr)->cc.m2;
-
-/* If 1, address of each instruction is printed as a comment after its counts
- * in cachegrind.out */
-#define PRINT_INSTR_ADDRS 0
-
-static __inline__ void sprint_iCC(Char buf[BUF_LEN], iCC* cc)
-{
-#if PRINT_INSTR_ADDRS
- VG_(sprintf)(buf, "%llu %llu %llu # %x\n",
- cc->I.a, cc->I.m1, cc->I.m2, cc->instr_addr);
-#else
- VG_(sprintf)(buf, "%llu %llu %llu\n",
- cc->I.a, cc->I.m1, cc->I.m2);
-#endif
-}
-
-static __inline__ void sprint_read_or_mod_CC(Char buf[BUF_LEN], idCC* cc)
-{
-#if PRINT_INSTR_ADDRS
- VG_(sprintf)(buf, "%llu %llu %llu %llu %llu %llu # %x\n",
- cc->I.a, cc->I.m1, cc->I.m2,
- cc->D.a, cc->D.m1, cc->D.m2, cc->instr_addr);
-#else
- VG_(sprintf)(buf, "%llu %llu %llu %llu %llu %llu\n",
- cc->I.a, cc->I.m1, cc->I.m2,
- cc->D.a, cc->D.m1, cc->D.m2);
-#endif
-}
-
-static __inline__ void sprint_write_CC(Char buf[BUF_LEN], idCC* cc)
-{
-#if PRINT_INSTR_ADDRS
- VG_(sprintf)(buf, "%llu %llu %llu . . . %llu %llu %llu # %x\n",
- cc->I.a, cc->I.m1, cc->I.m2,
- cc->D.a, cc->D.m1, cc->D.m2, cc->instr_addr);
-#else
- VG_(sprintf)(buf, "%llu %llu %llu . . . %llu %llu %llu\n",
- cc->I.a, cc->I.m1, cc->I.m2,
- cc->D.a, cc->D.m1, cc->D.m2);
-#endif
-}
-
-static __inline__ void sprint_read_write_CC(Char buf[BUF_LEN], iddCC* cc)
-{
-#if PRINT_INSTR_ADDRS
- VG_(sprintf)(buf, "%llu %llu %llu %llu %llu %llu # %x\n",
- cc->I.a, cc->I.m1, cc->I.m2,
- cc->Da.a, cc->Da.m1, cc->Da.m2,
- cc->Db.a, cc->Db.m1, cc->Db.m2, cc->instr_addr);
-#else
- VG_(sprintf)(buf, "%llu %llu %llu %llu %llu %llu %llu %llu %llu\n",
- cc->I.a, cc->I.m1, cc->I.m2,
- cc->Da.a, cc->Da.m1, cc->Da.m2,
- cc->Db.a, cc->Db.m1, cc->Db.m2);
-#endif
-}
-
-
-/*------------------------------------------------------------*/
-/*--- BBCC hash table stuff ---*/
-/*------------------------------------------------------------*/
-
-/* The table of BBCCs is of the form hash(filename, hash(fn_name,
- * hash(BBCCs))). Each hash table is separately chained. The sizes below work
- * fairly well for Konqueror. */
-
-#define N_FILE_ENTRIES 251
-#define N_FN_ENTRIES 53
-#define N_BBCC_ENTRIES 37
-
-/* The cost centres for a basic block are stored in a contiguous array.
- * They are distinguishable by their tag field. */
-typedef struct _BBCC BBCC;
-struct _BBCC {
- Addr orig_addr;
- UInt array_size; /* byte-size of variable length array */
- BBCC* next;
- Addr array[0]; /* variable length array */
-};
-
-typedef struct _fn_node fn_node;
-struct _fn_node {
- Char* fn_name;
- BBCC* BBCCs[N_BBCC_ENTRIES];
- fn_node* next;
+ UChar instr_size;
+ UChar data_size;
+ struct _lineCC* parent; // parent line-CC
};
-typedef struct _file_node file_node;
-struct _file_node {
- Char* filename;
- fn_node* fns[N_FN_ENTRIES];
- file_node* next;
+typedef struct _BB_info BB_info;
+struct _BB_info {
+ BB_info* next; // next field
+ Addr BB_addr; // key
+ Int n_instrs;
+ instr_info instrs[0];
};
-/* BBCC_table structure: list(filename, list(fn_name, list(BBCC))) */
-static file_node *BBCC_table[N_FILE_ENTRIES];
+VgHashTable instr_info_table; // hash(Addr, BB_info)
+//------------------------------------------------------------
+// Stats
static Int distinct_files = 0;
static Int distinct_fns = 0;
-
+static Int distinct_lines = 0;
static Int distinct_instrs = 0;
+
static Int full_debug_BBs = 0;
static Int file_line_debug_BBs = 0;
@@ -302,547 +150,395 @@ static Int distinct_instrs = 0;
static Int full_debug_BBs = 0;
static Int file_line_debug_BBs = 0;
-static Int fn_name_debug_BBs = 0;
+static Int fn_debug_BBs = 0;
static Int no_debug_BBs = 0;
static Int BB_retranslations = 0;
-static CC Ir_discards;
-static CC Dr_discards;
-static CC Dw_discards;
-
-static void init_BBCC_table()
-{
- Int i;
- for (i = 0; i < N_FILE_ENTRIES; i++)
- BBCC_table[i] = NULL;
-}
+/*------------------------------------------------------------*/
+/*--- CC table operations ---*/
+/*------------------------------------------------------------*/
-static void get_debug_info(Addr instr_addr, Char filename[FILENAME_LEN],
- Char fn_name[FN_NAME_LEN], Int* line_num)
+static void get_debug_info(Addr instr_addr, Char file[FILE_LEN],
+ Char fn[FN_LEN], Int* line)
{
- Bool found1, found2;
-
- found1 = VG_(get_filename_linenum)(instr_addr, filename,
- FILENAME_LEN, line_num);
- found2 = VG_(get_fnname)(instr_addr, fn_name, FN_NAME_LEN);
-
- if (!found1 && !found2) {
- no_debug_BBs++;
- VG_(strcpy)(filename, "???");
- VG_(strcpy)(fn_name, "???");
- *line_num = 0;
-
- } else if ( found1 && found2) {
- full_debug_BBs++;
-
- } else if ( found1 && !found2) {
- file_line_debug_BBs++;
- VG_(strcpy)(fn_name, "???");
+ Bool found_file_line = VG_(get_filename_linenum)(instr_addr, file,
+ FILE_LEN, line);
+ Bool found_fn = VG_(get_fnname)(instr_addr, fn, FN_LEN);
- } else /*(!found1 && found2)*/ {
- fn_name_debug_BBs++;
- VG_(strcpy)(filename, "???");
- *line_num = 0;
+ if (!found_file_line) {
+ VG_(strcpy)(file, "???");
+ *line = 0;
+ }
+ if (!found_fn) {
+ VG_(strcpy)(fn, "???");
+ }
+ if (found_file_line) {
+ if (found_fn) full_debug_BBs++;
+ else file_line_debug_BBs++;
+ } else {
+ if (found_fn) fn_debug_BBs++;
+ else no_debug_BBs++;
}
}
-/* Forward declaration. */
-static Int compute_BBCC_array_size(UCodeBlock* cb);
-
-static __inline__
-file_node* new_file_node(Char filename[], file_node* next)
+static UInt hash(Char *s, UInt table_size)
{
- Int i;
- file_node* new = VG_(malloc)(sizeof(file_node));
- new->filename = VG_(strdup)(filename);
- for (i = 0; i < N_FN_ENTRIES; i++) {
- new->fns[i] = NULL;
- }
- new->next = next;
- return new;
+ const int hash_constant = 256;
+ int hash_value = 0;
+ for ( ; *s; s++)
+ hash_value = (hash_constant * hash_value + *s) % table_size;
+ return hash_value;
}
static __inline__
-fn_node* new_fn_node(Char fn_name[], fn_node* next)
+fileCC* new_fileCC(Char filename[], fileCC* next)
{
- Int i;
- fn_node* new = VG_(malloc)(sizeof(fn_node));
- new->fn_name = VG_(strdup)(fn_name);
- for (i = 0; i < N_BBCC_ENTRIES; i++) {
- new->BBCCs[i] = NULL;
- }
- new->next = next;
- return new;
+ // Using calloc() zeroes the fns[] array
+ fileCC* cc = VG_(calloc)(1, sizeof(fileCC));
+ cc->file = VG_(strdup)(filename);
+ cc->next = next;
+ return cc;
}
static __inline__
-BBCC* new_BBCC(Addr bb_orig_addr, UCodeBlock* cb, BBCC* next)
+fnCC* new_fnCC(Char fn[], fnCC* next)
{
- Int BBCC_array_size = compute_BBCC_array_size(cb);
- BBCC* new;
-
- new = (BBCC*)VG_(malloc)(sizeof(BBCC) + BBCC_array_size);
- new->orig_addr = bb_orig_addr;
- new->array_size = BBCC_array_size;
- new->next = next;
-
- return new;
+ // Using calloc() zeroes the lines[] array
+ fnCC* cc = VG_(calloc)(1, sizeof(fnCC));
+ cc->fn = VG_(strdup)(fn);
+ cc->next = next;
+ return cc;
}
-#define HASH_CONSTANT 256
-
-static UInt hash(Char *s, UInt table_size)
+static __inline__
+lineCC* new_lineCC(Int line, lineCC* next)
{
- int hash_value = 0;
- for ( ; *s; s++)
- hash_value = (HASH_CONSTANT * hash_value + *s) % table_size;
- return hash_value;
+ // Using calloc() zeroes the Ir/Dr/Dw CCs and the instrs[] array
+ lineCC* cc = VG_(calloc)(1, sizeof(lineCC));
+ cc->line = line;
+ cc->next = next;
+ return cc;
}
-/* This is a backup for get_BBCC() when removing BBs from the table.
- * Necessary because the debug info can change when code is removed. For
- * example, when inserting, the info might be "myprint.c:myprint()", but
- * upon removal, the info might be "myprint.c:???", which causes the
- * hash-lookup to fail (but it doesn't always happen). So we do a horrible,
- * slow search through all the file nodes and function nodes (but we can do
- * 3rd stage with the fast hash-lookup). */
-static BBCC* get_BBCC_slow_removal(Addr bb_orig_addr)
+static __inline__
+instr_info* new_instr_info(Addr instr_addr, lineCC* parent, instr_info* next)
{
- Int i, j;
- UInt BBCC_hash;
- file_node *curr_file_node;
- fn_node *curr_fn_node;
- BBCC **prev_BBCC_next_ptr, *curr_BBCC;
-
- for (i = 0; i < N_FILE_ENTRIES; i++) {
-
- for (curr_file_node = BBCC_table[i];
- NULL != curr_file_node;
- curr_file_node = curr_file_node->next)
- {
- for (j = 0; j < N_FN_ENTRIES; j++) {
-
- for (curr_fn_node = curr_file_node->fns[j];
- NULL != curr_fn_node;
- curr_fn_node = curr_fn_node->next)
- {
- BBCC_hash = bb_orig_addr % N_BBCC_ENTRIES;
- prev_BBCC_next_ptr = &(curr_fn_node->BBCCs[BBCC_hash]);
- curr_BBCC = curr_fn_node->BBCCs[BBCC_hash];
-
- while (NULL != curr_BBCC) {
- if (bb_orig_addr == curr_BBCC->orig_addr) {
- // Found it!
- sk_assert(curr_BBCC->array_size > 0
- && curr_BBCC->array_size < 1000000);
- if (VG_(clo_verbosity) > 2) {
- VG_(message)(Vg_DebugMsg, "did slow BB removal");
- }
-
- // Remove curr_BBCC from chain; it will be used and
- // free'd by the caller.
- *prev_BBCC_next_ptr = curr_BBCC->next;
- return curr_BBCC;
- }
-
- prev_BBCC_next_ptr = &(curr_BBCC->next);
- curr_BBCC = curr_BBCC->next;
- }
- }
- }
- }
- }
- VG_(printf)("failing BB address: %p\n", bb_orig_addr);
- VG_(skin_panic)("slow BB removal failed");
+ // Using calloc() zeroes instr_size and data_size
+ instr_info* ii = VG_(calloc)(1, sizeof(instr_info));
+ ii->instr_addr = instr_addr;
+ ii->parent = parent;
+ return ii;
}
-/* Do a three step traversal: by filename, then fn_name, then instr_addr.
- * In all cases prepends new nodes to their chain. Returns a pointer to the
- * cost centre. Also sets BB_seen_before by reference.
- */
-static BBCC* get_BBCC(Addr bb_orig_addr, UCodeBlock* cb,
- Bool remove, Bool *BB_seen_before)
+// Do a three step traversal: by file, then fn, then line.
+// In all cases prepends new nodes to their chain. Returns a pointer to the
+// line node, creates a new one if necessary.
+static lineCC* get_lineCC(Addr orig_addr)
{
- file_node *curr_file_node;
- fn_node *curr_fn_node;
- BBCC **prev_BBCC_next_ptr, *curr_BBCC;
- Char filename[FILENAME_LEN], fn_name[FN_NAME_LEN];
- UInt filename_hash, fnname_hash, BBCC_hash;
- Int dummy_line_num;
+ fileCC *curr_fileCC;
+ fnCC *curr_fnCC;
+ lineCC *curr_lineCC;
+ Char file[FILE_LEN], fn[FN_LEN];
+ Int line;
+ UInt file_hash, fn_hash, line_hash;
- get_debug_info(bb_orig_addr, filename, fn_name, &dummy_line_num);
+ get_debug_info(orig_addr, file, fn, &line);
- VGP_PUSHCC(VgpGetBBCC);
- filename_hash = hash(filename, N_FILE_ENTRIES);
- curr_file_node = BBCC_table[filename_hash];
- while (NULL != curr_file_node &&
- VG_(strcmp)(filename, curr_file_node->filename) != 0) {
- curr_file_node = curr_file_node->next;
+ VGP_PUSHCC(VgpGetLineCC);
+
+ // level 1
+ file_hash = hash(file, N_FILE_ENTRIES);
+ curr_fileCC = CC_table[file_hash];
+ while (NULL != curr_fileCC && !VG_STREQ(file, curr_fileCC->file)) {
+ curr_fileCC = curr_fileCC->next;
}
- if (NULL == curr_file_node) {
- BBCC_table[filename_hash] = curr_file_node =
- new_file_node(filename, BBCC_table[filename_hash]);
+ if (NULL == curr_fileCC) {
+ CC_table[file_hash] = curr_fileCC =
+ new_fileCC(file, CC_table[file_hash]);
distinct_files++;
}
- fnname_hash = hash(fn_name, N_FN_ENTRIES);
- curr_fn_node = curr_file_node->fns[fnname_hash];
- while (NULL != curr_fn_node &&
- VG_(strcmp)(fn_name, curr_fn_node->fn_name) != 0) {
- curr_fn_node = curr_fn_node->next;
+ // level 2
+ fn_hash = hash(fn, N_FN_ENTRIES);
+ curr_fnCC = curr_fileCC->fns[fn_hash];
+ while (NULL != curr_fnCC && !VG_STREQ(fn, curr_fnCC->fn)) {
+ curr_fnCC = curr_fnCC->next;
}
- if (NULL == curr_fn_node) {
- curr_file_node->fns[fnname_hash] = curr_fn_node =
- new_fn_node(fn_name, curr_file_node->fns[fnname_hash]);
+ if (NULL == curr_fnCC) {
+ curr_fileCC->fns[fn_hash] = curr_fnCC =
+ new_fnCC(fn, curr_fileCC->fns[fn_hash]);
distinct_fns++;
}
- BBCC_hash = bb_orig_addr % N_BBCC_ENTRIES;
- prev_BBCC_next_ptr = &(curr_fn_node->BBCCs[BBCC_hash]);
- curr_BBCC = curr_fn_node->BBCCs[BBCC_hash];
- while (NULL != curr_BBCC && bb_orig_addr != curr_BBCC->orig_addr) {
- prev_BBCC_next_ptr = &(curr_BBCC->next);
- curr_BBCC = curr_BBCC->next;
- }
- if (curr_BBCC == NULL) {
-
- if (remove == False) {
- curr_fn_node->BBCCs[BBCC_hash] = curr_BBCC =
- new_BBCC(bb_orig_addr, cb, curr_fn_node->BBCCs[BBCC_hash]);
- *BB_seen_before = False;
- } else {
- // Ok, BB not found when removing: the debug info must have
- // changed. Do a slow removal.
- curr_BBCC = get_BBCC_slow_removal(bb_orig_addr);
- *BB_seen_before = True;
+ // level 3
+ line_hash = line % N_LINE_ENTRIES;
+ curr_lineCC = curr_fnCC->lines[line_hash];
+ while (NULL != curr_lineCC && line != curr_lineCC->line) {
+ curr_lineCC = curr_lineCC->next;
}
-
- } else {
- sk_assert(bb_orig_addr == curr_BBCC->orig_addr);
- sk_assert(curr_BBCC->array_size > 0 && curr_BBCC->array_size < 1000000);
- if (VG_(clo_verbosity) > 2) {
- VG_(message)(Vg_DebugMsg,
- "BB retranslation/invalidation, retrieving from BBCC table");
+ if (NULL == curr_lineCC) {
+ curr_fnCC->lines[line_hash] = curr_lineCC =
+ new_lineCC(line, curr_fnCC->lines[line_hash]);
+ distinct_lines++;
}
- *BB_seen_before = True;
-
- if (True == remove) {
- // Remove curr_BBCC from chain; it will be used and free'd by the
- // caller.
- *prev_BBCC_next_ptr = curr_BBCC->next;
- } else {
- BB_retranslations++;
- }
- }
- VGP_POPCC(VgpGetBBCC);
- return curr_BBCC;
+ VGP_POPCC(VgpGetLineCC);
+ return curr_lineCC;
}
/*------------------------------------------------------------*/
-/*--- Cache simulation instrumentation phase ---*/
+/*--- Cache simulation functions ---*/
/*------------------------------------------------------------*/
-static Int compute_BBCC_array_size(UCodeBlock* cb)
-{
- UInstr* u_in;
- Int i, CC_size, BBCC_size = 0;
- Bool is_LOAD, is_STORE, is_FPU_R, is_FPU_W;
- Int t_read, t_write;
-
- is_LOAD = is_STORE = is_FPU_R = is_FPU_W = False;
- t_read = t_write = INVALID_TEMPREG;
-
- for (i = 0; i < VG_(get_num_instrs)(cb); i++) {
- u_in = VG_(get_instr)(cb, i);
- switch(u_in->opcode) {
-
- case INCEIP:
- goto case_for_end_of_instr;
-
- case JMP:
- if (u_in->cond != CondAlways) break;
-
- goto case_for_end_of_instr;
-
- case_for_end_of_instr:
-
- if (((is_LOAD && is_STORE) || (is_FPU_R && is_FPU_W)) &&
- t_read != t_write)
- CC_size = sizeof(iddCC);
- else if (is_LOAD || is_STORE || is_FPU_R || is_FPU_W)
- CC_size = sizeof(idCC);
- else
- CC_size = sizeof(iCC);
-
- BBCC_size += CC_size;
- is_LOAD = is_STORE = is_FPU_R = is_FPU_W = False;
- break;
-
- case LOAD:
- /* Two LDBs are possible for a single instruction */
- /* Also, a STORE can come after a LOAD for bts/btr/btc */
- sk_assert(/*!is_LOAD &&*/ /* !is_STORE && */
- !is_FPU_R && !is_FPU_W);
- t_read = u_in->val1;
- is_LOAD = True;
- break;
-
- case STORE:
- /* Multiple STOREs are possible for 'pushal' */
- sk_assert( /*!is_STORE &&*/ !is_FPU_R && !is_FPU_W);
- t_write = u_in->val2;
- is_STORE = True;
- break;
-
- case MMX2_MemRd:
- sk_assert(u_in->size == 4 || u_in->size == 8);
- /* fall through */
- case FPU_R:
- sk_assert(!is_LOAD && !is_STORE && !is_FPU_R && !is_FPU_W);
- t_read = u_in->val2;
- is_FPU_R = True;
- break;
-
- case MMX2a1_MemRd:
- sk_assert(u_in->size == 8);
- sk_assert(!is_LOAD && !is_STORE && !is_FPU_R && !is_FPU_W);
- t_read = u_in->val3;
- is_FPU_R = True;
- break;
-
- case SSE2a_MemRd:
- case SSE2a1_MemRd:
- sk_assert(u_in->size == 4 || u_in->size == 8 || u_in->size == 16 || u_in->size == 512);
- t_read = u_in->val3;
- is_FPU_R = True;
- break;
-
- case SSE3a_MemRd:
- sk_assert(u_in->size == 4 || u_in->size == 8 || u_in->size == 16);
- t_read = u_in->val3;
- is_FPU_R = True;
- break;
-
- case SSE3a1_MemRd:
- sk_assert(u_in->size == 8 || u_in->size == 16);
- t_read = u_in->val3;
- is_FPU_R = True;
- break;
-
- case SSE3ag_MemRd_RegWr:
- sk_assert(u_in->size == 4 || u_in->size == 8);
- t_read = u_in->val1;
- is_FPU_R = True;
- break;
-
- case MMX2_MemWr:
- sk_assert(u_in->size == 4 || u_in->size == 8);
- /* fall through */
- case FPU_W:
- sk_assert(!is_LOAD && !is_STORE && !is_FPU_R && !is_FPU_W);
- t_write = u_in->val2;
- is_FPU_W = True;
- break;
-
- case SSE2a_MemWr:
- sk_assert(u_in->size == 4 || u_in->size == 8 || u_in->size == 16 || u_in->size == 512);
- t_write = u_in->val3;
- is_FPU_W = True;
- break;
-
- case SSE3a_MemWr:
- sk_assert(u_in->size == 4 || u_in->size == 8 || u_in->size == 16);
- t_write = u_in->val3;
- is_FPU_W = True;
- break;
-
- default:
- break;
- }
- }
-
- return BBCC_size;
-}
-
static __attribute__ ((regparm (1)))
-void log_1I_0D_cache_access(iCC* cc)
+void log_1I_0D_cache_access(instr_info* n)
{
//VG_(printf)("1I_0D: CCaddr=0x%x, iaddr=0x%x, isize=%u\n",
- // cc, cc->instr_addr, cc->instr_size)
+ // n, n->instr_addr, n->instr_size)
VGP_PUSHCC(VgpCacheSimulate);
- cachesim_I1_doref(cc->instr_addr, cc->instr_size, &cc->I.m1, &cc->I.m2);
- cc->I.a++;
+ cachesim_I1_doref(n->instr_addr, n->instr_size,
+ &n->parent->Ir.m1, &n->parent->Ir.m2);
+ n->parent->Ir.a++;
VGP_POPCC(VgpCacheSimulate);
}
-/* Difference between this function and log_1I_0D_cache_access() is that
- this one can be passed any kind of CC, not just an iCC. So we have to
- be careful to make sure we don't make any assumptions about CC layout.
- (As it stands, they would be safe, but this will avoid potential heartache
- if anyone else changes CC layout.)
- Note that we only do the switch for the JIFZ version because if we always
- called this switching version, things would run about 5% slower. */
-static __attribute__ ((regparm (1)))
-void log_1I_0D_cache_access_JIFZ(iCC* cc)
+static __attribute__ ((regparm (2)))
+void log_1I_1Dr_cache_access(instr_info* n, Addr data_addr)
{
- UChar instr_size;
- Addr instr_addr;
- CC* I;
-
- //VG_(printf)("1I_0D: CCaddr=0x%x, iaddr=0x%x, isize=%u\n",
- // cc, cc->instr_addr, cc->instr_size)
+ //VG_(printf)("1I_1Dr: CCaddr=%p, iaddr=%p, isize=%u, daddr=%p, dsize=%u\n",
+ // n, n->instr_addr, n->instr_size, data_addr, n->data_size)
VGP_PUSHCC(VgpCacheSimulate);
+ cachesim_I1_doref(n->instr_addr, n->instr_size,
+ &n->parent->Ir.m1, &n->parent->Ir.m2);
+ n->parent->Ir.a++;
- switch(cc->tag) {
- case InstrCC:
- instr_size = cc->instr_size;
- instr_addr = cc->instr_addr;
- I = &(cc->I);
- break;
- case ReadCC:
- case WriteCC:
- case ModCC:
- instr_size = ((idCC*)cc)->instr_size;
- instr_addr = ((idCC*)cc)->instr_addr;
- I = &( ((idCC*)cc)->I );
- break;
- case ReadWriteCC:
- instr_size = ((iddCC*)cc)->instr_size;
- instr_addr = ((iddCC*)cc)->instr_addr;
- I = &( ((iddCC*)cc)->I );
- break;
- default:
- VG_(skin_panic)("Unknown CC type in log_1I_0D_cache_access_JIFZ()\n");
- break;
- }
- cachesim_I1_doref(instr_addr, instr_size, &I->m1, &I->m2);
- I->a++;
+ cachesim_D1_doref(data_addr, n->data_size,
+ &n->parent->Dr.m1, &n->parent->Dr.m2);
+ n->parent->Dr.a++;
VGP_POPCC(VgpCacheSimulate);
}
-__attribute__ ((regparm (2))) static
-void log_0I_1D_cache_access(idCC* cc, Addr data_addr)
+static __attribute__ ((regparm (2)))
+void log_1I_1Dw_cache_access(instr_info* n, Addr data_addr)
{
- //VG_(printf)("0I_1D: CCaddr=%p, iaddr=%p, isize=%u, daddr=%p, dsize=%u\n",
- // cc, cc->instr_addr, cc->instr_size, data_addr, cc->data_size)
+ //VG_(printf)("1I_1Dw: CCaddr=%p, iaddr=%p, isize=%u, daddr=%p, dsize=%u\n",
+ // n, n->instr_addr, n->instr_size, data_addr, n->data_size)
VGP_PUSHCC(VgpCacheSimulate);
- cachesim_D1_doref(data_addr, cc->data_size, &cc->D.m1, &cc->D.m2);
- cc->D.a++;
+ cachesim_I1_doref(n->instr_addr, n->instr_size,
+ &n->parent->Ir.m1, &n->parent->Ir.m2);
+ n->parent->Ir.a++;
+
+ cachesim_D1_doref(data_addr, n->data_size,
+ &n->parent->Dw.m1, &n->parent->Dw.m2);
+ n->parent->Dw.a++;
VGP_POPCC(VgpCacheSimulate);
}
-__attribute__ ((regparm (2))) static
-void log_1I_1D_cache_access(idCC* cc, Addr data_addr)
+static __attribute__ ((regparm (3)))
+void log_1I_2D_cache_access(instr_info* n, Addr data_addr1, Addr data_addr2)
{
- //VG_(printf)("1I_1D: CCaddr=%p, iaddr=%p, isize=%u, daddr=%p, dsize=%u\n",
- // cc, cc->instr_addr, cc->instr_size, data_addr, cc->data_size)
+ //VG_(printf)("1I_2D: CCaddr=%p, iaddr=%p, isize=%u, daddr1=%p, daddr2=%p, dsize=%u\n",
+ // n, n->instr_addr, n->instr_size, data_addr1, data_addr2, n->data_size)
VGP_PUSHCC(VgpCacheSimulate);
- cachesim_I1_doref(cc->instr_addr, cc->instr_size, &cc->I.m1, &cc->I.m2);
- cc->I.a++;
+ cachesim_I1_doref(n->instr_addr, n->instr_size,
+ &n->parent->Ir.m1, &n->parent->Ir.m2);
+ n->parent->Ir.a++;
- cachesim_D1_doref(data_addr, cc->data_size, &cc->D.m1, &cc->D.m2);
- cc->D.a++;
+ cachesim_D1_doref(data_addr1, n->data_size,
+ &n->parent->Dr.m1, &n->parent->Dr.m2);
+ n->parent->Dr.a++;
+ cachesim_D1_doref(data_addr2, n->data_size,
+ &n->parent->Dw.m1, &n->parent->Dw.m2);
+ n->parent->Dw.a++;
VGP_POPCC(VgpCacheSimulate);
}
-__attribute__ ((regparm (3))) static
-void log_0I_2D_cache_access(iddCC* cc, Addr data_addr1, Addr data_addr2)
+/*------------------------------------------------------------*/
+/*--- Instrumentation ---*/
+/*------------------------------------------------------------*/
+
+BB_info* get_BB_info(UCodeBlock* cb_in, Addr orig_addr, Bool* bb_seen_before)
{
- //VG_(printf)("0I_2D: CCaddr=%p, iaddr=%p, isize=%u, daddr1=0x%x, daddr2=%p, size=%u\n",
- // cc, cc->instr_addr, cc->instr_size, data_addr1, data_addr2, cc->data_size)
- VGP_PUSHCC(VgpCacheSimulate);
- cachesim_D1_doref(data_addr1, cc->data_size, &cc->Da.m1, &cc->Da.m2);
- cc->Da.a++;
- cachesim_D1_doref(data_addr2, cc->data_size, &cc->Db.m1, &cc->Db.m2);
- cc->Db.a++;
- VGP_POPCC(VgpCacheSimulate);
+ Int i, n_instrs;
+ UInstr* u_in;
+ BB_info* bb_info;
+ VgHashNode** dummy;
+
+ // Count number of x86 instrs in BB
+ n_instrs = 1; // start at 1 because last x86 instr has no INCEIP
+ for (i = 0; i < VG_(get_num_instrs)(cb_in); i++) {
+ u_in = VG_(get_instr)(cb_in, i);
+ if (INCEIP == u_in->opcode) n_instrs++;
+ }
+
+ // Get the BB_info
+ bb_info = (BB_info*)VG_(HT_get_node)(instr_info_table, orig_addr, &dummy);
+ *bb_seen_before = ( NULL == bb_info ? False : True );
+ if (*bb_seen_before) {
+ // BB must have been translated before, but flushed from the TT
+ sk_assert(bb_info->n_instrs == n_instrs );
+ BB_retranslations++;
+ } else {
+ // BB never translated before (at this address, at least; could have
+ // been unloaded and then reloaded elsewhere in memory)
+ bb_info =
+ VG_(calloc)(1, sizeof(BB_info) + n_instrs*sizeof(instr_info));
+ bb_info->BB_addr = orig_addr;
+ bb_info->n_instrs = n_instrs;
+ VG_(HT_add_node)( instr_info_table, (VgHashNode*)bb_info );
+ distinct_instrs++;
+ }
+ return bb_info;
}
-__attribute__ ((regparm (3))) static
-void log_1I_2D_cache_access(iddCC* cc, Addr data_addr1, Addr data_addr2)
+void do_details( instr_info* n, Bool bb_seen_before,
+ Addr instr_addr, Int instr_size, Int data_size )
{
- //VG_(printf)("1I_2D: CCaddr=%p, iaddr=%p, isize=%u, daddr1=%p, daddr2=%p, dsize=%u\n",
- // cc, cc->instr_addr, cc->instr_size, data_addr1, data_addr2, cc->data_size)
- VGP_PUSHCC(VgpCacheSimulate);
- cachesim_I1_doref(cc->instr_addr, cc->instr_size, &cc->I.m1, &cc->I.m2);
- cc->I.a++;
+ lineCC* parent = get_lineCC(instr_addr);
+ if (bb_seen_before) {
+ sk_assert( n->instr_addr == instr_addr );
+ sk_assert( n->instr_size == instr_size );
+ sk_assert( n->data_size == data_size );
+ // Don't assert that (n->parent == parent)... it's conceivable that
+ // the debug info might change; the other asserts should be enough to
+ // detect anything strange.
+ } else {
+ n->instr_addr = instr_addr;
+ n->instr_size = instr_size;
+ n->data_size = data_size;
+ n->parent = parent;
+ }
+}
- cachesim_D1_doref(data_addr1, cc->data_size, &cc->Da.m1, &cc->Da.m2);
- cc->Da.a++;
- cachesim_D1_doref(data_addr2, cc->data_size, &cc->Db.m1, &cc->Db.m2);
- cc->Db.a++;
- VGP_POPCC(VgpCacheSimulate);
+Bool is_valid_data_size(Int data_size)
+{
+ return (4 == data_size || 2 == data_size || 1 == data_size ||
+ 8 == data_size || 10 == data_size || MIN_LINE_SIZE == data_size);
}
-UCodeBlock* SK_(instrument)(UCodeBlock* cb_in, Addr orig_addr)
+// Instrumentation for the end of each x86 instruction.
+void end_of_x86_instr(UCodeBlock* cb, instr_info* i_node, Bool bb_seen_before,
+ UInt instr_addr, UInt instr_size, UInt data_size,
+ Int t_read, Int t_read_addr,
+ Int t_write, Int t_write_addr)
{
-/* Use this rather than eg. -1 because it's a UInt. */
-#define INVALID_DATA_SIZE 999999
+ Addr helper;
+ Int argc;
+ Int t_CC_addr,
+ t_data_addr1 = INVALID_TEMPREG,
+ t_data_addr2 = INVALID_TEMPREG;
+
+ sk_assert(instr_size >= 1 &&
+ instr_size <= MAX_x86_INSTR_SIZE);
+
+#define IS_(X) (INVALID_TEMPREG != t_##X##_addr)
+#define INV(qqt) (INVALID_TEMPREG == (qqt))
+
+ // Work out what kind of x86 instruction it is
+ if (!IS_(read) && !IS_(write)) {
+ sk_assert( 0 == data_size );
+ sk_assert(INV(t_read) && INV(t_write));
+ helper = (Addr) & log_1I_0D_cache_access;
+ argc = 1;
+ } else if (IS_(read) && !IS_(write)) {
+ sk_assert( is_valid_data_size(data_size) );
+ sk_assert(!INV(t_read) && INV(t_write));
+ helper = (Addr) & log_1I_1Dr_cache_access;
+ argc = 2;
+ t_data_addr1 = t_read_addr;
+
+ } else if (!IS_(read) && IS_(write)) {
+ sk_assert( is_valid_data_size(data_size) );
+ sk_assert(INV(t_read) && !INV(t_write));
+ helper = (Addr) & log_1I_1Dw_cache_access;
+ argc = 2;
+ t_data_addr1 = t_write_addr;
+
+ } else {
+ sk_assert(IS_(read) && IS_(write));
+ sk_assert( is_valid_data_size(data_size) );
+ sk_assert(!INV(t_read) && !INV(t_write));
+ if (t_read == t_write) {
+ helper = (Addr) & log_1I_1Dr_cache_access;
+ argc = 2;
+ t_data_addr1 = t_read_addr;
+ } else {
+ helper = (Addr) & log_1I_2D_cache_access;
+ argc = 3;
+ t_data_addr1 = t_read_addr;
+ t_data_addr2 = t_write_addr;
+ }
+ }
+#undef IS_
+ // Setup 1st arg: CC addr
+ do_details( i_node, bb_seen_before, instr_addr, instr_size, data_size );
+ t_CC_addr = newTemp(cb);
+ uInstr2(cb, MOV, 4, Literal, 0, TempReg, t_CC_addr);
+ uLiteral(cb, (Addr)i_node);
+
+ // Call the helper
+ if (1 == argc)
+ uInstr1(cb, CCALL, 0, TempReg, t_CC_addr);
+ else if (2 == argc)
+ uInstr2(cb, CCALL, 0, TempReg, t_CC_addr,
+ TempReg, t_data_addr1);
+ else if (3 == argc)
+ uInstr3(cb, CCALL, 0, TempReg, t_CC_addr,
+ TempReg, t_data_addr1,
+ TempReg, t_data_addr2);
+ else
+ VG_(skin_panic)("argc... not 1 or 2 or 3?");
+
+ uCCall(cb, helper, argc, argc, False);
+}
+
+UCodeBlock* SK_(instrument)(UCodeBlock* cb_in, Addr orig_addr)
+{
UCodeBlock* cb;
- Int i;
UInstr* u_in;
- BBCC* BBCC_node;
- Int t_CC_addr, t_read_addr, t_write_addr, t_data_addr1,
- t_data_addr2, t_read, t_write;
- Int CC_size = -1; /* Shut gcc warnings up */
+ Int i, bb_info_i;
+ BB_info* bb_info;
+ Bool bb_seen_before = False;
+ Int t_read_addr, t_write_addr, t_read, t_write;
Addr x86_instr_addr = orig_addr;
- UInt x86_instr_size, data_size = INVALID_DATA_SIZE;
- Addr helper;
- Int argc;
- Bool BB_seen_before = False;
- Bool instrumented_Jcond = False;
- Bool has_rep_prefix = False;
- Addr BBCC_ptr0, BBCC_ptr;
+ UInt x86_instr_size, data_size = 0;
+ Bool instrumented_Jcc = False;
- /* Get BBCC (creating if necessary -- requires a counting pass over the BB
- * if it's the first time it's been seen), and point to start of the
- * BBCC array. */
- BBCC_node = get_BBCC(orig_addr, cb_in, /*remove=*/False, &BB_seen_before);
- BBCC_ptr0 = BBCC_ptr = (Addr)(BBCC_node->array);
+ bb_info = get_BB_info(cb_in, orig_addr, &bb_seen_before);
+ bb_info_i = 0;
cb = VG_(setup_UCodeBlock)(cb_in);
- t_CC_addr = t_read_addr = t_write_addr = t_data_addr1 = t_data_addr2 =
- t_read = t_write = INVALID_TEMPREG;
+ t_read_addr = t_write_addr = t_read = t_write = INVALID_TEMPREG;
for (i = 0; i < VG_(get_num_instrs)(cb_in); i++) {
u_in = VG_(get_instr)(cb_in, i);
- /* What this is all about: we want to instrument each x86 instruction
- * translation. The end of these are marked in three ways. The three
- * ways, and the way we instrument them, are as follows:
- *
- * 1. UCode, INCEIP --> UCode, Instrumentation, INCEIP
- * 2. UCode, Juncond --> UCode, Instrumentation, Juncond
- * 3. UCode, Jcond, Juncond --> UCode, Instrumentation, Jcond, Juncond
- *
- * The last UInstr in a basic block is always a Juncond. Jconds,
- * when they appear, are always second last. We check this with
- * various assertions.
- *
- * We must put the instrumentation before any jumps so that it is always
- * executed. We don't have to put the instrumentation before the INCEIP
- * (it could go after) but we do so for consistency.
- *
- * x86 instruction sizes are obtained from INCEIPs (for case 1) or
- * from .extra4b field of the final JMP (for case 2 & 3).
- *
- * Note that JIFZ is treated differently.
- *
- * The instrumentation is just a call to the appropriate helper function,
- * passing it the address of the instruction's CC.
- */
- if (instrumented_Jcond) sk_assert(u_in->opcode == JMP);
+ // We want to instrument each x86 instruction with a call to the
+ // appropriate simulation function, which depends on whether the
+ // instruction does memory data reads/writes. x86 instructions can
+ // end in three ways, and this is how they are instrumented:
+ //
+ // 1. UCode, INCEIP --> UCode, Instrumentation, INCEIP
+ // 2. UCode, JMP --> UCode, Instrumentation, JMP
+ // 3. UCode, Jcc, JMP --> UCode, Instrumentation, Jcc, JMP
+ //
+ // The last UInstr in a BB is always a JMP. Jccs, when they appear,
+ // are always second last. This is checked with assertions.
+ // Instrumentation must go before any jumps. (JIFZ is the exception;
+ // if a JIFZ succeeds, no simulation is done for the instruction.)
+ //
+ // x86 instruction sizes are obtained from INCEIPs (for case 1) or
+ // from .extra4b field of the final JMP (for case 2 & 3).
+
+ if (instrumented_Jcc) sk_assert(u_in->opcode == JMP);
switch (u_in->opcode) {
- case NOP: case LOCK: case CALLM_E: case CALLM_S:
- break;
- /* For memory-ref instrs, copy the data_addr into a temporary to be
- * passed to the cachesim_* helper at the end of the instruction.
- */
+ // For memory-ref instrs, copy the data_addr into a temporary to be
+ // passed to the cachesim_* helper at the end of the instruction.
case LOAD:
+ case SSE3ag_MemRd_RegWr:
t_read = u_in->val1;
t_read_addr = newTemp(cb);
@@ -852,14 +548,10 @@ UCodeBlock* SK_(instrument)(UCodeBlock*
break;
- case MMX2_MemRd:
- sk_assert(u_in->size == 4 || u_in->size == 8);
- /* fall through */
case FPU_R:
+ case MMX2_MemRd:
t_read = u_in->val2;
t_read_addr = newTemp(cb);
uInstr2(cb, MOV, 4, TempReg, u_in->val2, TempReg, t_read_addr);
- data_size = ( u_in->size <= MIN_LINE_SIZE
- ? u_in->size
- : MIN_LINE_SIZE);
+ data_size = u_in->size;
VG_(copy_UInstr)(cb, u_in);
break;
@@ -867,40 +559,8 @@ UCodeBlock* SK_(instrument)(UCodeBlock*
case MMX2a1_MemRd:
- sk_assert(u_in->size == 8);
- t_read = u_in->val3;
- t_read_addr = newTemp(cb);
- uInstr2(cb, MOV, 4, TempReg, u_in->val3, TempReg, t_read_addr);
- data_size = ( u_in->size <= MIN_LINE_SIZE
- ? u_in->size
- : MIN_LINE_SIZE);
- VG_(copy_UInstr)(cb, u_in);
- break;
-
case SSE2a_MemRd:
case SSE2a1_MemRd:
- sk_assert(u_in->size == 4 || u_in->size == 8 || u_in->size == 16 || u_in->size == 512);
- t_read = u_in->val3;
- t_read_addr = newTemp(cb);
- uInstr2(cb, MOV, 4, TempReg, u_in->val3, TempReg, t_read_addr);
- /* 512 B data-sized instructions will be done inaccurately
- * but they're very rare and this avoids errors from
- * hitting more than two cache lines in the simulation. */
- data_size = ( u_in->size <= MIN_LINE_SIZE
- ? u_in->size
- : MIN_LINE_SIZE);
- VG_(copy_UInstr)(cb, u_in);
- break;
-
case SSE3a_MemRd:
- sk_assert(u_in->size == 4 || u_in->size == 8 || u_in->size == 16);
- t_read = u_in->val3;
- t_read_addr = newTemp(cb);
- uInstr2(cb, MOV, 4, TempReg, u_in->val3, TempReg, t_read_addr);
- data_size = u_in->size;
- VG_(copy_UInstr)(cb, u_in);
- break;
-
case SSE3a1_MemRd:
- sk_assert(u_in->size == 8 || u_in->size == 16);
t_read = u_in->val3;
t_read_addr = newTemp(cb);
@@ -910,232 +570,74 @@ UCodeBlock* SK_(instrument)(UCodeBlock*
break;
- case SSE3ag_MemRd_RegWr:
- sk_assert(u_in->size == 4 || u_in->size == 8);
- t_read = u_in->val1;
- t_read_addr = newTemp(cb);
- uInstr2(cb, MOV, 4, TempReg, u_in->val1, TempReg, t_read_addr);
- data_size = u_in->size;
- VG_(copy_UInstr)(cb, u_in);
- break;
-
- /* Note that we must set t_write_addr even for mod instructions;
- * That's how the code above determines whether it does a write.
- * Without it, it would think a mod instruction is a read.
- * As for the MOV, if it's a mod instruction it's redundant, but it's
- * not expensive and mod instructions are rare anyway. */
- case MMX2_MemWr:
- sk_assert(u_in->size == 4 || u_in->size == 8);
- /* fall through */
+ // Note that we must set t_write_addr even for mod instructions;
+ // That's how the code above determines whether it does a write.
+ // Without it, it would think a mod instruction is a read.
+ // As for the MOV, if it's a mod instruction it's redundant, but it's
+ // not expensive and mod instructions are rare anyway.
case STORE:
case FPU_W:
+ case MMX2_MemWr:
t_write = u_in->val2;
t_write_addr = newTemp(cb);
uInstr2(cb, MOV, 4, TempReg, u_in->val2, TempReg, t_write_addr);
- /* 28 and 108 B data-sized instructions will be done
- * inaccurately but they're very rare and this avoids errors
- * from hitting more than two cache lines in the simulation. */
- data_size = ( u_in->size <= MIN_LINE_SIZE
- ? u_in->size
- : MIN_LINE_SIZE);
+ data_size = u_in->size;
VG_(copy_UInstr)(cb, u_in);
break;
case SSE2a_MemWr:
- sk_assert(u_in->size == 4 || u_in->size == 8 || u_in->size == 16 || u_in->size == 512);
- /* fall through */
case SSE3a_MemWr:
- sk_assert(u_in->size == 4 || u_in->size == 8 || u_in->size == 16 || u_in->size == 512);
t_write = u_in->val3;
t_write_addr = newTemp(cb);
uInstr2(cb, MOV, 4, TempReg, u_in->val3, TempReg, t_write_addr);
- /* 512 B data-sized instructions will be done inaccurately
- * but they're very rare and this avoids errors from
- * hitting more than two cache lines in the simulation. */
- data_size = ( u_in->size <= MIN_LINE_SIZE
- ? u_in->size
- : MIN_LINE_SIZE);
- VG_(copy_UInstr)(cb, u_in);
- break;
-
- /* For rep-prefixed instructions, log a single I-cache access
- * before the UCode loop that implements the repeated part, which
- * is where the multiple D-cache accesses are logged. */
- case JIFZ:
- has_rep_prefix = True;
-
- /* Setup 1st and only arg: CC addr */
- t_CC_addr = newTemp(cb);
- uInstr2(cb, MOV, 4, Literal, 0, TempReg, t_CC_addr);
- uLiteral(cb, BBCC_ptr);
-
- /* Call helper */
- uInstr1(cb, CCALL, 0, TempReg, t_CC_addr);
- uCCall(cb, (Addr) & log_1I_0D_cache_access_JIFZ, 1, 1, False);
+ data_size = u_in->size;
VG_(copy_UInstr)(cb, u_in);
break;
-
- /* INCEIP: insert instrumentation */
+ // INCEIP: insert instrumentation
case INCEIP:
x86_instr_size = u_in->val1;
goto instrument_x86_instr;
- /* JMP: insert instrumentation if the first JMP */
+ // JMP: insert instrumentation if the first JMP
case JMP:
- if (instrumented_Jcond) {
+ if (instrumented_Jcc) {
sk_assert(CondAlways == u_in->cond);
sk_assert(i+1 == VG_(get_num_instrs)(cb_in));
VG_(copy_UInstr)(cb, u_in);
- instrumented_Jcond = False; /* reset */
+ instrumented_Jcc = False; // reset
break;
- }
- /* The first JMP... instrument. */
+ } else {
+ // The first JMP... instrument.
if (CondAlways != u_in->cond) {
sk_assert(i+2 == VG_(get_num_instrs)(cb_in));
- instrumented_Jcond = True;
+ instrumented_Jcc = True;
} else {
sk_assert(i+1 == VG_(get_num_instrs)(cb_in));
}
-
- /* Get x86 instr size from final JMP. */
+ // Get x86 instr size from final JMP.
x86_instr_size = VG_(get_last_instr)(cb_in)->extra4b;
-
goto instrument_x86_instr;
-
-
- /* Code executed at the end of each x86 instruction. */
- instrument_x86_instr:
-
- /* Initialise the CC in the BBCC array appropriately if it
- * hasn't been initialised before. Then call appropriate sim
- * function, passing it the CC address. */
- sk_assert(x86_instr_size >= 1 &&
- x86_instr_size <= MAX_x86_INSTR_SIZE);
-
-#define IS_(X) (INVALID_TEMPREG != t_##X##_addr)
-
- if (!IS_(read) && !IS_(write)) {
- sk_assert(INVALID_DATA_SIZE == data_size);
- sk_assert(INVALID_TEMPREG == t_read_addr &&
- INVALID_TEMPREG == t_read &&
- INVALID_TEMPREG == t_write_addr &&
- INVALID_TEMPREG == t_write);
- CC_size = sizeof(iCC);
- if (!BB_seen_before)
- init_iCC((iCC*)BBCC_ptr, x86_instr_addr, x86_instr_size);
- helper = ( has_rep_prefix
- ? (Addr)0 /* no extra log needed */
- : (Addr) & log_1I_0D_cache_access
- );
- argc = 1;
-
- } else {
- sk_assert(4 == data_size || 2 == data_size || 1 == data_size ||
- 8 == data_size || 10 == data_size ||
- MIN_LINE_SIZE == data_size);
-
- if (IS_(read) && !IS_(write)) {
- CC_size = sizeof(idCC);
- /* If it uses 'rep', we've already logged the I-cache
- * access at the JIFZ UInstr (see JIFZ case below) so
- * don't do it here */
- helper = ( has_rep_prefix
- ? (Addr) & log_0I_1D_cache_access
- : (Addr) & log_1I_1D_cache_access
- );
- argc = 2;
- if (!BB_seen_before)
- init_idCC(ReadCC, (idCC*)BBCC_ptr, x86_instr_addr,
- x86_instr_size, data_size);
- sk_assert(INVALID_TEMPREG != t_read_addr &&
- INVALID_TEMPREG != t_read &&
- INVALID_TEMPREG == t_write_addr &&
- INVALID_TEMPREG == t_write);
- t_data_addr1 = t_read_addr;
-
- } else if (!IS_(read) && IS_(write)) {
- CC_size = sizeof(idCC);
- helper = ( has_rep_prefix
- ? (Addr) & log_0I_1D_cache_access
- : (Addr) & log_1I_1D_cache_access
- );
- argc = 2;
- if (!BB_seen_before)
- init_idCC(WriteCC, (idCC*)BBCC_ptr, x86_instr_addr,
- x86_instr_size, data_size);
- sk_assert(INVALID_TEMPREG == t_read_addr &&
- INVALID_TEMPREG == t_read &&
- INVALID_TEMPREG != t_write_addr &&
- INVALID_TEMPREG != t_write);
- t_data_addr1 = t_write_addr;
-
- } else {
- sk_assert(IS_(read) && IS_(write));
- sk_assert(INVALID_TEMPREG != t_read_addr &&
- INVALID_TEMPREG != t_read &&
- INVALID_TEMPREG != t_write_addr &&
- INVALID_TEMPREG != t_write);
- if (t_read == t_write) {
- CC_size = sizeof(idCC);
- helper = ( has_rep_prefix
- ? (Addr) & log_0I_1D_cache_access
- : (Addr) & log_1I_1D_cache_access
- );
- argc = 2;
- if (!BB_seen_before)
- init_idCC(ModCC, (idCC*)BBCC_ptr, x86_instr_addr,
- x86_instr_size, data_size);
- t_data_addr1 = t_read_addr;
- } else {
- CC_size = sizeof(iddCC);
- helper = ( has_rep_prefix
- ? (Addr) & log_0I_2D_cache_access
- : (Addr) & log_1I_2D_cache_access
- );
- argc = 3;
- if (!BB_seen_before)
- init_iddCC((iddCC*)BBCC_ptr, x86_instr_addr,
- x86_instr_size, data_size);
- t_data_addr1 = t_read_addr;
- t_data_addr2 = t_write_addr;
- }
- }
-#undef IS_
}
- /* Call the helper, if necessary */
- if ((Addr)0 != helper) {
-
- /* Setup 1st arg: CC addr */
- t_CC_addr = newTemp(cb);
- uInstr2(cb, MOV, 4, Literal, 0, TempReg, t_CC_addr);
- uLiteral(cb, BBCC_ptr);
-
- /* Call the helper */
- if (1 == argc)
- uInstr1(cb, CCALL, 0, TempReg, t_CC_addr);
- else if (2 == argc)
- uInstr2(cb, CCALL, 0, TempReg, t_CC_addr,
- TempReg, t_data_addr1);
- else if (3 == argc)
- uInstr3(cb, CCALL, 0, TempReg, t_CC_addr,
- TempReg, t_data_addr1,
- TempReg, t_data_addr2);
- else
- VG_(skin_panic)("argc... not 1 or 2 or 3?");
+ // Code executed at the end of each x86 instruction.
+ instrument_x86_instr:
+ // Large (eg. 28B, 108B, 512B) data-sized instructions will be
+ // done inaccurately but they're very rare and this avoids
+ // errors from hitting more than two cache lines in the
+ // simulation.
+ if (data_size > MIN_LINE_SIZE) data_size = MIN_LINE_SIZE;
- uCCall(cb, helper, argc, argc, False);
- }
+ end_of_x86_instr(cb, &bb_info->instrs[ bb_info_i ], bb_seen_before,
+ x86_instr_addr, x86_instr_size, data_size,
+ t_read, t_read_addr, t_write, t_write_addr);
- /* Copy original UInstr (INCEIP or JMP) */
+ // Copy original UInstr (INCEIP or JMP)
VG_(copy_UInstr)(cb, u_in);
- /* Update BBCC_ptr, EIP, de-init read/write temps for next instr */
- BBCC_ptr += CC_size;
+ // Update loop state for next x86 instr
+ bb_info_i++;
x86_instr_addr += x86_instr_size;
- t_CC_addr = t_read_addr = t_write_addr = t_data_addr1 =
- t_data_addr2 = t_read = t_write = INVALID_TEMPREG;
- data_size...
[truncated message content] |
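For anyone skimming the patch above: get_lineCC() walks three chained hash tables in turn (file, then function, then line) and creates a node at each level when the lookup misses. Below is a minimal standalone sketch of that pattern. It is not the Cachegrind code: the type names, table sizes, string hash and `get_line_node` are all invented for illustration, and it takes the file, function and line as arguments rather than calling get_debug_info().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical table sizes, not the ones Cachegrind uses. */
enum { N_FILE = 251, N_FN = 37, N_LINE = 13 };

typedef struct line_node {
    int line;
    unsigned long accesses;      /* stand-in for the per-line cost centre */
    struct line_node *next;
} line_node;

typedef struct fn_node {
    char *fn;
    struct fn_node *next;
    line_node *lines[N_LINE];
} fn_node;

typedef struct file_node {
    char *file;
    struct file_node *next;
    fn_node *fns[N_FN];
} file_node;

static file_node *table[N_FILE];

/* Simple illustrative string hash (an assumption; any hash would do). */
static unsigned hash_str(const char *s, unsigned n)
{
    unsigned h = 0;
    while (*s) h = h * 31u + (unsigned char)*s++;
    return h % n;
}

static char *dup_str(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p) memcpy(p, s, n);
    return p;
}

/* Find the node for (file, fn, line), creating missing levels on the way.
   Returns NULL only if an allocation fails. */
line_node *get_line_node(const char *file, const char *fn, int line)
{
    unsigned h;
    file_node *f;
    fn_node *g;
    line_node *l;

    assert(file != NULL && fn != NULL);

    /* level 1: file */
    h = hash_str(file, N_FILE);
    for (f = table[h]; f != NULL && strcmp(f->file, file) != 0; f = f->next)
        ;
    if (f == NULL) {
        if ((f = calloc(1, sizeof *f)) == NULL) return NULL;
        if ((f->file = dup_str(file)) == NULL) { free(f); return NULL; }
        f->next = table[h];
        table[h] = f;
    }

    /* level 2: function within that file */
    h = hash_str(fn, N_FN);
    for (g = f->fns[h]; g != NULL && strcmp(g->fn, fn) != 0; g = g->next)
        ;
    if (g == NULL) {
        if ((g = calloc(1, sizeof *g)) == NULL) return NULL;
        if ((g->fn = dup_str(fn)) == NULL) { free(g); return NULL; }
        g->next = f->fns[h];
        f->fns[h] = g;
    }

    /* level 3: line within that function */
    h = (unsigned)line % N_LINE;
    for (l = g->lines[h]; l != NULL && l->line != line; l = l->next)
        ;
    if (l == NULL) {
        if ((l = calloc(1, sizeof *l)) == NULL) return NULL;
        l->line = line;
        l->next = g->lines[h];
        g->lines[h] = l;
    }
    return l;
}
```

Calling `get_line_node("a.c", "main", 10)` twice returns the same node, so every instruction that maps to the same source line ends up sharing one set of counters. As far as the patch shows, that sharing is what the `parent` pointer in `instr_info` relies on.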
|
From: Jeremy F. <je...@go...> - 2004-07-06 21:56:30
|
On Tue, 2004-07-06 at 17:31 +0100, Nicholas Nethercote wrote:
> Hmm, good point. Does anyone use 2G shapes? Are there any other common
> shapes? Ideally we'd be able to detect this at run-time.

Yes, people do. If you have 2G of physical memory and you want the kernel to use all of it directly (ie, not use highmem), you need to give the kernel enough address space to map it all: hence 2G+2G.

J |
|
From: Jeremy F. <je...@go...> - 2004-07-06 21:53:20
|
On Mon, 2004-07-05 at 08:47 +0100, Tom Hughes wrote:
> I agree that it's pretty braindead. In fact older versions of libaio
> don't seem to peek at the memory in this way, but the fact that it is
> mapped in user space suggests that it was always intended.

Well, I think the practical solution is to do what we do to convince ld.so to put things in the right place: pad the Valgrind portion of the address space before the io_setup, and remove the padding afterwards. That will force io_setup to allocate its mapping in the client address space.

J |
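To make the padding idea concrete, here is a rough sketch of the general technique, assuming Linux mmap() semantics: occupy a range with an inaccessible mapping so the kernel has to place new mappings elsewhere, then drop the placeholder. It is not Valgrind's actual padding code. The names `pad_reserve`, `pad_release` and `ranges_overlap` are hypothetical, and the sketch passes the address only as a hint, where real code would need to cover the exact range it wants protected.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS on strict compilers */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Occupy `len` bytes of address space with an inaccessible placeholder.
   `hint` is only a hint here (no MAP_FIXED), so nothing existing is
   clobbered.  Returns MAP_FAILED on error. */
void *pad_reserve(void *hint, size_t len)
{
    assert(len > 0);
    return mmap(hint, len, PROT_NONE,
                MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
}

/* Remove the placeholder again.  Returns 0 on success, -1 on error. */
int pad_release(void *pad, size_t len)
{
    return munmap(pad, len);
}

/* True if [a, a+alen) and [b, b+blen) share at least one byte. */
int ranges_overlap(const void *a, size_t alen, const void *b, size_t blen)
{
    uintptr_t a0 = (uintptr_t)a, b0 = (uintptr_t)b;
    return a0 < b0 + blen && b0 < a0 + alen;
}
```

The intended use is reserve, make the call that creates a mapping (io_setup in the case discussed here), then release. While the placeholder is in place the kernel cannot put the new mapping inside it, and `ranges_overlap` is just a way to check that afterwards.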
|
From: Nicholas N. <nj...@ca...> - 2004-07-06 16:31:27
|
On Tue, 6 Jul 2004, Jeremy Fitzhardinge wrote:
> But I think I'd prefer it if we built versions of Valgrind for a
> selection of address space shapes (2G, 3G, 4G) and selected one at
> runtime. The address space shape can change too easily: it just depends
> on which kernel you boot (and it could be an exec-time configuration).

Hmm, good point. Does anyone use 2G shapes? Are there any other common shapes? Ideally we'd be able to detect this at run-time.

N |
|
From: Jeremy F. <je...@go...> - 2004-07-06 16:26:42
|
On Mon, 2004-07-05 at 10:29, Nicholas Nethercote wrote:
> On Mon, 5 Jul 2004, Jeremy Fitzhardinge wrote:
>
> > I think for now, the best way squeeze everything into the address space
> > is:
> > * take advantage of the 4G/4G patches which some distros are
> > shipping with; that gives us an extra Gbyte of address space to
>
> Do you know how to detect this at configure-time?

I think you could write a script which looks at /proc/self/maps and has a guess. But I think I'd prefer it if we built versions of Valgrind for a selection of address space shapes (2G, 3G, 4G) and selected one at runtime. The address space shape can change too easily: it just depends on which kernel you boot (and it could be an exec-time configuration).

J |
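As a rough illustration of the "look at /proc/self/maps and have a guess" idea, the sketch below finds the highest mapped end address and rounds it up to a whole number of GiB. This is a crude heuristic and only an assumption about how such a guess might work: the function names are hypothetical, and special mappings near the top of the address space (a vsyscall page, for instance) could push the guess too high.

```c
#include <assert.h>
#include <stdio.h>

/* Parse the "start-end" hex range at the front of one maps line.
   Returns 1 on success, 0 if the line does not start with a range. */
int parse_maps_line(const char *line,
                    unsigned long long *start,
                    unsigned long long *end)
{
    assert(line != NULL && start != NULL && end != NULL);
    return sscanf(line, "%llx-%llx", start, end) == 2;
}

/* Scan a maps-format stream and return the highest end address seen
   (0 if no line could be parsed). */
unsigned long long highest_mapped_end(FILE *maps)
{
    char line[512];
    unsigned long long start, end, highest = 0;

    while (fgets(line, sizeof line, maps) != NULL) {
        if (parse_maps_line(line, &start, &end) && end > highest)
            highest = end;
    }
    return highest;
}

/* Hypothetical heuristic: round the highest mapped end address up to a
   whole number of GiB and call that the size of the user address space. */
unsigned guess_user_space_gb(unsigned long long highest_end)
{
    const unsigned long long gib = 1ULL << 30;
    return (unsigned)((highest_end + gib - 1) / gib);
}
```

A caller would open /proc/self/maps with fopen(), pass the stream to `highest_mapped_end`, and feed the result to `guess_user_space_gb`. On a 32-bit process whose stack ends at 0xc0000000 that gives 3, which corresponds to the usual 3G shape.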
|
From: Nicholas N. <nj...@ca...> - 2004-07-06 08:48:11
|
[Sorry for the dup, my original too-big message must have been approved by the list moderator] |
|
From: <js...@ac...> - 2004-07-06 03:07:28
|
Nightly build on nemesis ( SuSE 9.1 ) started at 2004-07-06 03:50:00 BST

Checking out source tree ... done
Configuring ... done
Building ... done
Running regression tests ... done

Last 20 lines of log.verbose follow

memcheck/tests/null_socket (stderr)
memcheck/tests/overlap (stderr)
memcheck/tests/pth_once (stderr)
memcheck/tests/pushfpopf (stderr)
memcheck/tests/realloc1 (stderr)
memcheck/tests/realloc2 (stderr)
memcheck/tests/realloc3 (stderr)
memcheck/tests/sigaltstack (stderr)
memcheck/tests/signal2 (stderr)
memcheck/tests/supp1 (stderr)
memcheck/tests/supp2 (stderr)
memcheck/tests/suppfree (stderr)
memcheck/tests/threadederrno (stderr)
memcheck/tests/trivialleak (stderr)
memcheck/tests/tronical (stderr)
memcheck/tests/weirdioctl (stderr)
memcheck/tests/writev (stderr)
memcheck/tests/zeropage (stderr)
make: *** [regtest] Error 1 |
|
From: Tom H. <to...@co...> - 2004-07-06 02:25:10
|
Nightly build on dunsmere ( Fedora Core 2 ) started at 2004-07-06 03:20:02 BST

Checking out source tree ... done
Configuring ... done
Building ... done
Running regression tests ... done

Last 20 lines of log.verbose follow

shorts: valgrind ./shorts
smc1: valgrind ./smc1
susphello: valgrind ./susphello
syscall-restart1: valgrind ./syscall-restart1
syscall-restart2: valgrind ./syscall-restart2
system: valgrind ./system
yield: valgrind ./yield
-- Finished tests in none/tests ----------------------------------------
== 169 tests, 7 stderr failures, 1 stdout failure =================
corecheck/tests/fdleak_cmsg (stderr)
corecheck/tests/fdleak_fcntl (stderr)
corecheck/tests/fdleak_ipv4 (stderr)
corecheck/tests/fdleak_socketpair (stderr)
memcheck/tests/buflen_check (stderr)
memcheck/tests/execve (stderr)
memcheck/tests/writev (stderr)
none/tests/exec-sigmask (stdout)
make: *** [regtest] Error 1 |
|
From: Tom H. <th...@cy...> - 2004-07-06 02:20:24
|
Nightly build on audi ( Red Hat 9 ) started at 2004-07-06 03:15:03 BST
Checking out source tree ... done
Configuring ... done
Building ... done
Running regression tests ... done
Last 20 lines of log.verbose follow
shortpush: valgrind ./shortpush
shorts: valgrind ./shorts
smc1: valgrind ./smc1
susphello: valgrind ./susphello
syscall-restart1: valgrind ./syscall-restart1
syscall-restart2: valgrind ./syscall-restart2
system: valgrind ./system
yield: valgrind ./yield
-- Finished tests in none/tests ----------------------------------------
== 169 tests, 7 stderr failures, 0 stdout failures =================
corecheck/tests/fdleak_cmsg (stderr)
corecheck/tests/fdleak_fcntl (stderr)
corecheck/tests/fdleak_ipv4 (stderr)
corecheck/tests/fdleak_socketpair (stderr)
memcheck/tests/buflen_check (stderr)
memcheck/tests/execve (stderr)
memcheck/tests/writev (stderr)
make: *** [regtest] Error 1 |
|
From: Tom H. <th...@cy...> - 2004-07-06 02:13:18
|
Nightly build on ginetta ( Red Hat 8.0 ) started at 2004-07-06 03:10:02 BST
Checking out source tree ... done
Configuring ... done
Building ... done
Running regression tests ... done
Last 20 lines of log.verbose follow
sem: valgrind ./sem
semlimit: valgrind ./semlimit
sha1_test: valgrind ./sha1_test
shortpush: valgrind ./shortpush
shorts: valgrind ./shorts
smc1: valgrind ./smc1
susphello: valgrind ./susphello
syscall-restart1: valgrind ./syscall-restart1
syscall-restart2: valgrind ./syscall-restart2
system: valgrind ./system
yield: valgrind ./yield
-- Finished tests in none/tests ----------------------------------------
== 169 tests, 4 stderr failures, 0 stdout failures =================
helgrind/tests/deadlock (stderr)
helgrind/tests/race (stderr)
helgrind/tests/race2 (stderr)
memcheck/tests/writev (stderr)
make: *** [regtest] Error 1 |
|
From: Tom H. <th...@cy...> - 2004-07-06 02:08:16
|
Nightly build on alvis ( Red Hat 7.3 ) started at 2004-07-06 03:05:02 BST
Checking out source tree ... done
Configuring ... done
Building ... done
Running regression tests ... done
Last 20 lines of log.verbose follow
sha1_test: valgrind ./sha1_test
shortpush: valgrind ./shortpush
shorts: valgrind ./shorts
smc1: valgrind ./smc1
susphello: valgrind ./susphello
syscall-restart1: valgrind ./syscall-restart1
syscall-restart2: valgrind ./syscall-restart2
system: valgrind ./system
yield: valgrind ./yield
-- Finished tests in none/tests ----------------------------------------
== 169 tests, 5 stderr failures, 1 stdout failure =================
memcheck/tests/badfree-2trace (stderr)
memcheck/tests/badjump (stderr)
memcheck/tests/brk (stderr)
memcheck/tests/error_counts (stdout)
memcheck/tests/new_nothrow (stderr)
memcheck/tests/writev (stderr)
make: *** [regtest] Error 1 |
|
From: Tom H. <th...@cy...> - 2004-07-06 02:07:16
|
Nightly build on standard ( Red Hat 7.2 ) started at 2004-07-06 03:00:02 BST
Checking out source tree ... done
Configuring ... done
Building ... done
Running regression tests ... done
Last 20 lines of log.verbose follow
resolv: valgrind ./resolv
rlimit_nofile: valgrind ./rlimit_nofile
seg_override: valgrind ./seg_override
sem: valgrind ./sem
semlimit: valgrind ./semlimit
sha1_test: valgrind ./sha1_test
shortpush: valgrind ./shortpush
shorts: valgrind ./shorts
smc1: valgrind ./smc1
susphello: valgrind ./susphello
syscall-restart1: valgrind ./syscall-restart1
syscall-restart2: valgrind ./syscall-restart2
system: valgrind ./system
yield: valgrind ./yield
-- Finished tests in none/tests ----------------------------------------
== 169 tests, 1 stderr failure, 0 stdout failures =================
memcheck/tests/badfree-2trace (stderr)
make: *** [regtest] Error 1 |
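
For anyone wanting to compare these nightly reports across machines, the summary line and the list of failing tests can be pulled out mechanically. What follows is only a rough sketch and not part of the nightly build scripts: the function name `parse_report` and both regular expressions are made up for illustration, and they assume the report wording seen in the messages above ("== N tests, N stderr failures, N stdout failures" followed by lines of the form "tool/tests/name (stderr)").

```python
import re

# Hypothetical patterns, inferred from the report text quoted in these messages.
SUMMARY_RE = re.compile(
    r"==\s*(\d+) tests?, (\d+) stderr failures?, (\d+) stdout failures?\s*=+"
)
FAILURE_RE = re.compile(r"(\S+/tests/\S+) \((stderr|stdout)\)")


def parse_report(text):
    """Return (counts, failures) for one nightly report body.

    counts is a dict with the totals from the "== ... ==" summary line,
    or None when the quoted log tail does not include that line.
    failures is a list of (test_path, stream) tuples in report order.
    """
    counts = None
    summary = SUMMARY_RE.search(text)
    if summary:
        total, stderr_n, stdout_n = (int(g) for g in summary.groups())
        counts = {"tests": total, "stderr": stderr_n, "stdout": stdout_n}
    failures = FAILURE_RE.findall(text)
    return counts, failures


if __name__ == "__main__":
    # Tail of the report for "standard" ( Red Hat 7.2 ), copied from above.
    sample = (
        "-- Finished tests in none/tests ----\n"
        "== 169 tests, 1 stderr failure, 0 stdout failures ====\n"
        "memcheck/tests/badfree-2trace (stderr)\n"
        "make: *** [regtest] Error 1\n"
    )
    counts, failures = parse_report(sample)
    print(counts)
    print(failures)
```

Because the patterns do not depend on line breaks, the same function works on a report whose lines have been run together. A report whose last 20 lines contain no summary line (as in the nemesis message) gives `counts` as `None` but still lists the failing tests that are quoted.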