From: Aniruddha S. <sh...@cs...> - 2005-11-16 20:25:16
|
Hi Josef,

The profiling completes successfully with smaller runs. Since you have mentioned this, can I consider the profiling output to be relevant and complete even for the aborted run?

I would also like you to comment on my understanding of the AcCost and SpLoss events. While I get what they stand for, I am not clear on how their values are actually being obtained. From what I gather:

1) SpLoss1 is some fraction of (D1 misses * D1 line size + I1 misses * I1 line size) and hence serves as a measure of spatial loss at the L1 level.
2) SpLoss2 is some fraction of (L2 misses * L2 line size) and hence serves as a measure of spatial loss at the L2 level.
3) (D1 misses + I1 misses) <= AcCost1 <= (D1 misses + I1 misses) * 1000.
4) (L2 misses) <= AcCost2 <= (L2 misses) * 1000.

Thanks,
Aniruddha

----- Original Message -----
From: "Josef Weidendorfer" <Jos...@gm...>
To: <val...@li...>
Cc: "Aniruddha Shet" <sh...@cs...>
Sent: Wednesday, November 16, 2005 12:45 PM
Subject: [SPAM] Re: [Valgrind-users] Clarification on AcCost1, SpLoss1, AcCost2 and SpLoss2 |
|
From: Josef W. <Jos...@gm...> - 2005-11-16 17:45:39
|
On Wednesday 16 November 2005 07:15, you wrote:
> Callgrind is aborting during the profiling process. The execution of the
> profiled code completed successfully but the log file obtained by setting
> the --log-file option shows that Callgrind aborted at some stage. I have
> attached the log file for your reference.

Hmm... It is really at the very end. So results are fine. It looks like some thread ID is set to 0, which is not a valid ID. Does this happen with smaller runs on your machine, too?

> I need clarification on the numbers that appear under the columns AcCost1,
> SpLoss1, AcCost2 and SpLoss2. I initially felt that (AcCost2 = L2 misses *
> L2 line size) and something similar for AcCost1 in the case of L1. And SpLoss2 is
> a fraction of AcCost2, denoting the number of bytes never used. The numbers in
> the log file (obtained using the --log-file option) don't reflect this and so my
> impression is obviously wrong. Can you please explain how these numbers are
> obtained?

The first question was how to get the most meaningful values. Understanding this will help you see why I did it this way.

I wanted to have a metric for the number of bytes actually touched in a cache line, and some kind of reuse, i.e. how often a cache line was accessed before being evicted. It is best to be able to simply sum up metric numbers; thus, a sum of all such individual numbers for a cache line should be a sign of the relevance of the performance problem you have at this point in your program. To put it another way: a large number should correspond to a large performance problem, because KCachegrind (or any profiling visualization) shows ordered lists, highest numbers at top.

Thus, I did not choose the number of bytes touched, but the number of bytes *not* touched (thus "spatial loss"). Similarly for the number of accesses to a memory block before being evicted from a cache line: the performance problem is big if the number of accesses to one cache line is low. Thus, I took the reciprocal of the access count. Because the profile format only handles integer values, I use as metric 1000/(access count), calling this "access cost".

The second question is how to attribute these numbers to code positions. This is needed because the user wants to see where he has to do optimization. I get the use metrics at eviction time of some memory block. Candidates are:
1) Data structure of the memory block evicted
2) Data structure of the memory block evicting
3) Source code position which triggered loading of the evicted block
4) Current code position = position which triggered the eviction
5) Any combination of tuples from 1-4

As Callgrind currently cannot attribute regarding data structures, 1) and 2) are not possible. 5) is not supported by visualization. 3) should allow you to identify the data structure which has a problem with spatial loss (structure layout should be better arranged according to usage) or access cost (e.g. candidates for blocking). 4) is in general not that useful; perhaps to detect code which is polluting the cache... I used 3) in Callgrind for the cache metrics, i.e. the numbers are attributed to the code position where a cache line was loaded.

This is all at research state, and I am interested in any comments.

Regarding implementation details: for every cache line, I have a bit mask of bytes used and an access count for the currently loaded block. As Callgrind simulates an inclusive cache, I have for every memory block in L1 a pointer to where this block is in L2. When a block is evicted from L1, I update SpLoss1 and AcCost1, and combine the values with the corresponding values of this memory block in L2. On L2 block eviction, I update SpLoss2 and AcCost2 accordingly.

If you run e.g. a KDE app with --cacheuse, you will see that most of the spatial loss comes from the runtime linker: for every symbol name, it looks it up in around 20 hash tables. As a hash table access loses 60 bytes (of 64 bytes with cache line size 64), you get 1.2 KB loss/symbol. Multiply this with the number of lookups (e.g. konqueror, around 20,000), and the problem with slow startup times becomes more obvious.

Josef |
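The eviction-time bookkeeping described above can be sketched in C. This is an illustrative reconstruction, not Callgrind's actual source; the struct, the function names, and the 64-byte line size are assumptions:

```c
#include <stdint.h>

#define LINE_SIZE 64  /* assumed cache line size in bytes */

/* Per-line bookkeeping while the line is resident (illustrative only). */
typedef struct {
    uint64_t used_mask;     /* one bit per byte touched since the load */
    unsigned access_count;  /* number of accesses since the load */
} LineUse;

static void line_access(LineUse *l, unsigned offset, unsigned size)
{
    /* Mark each accessed byte of the line and count the access. */
    for (unsigned i = 0; i < size; i++)
        l->used_mask |= 1ULL << ((offset + i) % LINE_SIZE);
    l->access_count++;
}

/* At eviction: spatial loss = bytes never touched; access cost =
   1000/(access count), the integer-scaled reciprocal described above. */
static void line_evict(const LineUse *l, unsigned *sploss, unsigned *accost)
{
    *sploss = LINE_SIZE - (unsigned)__builtin_popcountll(l->used_mask);
    *accost = l->access_count ? 1000 / l->access_count : 0;
}
```

Under these definitions, a line that receives two accesses to the same 4-byte word before eviction yields SpLoss = 60 and AcCost = 500, matching the intent that sparsely touched, rarely reused lines score high.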
|
From: Aniruddha S. <sh...@cs...> - 2005-11-16 06:16:02
|
Hi Josef,

I am able to run KCachegrind. It truly is a wonderful tool, very informative and powerful. Thanks.

I need clarification on the numbers that appear under the columns AcCost1, SpLoss1, AcCost2 and SpLoss2. I initially felt that (AcCost2 = L2 misses * L2 line size) and something similar for AcCost1 in the case of L1. And SpLoss2 is a fraction of AcCost2, denoting the number of bytes never used. The numbers in the log file (obtained using the --log-file option) don't reflect this and so my impression is obviously wrong. Can you please explain how these numbers are obtained?

Thanks,
Aniruddha

--
Aniruddha G. Shet           | Project webpage: http://forge-fre.ornl.gov/molar/index.html
Graduate Research Associate | Project webpage: http://www.cs.unm.edu/~fastos
Dept. of Comp. Sci. & Engg  | Personal webpage: http://www.cse.ohio-state.edu/~shet
The Ohio State University   | Office: DL 474
2015 Neil Avenue            | Phone: +1 (614) 292 7036
Columbus OH 43210-1277      | Cell: +1 (614) 446 1630 |
|
From: Aniruddha S. <sh...@cs...> - 2005-11-16 04:11:48
|
On Mon, 14 Nov 2005, Aniruddha Shet wrote:

Hi,

Callgrind is aborting during the profiling process. The execution of the profiled code completed successfully, but the log file obtained by setting the --log-file option shows that Callgrind aborted at some stage. I have attached the log file for your reference.

Thanks,
Aniruddha

> Hi,
>
> Can you please help me in correcting the following error that I am encountering while trying to install KCachegrind?
>
> if g++ -DHAVE_CONFIG_H -I. -I. -I.. -I/usr/include/kde -I/usr/lib/qt-3.1/include -I/usr/X11R6/include -DQT_THREAD_SUPPORT -D_REENTRANT -Wnon-virtual-dtor -Wno-long-long -Wundef -ansi -D_XOPEN_SOURCE=500 -D_BSD_SOURCE -Wcast-align -Wconversion -Wchar-subscripts -Wall -W -Wpointer-arith -Wwrite-strings -O2 -Wformat-security -Wmissing-format-attribute -fno-exceptions -fno-check-new -fno-common -MT callgraphview.o -MD -MP -MF ".deps/callgraphview.Tpo" -c -o callgraphview.o callgraphview.cpp; \
> then mv -f ".deps/callgraphview.Tpo" ".deps/callgraphview.Po"; else rm -f ".deps/callgraphview.Tpo"; exit 1; fi
> callgraphview.cpp: In constructor `PannerView::PannerView(QWidget*, const char*)':
> callgraphview.cpp:955: `WNoAutoErase' undeclared (first use this function)
> callgraphview.cpp:955: (Each undeclared identifier is reported only once for each function it appears in.)
> make[2]: *** [callgraphview.o] Error 1
> make[2]: Leaving directory `/a/osu4005/Valgrind/kcachegrind-0.4.6/kcachegrind'
> make[1]: *** [all-recursive] Error 1
> make[1]: Leaving directory `/a/osu4005/Valgrind/kcachegrind-0.4.6'
> make: *** [all] Error 2
>
> Thanks,
> Aniruddha
>
> On Sun, 13 Nov 2005, Josef Weidendorfer wrote:
>
> > On Saturday 12 November 2005 20:01, Aniruddha Shet wrote:
> > > On Fri, 11 Nov 2005, Josef Weidendorfer wrote:
> > > Hi,
> > >
> > > As you have indicated, I too want to use the --simulate-hwpref option to determine the performance benefit with and without the prefetcher. It serves as a measure of spatial locality in the profiled code.
> >
> > The P4-like hardware prefetcher gives you a benefit if you do large sequential accesses into memory, i.e. streams. This is a special kind of spatial locality.
> >
> > But note that this hardware prefetcher (at least my simulation) detects streams of accessed *cache lines*, i.e. it will work even with a stride size of 64 bytes (if your cache line size is 64 bytes). This does not give you inner-cache-line spatial locality.
> >
> > The best way to see spatial locality is to change the cache line size of the simulator and compare the miss results. If you have no spatial locality at all, you should get the same number of misses independent of the cache line size. If the misses go down with larger cache line size, your program exhibits spatial locality.
> >
> > You can set the cache parameters with cachegrind/callgrind. Compare e.g. the usual result with a result with 8-byte line size.
> >
> > > I am yet to view the output of the --cacheuse option. Again, the objective is to understand the extent of spatial locality in the profiled code.
> >
> > Yes. "No spatial locality" would mean: only one byte or word accessed per cache line before eviction. And this should be visible via cache use.
> >
> > Josef
> >
> > > Thanks,
> > > Aniruddha
> > >
> > > > On Friday 11 November 2005 05:05, you wrote:
> > > > > On Wed, 9 Nov 2005, Josef Weidendorfer wrote:
> > > > > Hi,
> > > > >
> > > > > I am running Callgrind with the options -v --log-file=summary --simulate-cache=yes --simulate-hwpref=yes --cacheuse=yes. The summary log file contains the lines:
> > > > >
> > > > > Prefetch Up: 0
> > > > > Prefetch Down: 0
> > > >
> > > > Oh, someone who is using the more advanced (and probably not that much tested) code! Very good. It would be nice if you can tell me whether these features are useful for you.
> > > >
> > > > You cannot use --simulate-hwpref=yes and --cacheuse=yes at once. This is separated simulator code. I will change the code to give out a warning regarding this, thanks.
> > > >
> > > > If I do e.g.
> > > >
> > > > callgrind -v --simulate-hwpref=yes ls
> > > >
> > > > (this option also switches on cache simulation), I get
> > > >
> > > > --12922-- Prefetch Up: 1507
> > > > --12922-- Prefetch Down: 36
> > > >
> > > > so I think this still works fine.
> > > >
> > > > > What do these lines mean? From what I understand, --simulate-hwpref=yes simulates a hardware prefetcher, as is found in the Intel Pentium 4 processor.
> > > >
> > > > Yes. The P4 (and P-M) automatically detects upward and downward streaming, stopping at 4kB boundaries (streams on virtual addresses get a disrupted stream of physical addresses at 4kB boundaries because of VM).
> > > >
> > > > A nice thing is that the Pentium-M has hardware performance counters exactly for the Prefetch/Up and Prefetch/Down events, i.e. you can observe the hardware prefetcher on the Pentium-M in action by using OProfile/Perfex/PAPI, and compare the results with those from Callgrind.
> > > >
> > > > By using --simulate-hwpref=yes I add this heuristic, and presume that every line loaded by the hardware prefetcher will give a hit when accessed later on.
> > > >
> > > > Note that this is not always the case: the real access could come so early that you would still get a miss in reality, even if the hardware prefetcher has caught the line. Unfortunately, callgrind has no way to get a simulated wall clock time, which would be needed to detect such cases.
> > > >
> > > > So callgrind --simulate-hwpref will give the best case possible for the prefetcher. In reality, the result is between the results without and with this option.
> > > >
> > > > The usage is to compare results with and without the prefetcher. For functions where you see a big difference, the prefetcher is working quite well, i.e. any microoptimizations to bring down the usual callgrind results (without prefetcher) will not lead to any real improvements.
> > > >
> > > > But in the code regions where the results are not really different, you see that the prefetching heuristic of the P4/PM is not working, and you can try to add software prefetch instructions (or otherwise change the code).
> > > >
> > > > A drawback is that callgrind does not take software prefetch instructions into account, as Valgrind does not feed these instructions to the tool, but ignores them. But if there really are users for this simulator enhancement, we can try to include them in the VG core (e.g. cachegrind).
> > > >
> > > > To make the comparison of the two runs easier, I should include a compare mode in KCachegrind.
> > > >
> > > > > Also, does --cacheuse=yes collect cache line utilization statistics, i.e. what percentage of a line is utilized after being brought into cache and before being evicted from the cache? Where can this information be viewed?
> > > >
> > > > Yes. The number of bytes never used in a cache line will be attributed to the instruction which triggered the load. This is event SpLoss1 (for L1) and, more important, SpLoss2 (for L2).
> > > >
> > > > The full amount of bytes loaded by an instruction is given by the number of L1 or L2 misses this instruction gets attributed, multiplied by the cache line size. In KCachegrind, add new derived events with the formulas "64 L1m" and "64 L2m" to directly get the numbers to compare.
> > > >
> > > > You can view this information with KCachegrind. Unfortunately, there was a hardcoded maximum of 10 event types in KCachegrind up to KDE 3.4.x, and --cacheuse=yes gives you 12 event types, leading to a load error. This changes in the version in KDE 3.5, or use the newest one from the website (kcachegrind.sf.net).
> > > >
> > > > Theoretically, callgrind_annotate should be able to show these results, too. For it to cope with the format, you have to additionally provide --compress-pos=no --compress-strings=no on the callgrind line. Even then, it fails with
> > > > Line xxxx: summary event and total event mismatch
> > > >
> > > > Oh yeah, it is time to provide a better command line tool...
> > > >
> > > > Josef
> > > >
> > > > > Thanks,
> > > > > Aniruddha

--
Aniruddha G. Shet           | Project webpage: http://forge-fre.ornl.gov/molar/index.html
Graduate Research Associate | Project webpage: http://www.cs.unm.edu/~fastos
Dept. of Comp. Sci. & Engg  | Personal webpage: http://www.cse.ohio-state.edu/~shet
The Ohio State University   | Office: DL 474
2015 Neil Avenue            | Phone: +1 (614) 292 7036
Columbus OH 43210-1277      | Cell: +1 (614) 446 1630 |
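The stride effect discussed in the thread can be made concrete with a small hypothetical loop (not from the thread itself): touching one byte per 64-byte line still looks like a perfect stream to a line-granular prefetcher, yet only 1 of every 64 loaded bytes is used, so varying the simulated line size leaves the miss count unchanged.

```c
#include <stddef.h>

#define N (1 << 20)  /* 1 MiB buffer */

/* Touch one byte every `stride` bytes. With stride == 64 (one cache line),
   each loaded line contributes a single used byte: no inner-line locality. */
long touch_strided(const char *buf, size_t stride)
{
    long sum = 0;
    for (size_t i = 0; i < N; i += stride)
        sum += buf[i];
    return sum;
}
```

Profiling such a loop under cachegrind with an 8-byte versus a 64-byte line size should show roughly the same data miss count, which is exactly the "no spatial locality" signature described above.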
|
From: Eduardo M. <ea...@us...> - 2005-11-15 22:44:19
|
# gdb ./mytest
GNU gdb 6.2.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "ppc-linux"...Using host libthread_db library "/lib/.
(gdb) r
Starting program: /tmp/mytest
Program received signal SIGILL, Illegal instruction.
main () at mytest.c:35
warning: Source file is more recent than executable.
35        __asm__ __volatile__("fmr 0,0");
(gdb) c
Continuing.
Program received signal SIGILL, Illegal instruction.
main () at mytest.c:44
44        __asm__ __volatile__("vor 0,0,0");
(gdb) c
Continuing.
Program terminated with signal SIGILL, Illegal instruction.
The program no longer exists.
(gdb)
Regards,
Eduardo A. Muñoz
Julian Seward <ju...@va...>
Sent by: val...@li...
11/15/2005 11:42 AM
To
Eduardo Munoz/Austin/IBM@IBMUS
cc
val...@li...
Subject
Re: Fw: [Valgrind-users] Support questions: ppcnf, helgrind,and addrcheck
> I compiled and run the program you suggested and I get just the
> following:
>
> #valgrind --tool=none ./mytest
> #Illegal Instruction
No, I mean, what happens when you run ./mytest not on Valgrind?
If it still gets an illegal instruction, please use GDB to find out
which instruction is the problem.
J
-------------------------------------------------------
This SF.Net email is sponsored by the JBoss Inc. Get Certified Today
Register for a JBoss Training Course. Free Certification Exam
for All Training Attendees Through End of 2005. For more info visit:
http://ads.osdn.com/?ad_id=7628&alloc_id=16845&op=click
_______________________________________________
Valgrind-users mailing list
Val...@li...
https://lists.sourceforge.net/lists/listinfo/valgrind-users
|
|
From: Julian S. <ju...@va...> - 2005-11-15 17:42:00
|
> I compiled and run the program you suggested and I get just the
> following:
>
> #valgrind --tool=none ./mytest
> #Illegal Instruction

No, I mean, what happens when you run ./mytest not on Valgrind? If it still gets an illegal instruction, please use GDB to find out which instruction is the problem.

J |
|
From: Nicholas N. <nj...@cs...> - 2005-11-15 17:08:17
|
On Thu, 10 Nov 2005, Dennis Lubert wrote:
> in the good old days of helgrind, I was able to track down some race
> conditions in our application. Now that helgrind isn't working anymore,
> I'm facing a big problem of some heisenbugs. Core dumps are corrupted and
> running within gdb or valgrind solely does not trigger the bug.
> So, what's the current status of helgrind? What can we (the community) do
> to speed up getting helgrind to work?
We want to reinstate it. A reasonable number (off the top of my head:
about 10--15?) of people complained in the survey about its absence.
The main obstacle to getting it working is that we need to support
function wrapping. We currently support function replacement, ie. the
ability to replace a function with our own version. Function wrapping
would extend that to allow us to call the original from within our
replacement, eg:
void replacement_for_foo(int x, char y)
{
    // do pre-stuff
    foo(x, y);  // call original
    // do post-stuff
}
This is needed for Helgrind so we can intercept and track calls to
functions like pthread_mutex_lock().
Julian and I have discussed numerous times about how to implement function
wrapping in a sane way, and we've experimented with different approaches,
so far without success. It's a difficult problem. It's definitely on our
radar, but as always finding time for any particular problem is difficult
when there are 101 other problems to be fixed as well.
Nick
|
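For comparison, the wrap-around-the-original pattern Nick describes is what a user-space LD_PRELOAD shim achieves via dlsym(RTLD_NEXT). The sketch below illustrates only that conventional preload technique; Helgrind needs an equivalent that works from inside the Valgrind tool, which is what makes the problem hard:

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <pthread.h>
#include <stdio.h>

/* Illustrative wrapper: with LD_PRELOAD (or when linked into the program)
   this definition shadows libpthread's pthread_mutex_lock. */
int pthread_mutex_lock(pthread_mutex_t *mutex)
{
    static int (*real_lock)(pthread_mutex_t *);
    if (!real_lock)  /* resolve the original lazily, on first call */
        real_lock = (int (*)(pthread_mutex_t *))
                        dlsym(RTLD_NEXT, "pthread_mutex_lock");
    fprintf(stderr, "pre: locking %p\n", (void *)mutex);   /* do pre-stuff */
    int ret = real_lock(mutex);                            /* call original */
    fprintf(stderr, "post: locked %p\n", (void *)mutex);   /* do post-stuff */
    return ret;
}
```

Built into a shared object and injected with LD_PRELOAD, the wrapper intercepts every lock call while still reaching the real implementation through RTLD_NEXT; a tool running inside Valgrind cannot rely on the dynamic linker this way.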
|
From: Eduardo M. <ea...@us...> - 2005-11-15 16:58:26
|
>>Hmm. Now that really shouldn't happen. Two questions. Firstly,
>>what Linux kernel version and distribution are you using?
Kernel Version --- Kernel 2.6.5.xxx
Distribution --- Based on sles9 sp1
>>And secondly, if you compile and run the attached program (normally,
>>not on V), what result do you get?
I compiled and run the program you suggested and I get just the
following:
#valgrind --tool=none ./mytest
#Illegal Instruction
Eduardo Munoz
Julian Seward <ju...@va...>
Sent by: val...@li...
11/14/2005 07:22 PM
To
Eduardo Munoz/Austin/IBM@IBMUS
cc
val...@li...
Subject
Re: [Valgrind-users] Support questions: ppcnf, helgrind,and addrcheck
> --32569:1:main Dynamic memory manager is running
> --32569:1:main Getting stage1's name
> --32569:1:main Get hardware capabilities ...
> Illegal instruction
Hmm. Now that really shouldn't happen. Two questions. Firstly,
what Linux kernel version and distribution are you using?
And secondly, if you compile and run the attached program (normally,
not on V), what result do you get?
J
#include <stdio.h>
#include <signal.h>
#include <setjmp.h>
jmp_buf env;
void hdlr_fp ( int x ) { longjmp(env,1); }
void hdlr_vmx ( int x ) { longjmp(env,1); }
int main ( void )
{
    sigset_t saved_set, tmp_set;
    struct sigaction saved_act, tmp_act;
    int have_fp, have_vmx;

    sigemptyset(&tmp_set);
    sigaddset(&tmp_set, SIGILL);
    sigprocmask(SIG_UNBLOCK, &tmp_set, &saved_set);
    sigaction(SIGILL, NULL, &saved_act);
    tmp_act = saved_act;
    tmp_act.sa_flags &= ~SA_RESETHAND;
    tmp_act.sa_flags &= ~SA_SIGINFO;
    tmp_act.sa_handler = hdlr_fp;
    sigaction(SIGILL, &tmp_act, NULL);
    have_fp = 1;
    if (setjmp(env)) {
        have_fp = 0;
    } else {
        __asm__ __volatile__("fmr 0,0");
    }
    tmp_act.sa_handler = hdlr_vmx;
    sigaction(SIGILL, &tmp_act, NULL);
    have_vmx = 1;
    if (setjmp(env)) {
        have_vmx = 0;
    } else {
        __asm__ __volatile__("vor 0,0,0");
    }
    sigaction(SIGILL, &saved_act, NULL);
    sigprocmask(SIG_SETMASK, &saved_set, NULL);
    printf("fp %d vmx %d\n", have_fp, have_vmx);
    return 0;
}
-------------------------------------------------------
This SF.Net email is sponsored by the JBoss Inc. Get Certified Today
Register for a JBoss Training Course. Free Certification Exam
for All Training Attendees Through End of 2005. For more info visit:
http://ads.osdn.com/?ad_id=7628&alloc_id=16845&op=click
_______________________________________________
Valgrind-users mailing list
Val...@li...
https://lists.sourceforge.net/lists/listinfo/valgrind-users
|
|
From: Julian S. <js...@ac...> - 2005-11-15 11:18:47
|
> I'm trying to run a JIT that uses CLFLUSH after code patching to

I just implemented it. svn up (you should get vex r1460, valgrind r5132), make distclean and rebuild everything from scratch (there have been several other changes too). Let us know if it works / does not work.

J |
|
From: Julian S. <js...@ac...> - 2005-11-15 01:27:45
|
On Monday 14 November 2005 21:03, Iserovich, Lev wrote:
> Hi Julian,
>
> Thanks, I've gotten the latest SVN (5131 I believe) and it works!

Great. One thing is, I have only the vaguest idea about what kinds of embedded ppc dev systems exist. So - what kind of CPU/board/distro combination are you running on? Knowing that will help build a picture of what works and what doesn't. Do pthreaded programs work? I'm having trouble making LinuxThreads (glibc 2.3.2) work on a PPC440 right now.

> I do see one issue with running ls on my box - apparently syscall 107
> (sys_newlstat) is not handled by valgrind, so it complains. Is the syscall
> code cross-platform,

I think it's generic. Go to syswrap-ppc32-linux.c line 1566 and uncomment it:

//.. GENXY(__NR_lstat, sys_newlstat), // 107

You may also have to uncomment the definition of __NR_lstat in vki_unistd-ppc32-linux.h. Rebuild and see what you get.

J |
|
From: Julian S. <ju...@va...> - 2005-11-15 01:21:03
|
> --32569:1:main Dynamic memory manager is running
> --32569:1:main Getting stage1's name
> --32569:1:main Get hardware capabilities ...
> Illegal instruction
Hmm. Now that really shouldn't happen. Two questions. Firstly,
what Linux kernel version and distribution are you using?
And secondly, if you compile and run the attached program (normally,
not on V), what result do you get?
J
#include <stdio.h>
#include <signal.h>
#include <setjmp.h>
jmp_buf env;
void hdlr_fp ( int x ) { longjmp(env,1); }
void hdlr_vmx ( int x ) { longjmp(env,1); }
int main ( void )
{
sigset_t saved_set, tmp_set;
struct sigaction saved_act, tmp_act;
int have_fp, have_vmx;
sigemptyset(&tmp_set);
sigaddset(&tmp_set, SIGILL);
sigprocmask(SIG_UNBLOCK, &tmp_set, &saved_set);
sigaction(SIGILL, NULL, &saved_act);
tmp_act = saved_act;
tmp_act.sa_flags &= ~SA_RESETHAND;
tmp_act.sa_flags &= ~SA_SIGINFO;
tmp_act.sa_handler = hdlr_fp;
sigaction(SIGILL, &tmp_act, NULL);
have_fp = 1;
if (setjmp(env)) {
have_fp = 0;
} else {
__asm__ __volatile__("fmr 0,0");
}
tmp_act.sa_handler = hdlr_vmx;
sigaction(SIGILL, &tmp_act, NULL);
have_vmx = 1;
if (setjmp(env)) {
have_vmx = 0;
} else {
__asm__ __volatile__("vor 0,0,0");
}
sigaction(SIGILL, &saved_act, NULL);
sigprocmask(SIG_SETMASK, &saved_set, NULL);
printf("fp %d vmx %d\n", have_fp, have_vmx);
return 0;
}
|
|
From: Aniruddha S. <sh...@cs...> - 2005-11-15 00:04:42
|
Hi,

Can you please help me in correcting the following error that I am encountering while trying to install KCachegrind?

if g++ -DHAVE_CONFIG_H -I. -I. -I.. -I/usr/include/kde -I/usr/lib/qt-3.1/include -I/usr/X11R6/include -DQT_THREAD_SUPPORT -D_REENTRANT -Wnon-virtual-dtor -Wno-long-long -Wundef -ansi -D_XOPEN_SOURCE=500 -D_BSD_SOURCE -Wcast-align -Wconversion -Wchar-subscripts -Wall -W -Wpointer-arith -Wwrite-strings -O2 -Wformat-security -Wmissing-format-attribute -fno-exceptions -fno-check-new -fno-common -MT callgraphview.o -MD -MP -MF ".deps/callgraphview.Tpo" -c -o callgraphview.o callgraphview.cpp; \
then mv -f ".deps/callgraphview.Tpo" ".deps/callgraphview.Po"; else rm -f ".deps/callgraphview.Tpo"; exit 1; fi
callgraphview.cpp: In constructor `PannerView::PannerView(QWidget*, const char*)':
callgraphview.cpp:955: `WNoAutoErase' undeclared (first use this function)
callgraphview.cpp:955: (Each undeclared identifier is reported only once for each function it appears in.)
make[2]: *** [callgraphview.o] Error 1
make[2]: Leaving directory `/a/osu4005/Valgrind/kcachegrind-0.4.6/kcachegrind'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/a/osu4005/Valgrind/kcachegrind-0.4.6'
make: *** [all] Error 2

Thanks,
Aniruddha

On Sun, 13 Nov 2005, Josef Weidendorfer wrote:
> On Saturday 12 November 2005 20:01, Aniruddha Shet wrote:
> > On Fri, 11 Nov 2005, Josef Weidendorfer wrote:
> > Hi,
> >
> > As you have indicated, I too want to use the --simulate-hwpref option to determine the performance benefit with and without prefetcher. It serves as a measure of spatial locality in the profiled code.
>
> The P4-like hardware prefetcher gives you a benefit if you do large sequential accesses into memory, ie. streams. This is a special kind of spatial locality.
>
> But note that this hardware prefetcher (at least my simulation) detects streams of accessed *cache lines*, i.e. it will work even with a stride size of 64 bytes (if your cache line size is 64 bytes). This does not give you inner-cache-line spatial locality.
>
> The best to see spatial locality is to change the cache line size of the simulator and compare the miss results. If you have no spatial locality at all, you should get the same number of misses independent on the cache line size. If the misses go down with larger cache line size, your program exhibits spatial locality.
>
> You can set the cache parameters with cachegrind/callgrind. Compare e.g. the usual result with a result with 8 byte line size.
>
> > I am yet to view the output of --cacheuse option. Again, the objective is to understand the extent of spatial locality in the profiled code.
>
> Yes. "No spatial locality" would mean: only one byte or word accessed per cache line before eviction. And this should be visible via cache use.
>
> Josef
>
> > > On Friday 11 November 2005 05:05, you wrote:
> > > > On Wed, 9 Nov 2005, Josef Weidendorfer wrote:
> > > > Hi,
> > > >
> > > > I am running Callgrind with the options -v --log-file=summary --simulate-cache=yes --simulate-hwpref=yes --cacheuse=yes. The summary log file contains the lines:
> > > >
> > > > Prefetch Up: 0
> > > > Prefetch Down: 0
> > >
> > > Oh, someone which is using the more advanced (and probably not that much tested), code! Very good. It would be nice if you can tell me if these features are useful for you.
> > >
> > > You can not use --simulate-hwpref=yes and --cacheuse=yes at once. This is separated simulator code. I will change the code to give out a warning regarding this, thanks.
> > >
> > > If I do e.g.
> > >
> > > callgrind -v --simulate-hwpref=yes ls
> > >
> > > This option also switches on cache simulation. I get
> > >
> > > --12922-- Prefetch Up: 1507
> > > --12922-- Prefetch Down: 36
> > >
> > > so I think this still works fine.
> > >
> > > > What do these lines mean? From what I understand, --simulate-hwpref=yes simulates a hardware prefetcher, as is found in the Intel Pentium 4 processor.
> > >
> > > Yes. The P4 (and P-M) automatically detects upward and downward streaming, stopping at 4kB boundaries (streams on virtual addresses get a disrupted stream of physical addresses at 4kB boundaries because of VM).
> > >
> > > A nice thing is that the Pentium-M has hardware performance counters exact for the Prefetch/Up and Prefetch/Down events, i.e. you can observe the hardware prefetcher on the Pentium-M in action by using OProfile/Perfex/PAPI, and compare the results with that from Callgrind.
> > >
> > > By using --simulate-hwpref=yes I add this heuristic, and presume that every line loaded by the hardware prefetcher will give a hit when accessed later on.
> > >
> > > Note that this is not always the case: the real access could come that early that you still would get a miss in reality, even if the hardware prefetcher has catched the line. Unfortunately, callgrind has no way to get a simulated wall clock time, which would be needed to detect such cases.
> > >
> > > So callgrind --simulate-hwpref will give the best case possible for the prefetcher. In reality, it is between the results without and with this option.
> > >
> > > The usage is to compare results with and without the prefetcher. For functions where you see a big difference, the prefetcher is working quite good, i.e. any microoptimizations to bring down the usual callgrind results (without prefetcher) will not lead to any real improvements.
> > >
> > > But in the code regions, where the results are not really different, you see that the prefetching heuristic of the P4/PM is not working, and you can try to add software prefetch instructions (or otherwise change the code).
> > >
> > > A drawback is that callgrind does not take software prefetch instructions into account, as Valgrind does not feed these instructions to the tool, but ignores them. But if there really are users for this simulator enhancement, we can try to include them into VG core (e.g. cachegrind).
> > >
> > > To make the comparision of the two runs more easy, I should include a compare mode in KCachegrind.
> > >
> > > > Also, does --cacheuse=yes collect cache line utilization statistics i.e. what percentage of a line is utilized after being brought into cache and before being evicted from the cache? Where can this information viewed?
> > >
> > > Yes. The number of bytes never used in a cache line will be attributed to the instruction which triggered the load. This is event SpLoss1 (for L1) and more important SpLoss2 (for L2).
> > >
> > > The full amount of bytes loaded by an instruction is given by the number of L1 or L2 misses this instruction gets attributed, multiplied with the cache line size. In KCachegrind, add new derived events with the formula "64 L1m" and "64 L2m" to directly get the numbers to compare.
> > >
> > > You can view this information with KCachegrind. Unfortunately, there was a hardcoded maximum of 10 event types in KCachegrind found till KDE 3.4.x. And --cacheuse=yes gives you 12 event types, leading to a load error. This changes in the version in KDE 3.5, or use the newest one from the website (kcachegrind.sf.net).
> > >
> > > Theoretically, callgrind_annotate should be able to show these results, too. For it to cope with the format, you have to additionally provide --compress-pos=no --compress-strings=no on the callgrind line. Even then, it fails with
> > > Line xxxx: summary event and total event mismatch
> > >
> > > Oh yeah, it is time to provide a better command line tool...
> > >
> > > Josef
> > >
> > > > Thanks,
> > > > Aniruddha

--
-----------------------------------------------------------------------------------------
Aniruddha G. Shet           | Project webpage: http://forge-fre.ornl.gov/molar/index.html
Graduate Research Associate | Project webpage: http://www.cs.unm.edu/~fastos
Dept. of Comp. Sci. & Engg  | Personal webpage: http://www.cse.ohio-state.edu/~shet
The Ohio State University   | Office: DL 474
2015 Neil Avenue            | Phone: +1 (614) 292 7036
Columbus OH 43210-1277      | Cell: +1 (614) 446 1630
----------------------------------------------------------------------------------------- |
|
From: Iserovich, L. <lis...@ci...> - 2005-11-14 21:04:32
|
Hi Julian,

Thanks, I've gotten the latest SVN (5131 I believe) and it works!

I do see one issue with running ls on my box - apparently syscall 107 (sys_newlstat) is not handled by valgrind, so it complains. Is the syscall code cross-platform, or machine specific? I guess I can see if I can add a handler according to the recommended readme.

Thanks for the help!
--Lev

-----Original Message-----
From: Julian Seward [mailto:js...@ac...]
Sent: Sat 11/12/2005 9:50 PM
To: val...@li...
Cc: Iserovich, Lev
Subject: Re: [Valgrind-users] illegal instruction in ppc (4xx board)

> I found that place (in dispatch-ppc32.S) which sets the FPU and AltiVec
> to default modes,

Try updating to r5113. I have just been running successfully on a PPC440GX (no FPU).

J |
|
From: Eduardo M. <ea...@us...> - 2005-11-14 20:54:08
|
I did what you told me to do: checked out the latest code from svn, rebuilt, and ran valgrind -d -d -v -v --tool=none ls. I am running valgrind on a ppcnf IBM 4xx system.

>>>:~ # valgrind -d -d -v -v --tool=none ls
--32569:1:debuglog DebugLog system started by Stage 1, level 2 logging requested
--32569:1:launcher tool 'none' requested
--32569:1:launcher no platform detected, defaulting platform to 'ppc32-linux'
--32569:1:launcher launching /usr/lib/valgrind/ppc32-linux/none
--32569:1:debuglog DebugLog system started by Stage 2 (main), level 2 logging requested
--32569:1:main Welcome to Valgrind version 3.1.SVN debug logging
--32569:1:main Checking current stack is plausible
--32569:1:main Checking initial stack was noted
--32569:1:main Starting the address space manager
--32569:2:aspacem sp_at_startup = 0x007FFFF880 (supplied)
--32569:2:aspacem minAddr = 0x0004000000 (computed)
--32569:2:aspacem maxAddr = 0x007FFFEFFF (computed)
--32569:2:aspacem cStart = 0x0004000000 (computed)
--32569:2:aspacem vStart = 0x0042000000 (computed)
--32569:2:aspacem suggested_clstack_top = 0x007EFFFFFF (computed)
--32569:2:aspacem <<< SHOW_SEGMENTS: Initial layout (5 segments, 0 segnames)
--32569:2:aspacem 0: RSVN 0000000000-0003FFFFFF 64m ----- SmFixed
--32569:2:aspacem 1: 0004000000-0041FFFFFF 992m
--32569:2:aspacem 2: RSVN 0042000000-0042000FFF 4096 ----- SmFixed
--32569:2:aspacem 3: 0042001000-007FFFEFFF 991m
--32569:2:aspacem 4: RSVN 007FFFF000-00FFFFFFFF 2048m ----- SmFixed
--32569:2:aspacem >>>
--32569:2:aspacem Reading /proc/self/maps
--32569:2:aspacem <<< SHOW_SEGMENTS: With contents of /proc/self/maps (11 se)
--32569:2:aspacem ( 0) /usr/lib/valgrind/ppc32-linux/none
--32569:2:aspacem 0: RSVN 0000000000-0003FFFFFF 64m ----- SmFixed
--32569:2:aspacem 1: 0004000000-0041FFFFFF 992m
--32569:2:aspacem 2: RSVN 0042000000-0042000FFF 4096 ----- SmFixed
--32569:2:aspacem 3: 0042001000-006FFFFFFF 735m
--32569:2:aspacem 4: FILE 0070000000-0070131FFF 1253376 r-x-- d=0x00B i=18)
--32569:2:aspacem 5: 0070132000-0070140FFF 61440
--32569:2:aspacem 6: FILE 0070141000-0070141FFF 4096 rw--- d=0x00B i=18)
--32569:2:aspacem 7: ANON 0070142000-007075FFFF 6414336 rwx--
--32569:2:aspacem 8: 0070760000-007FFFEFFF 248m
--32569:2:aspacem 9: ANON 007FFFF000-007FFFFFFF 4096 rw---
--32569:2:aspacem 10: RSVN 0080000000-00FFFFFFFF 2048m ----- SmFixed
--32569:2:aspacem >>>
--32569:1:main Address space manager is running
--32569:1:main Starting the dynamic memory manager
--32569:1:mallocfr newSuperblock at 0x42001000 (pszB 1048560) owner VALGRIND/tol
--32569:1:main Dynamic memory manager is running
--32569:1:main Getting stage1's name
--32569:1:main Get hardware capabilities ...
Illegal instruction
>>>:~ #

This is all I get.

Regards,
Eduardo A. Muñoz

Julian Seward <ju...@va...>
11/14/2005 12:20 PM

To Eduardo Munoz/Austin/IBM@IBMUS
cc
Subject Re: [Valgrind-users] Support questions: ppcnf, helgrind, and addrcheck

On Monday 14 November 2005 17:27, you wrote:
> I checked out the latest valgrind code from 3.0 development trunk and I
> still get the illegal Instruction running valgrind.
> valgrind-3.1.SVN (revision 1458.)
>
> >valgrind --version
> Illegal instruction

Do you want to explain how you think we can debug your problem if you supply us with zero information?

Delete your entire build tree and check out again from svn. Rebuild. Run this

valgrind -d -d -v -v --tool=none ls

and send the results.

J |
|
From: Eduardo M. <ea...@us...> - 2005-11-14 17:26:36
|
I checked out the latest valgrind code from 3.0 development trunk and I still get the illegal Instruction running valgrind.

valgrind-3.1.SVN (revision 1458.)

>valgrind --version
Illegal instruction
>

Regards,
Eduardo A. Muñoz

Julian Seward <ju...@va...>
11/14/2005 10:38 AM

To Eduardo Munoz/Austin/IBM@IBMUS
cc
Subject Re: [Valgrind-users] Support questions: ppcnf, helgrind, and addrcheck

On Monday 14 November 2005 16:28, you wrote:
> When I check out the code from 3.0 valgrind development trunk:
>
> svn co svn://svn.valgrind.org/valgrind/trunk valgrind
>
> it says:
>
> " Checked out revision 1458."

Did you miss this?

Updated to revision 5129.

Try it and see how far you get now.

J |
|
From: Tom H. <to...@co...> - 2005-11-14 16:31:08
|
In message <OF0...@us...>
Eduardo Munoz <ea...@us...> wrote:
> When I check out the code from 3.0 valgrind development trunk:
>
> svn co svn://svn.valgrind.org/valgrind/trunk valgrind
>
> it says:
>
> " Checked out revision 1458."
>
> is it the right version that your said it works now??
That is the current VEX revision - the revision Julian gave was
the valgrind revision.
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
|
|
From: Eduardo M. <ea...@us...> - 2005-11-14 16:27:36
|
When I check out the code from 3.0 valgrind development trunk:

svn co svn://svn.valgrind.org/valgrind/trunk valgrind

it says:

" Checked out revision 1458."

is it the right version that your said it works now??

Regards,
Eduardo A. Muñoz
MCP - Linux Technology Center
IBM Corporation
ea...@us... 512-838-8219

Julian Seward <ju...@va...>
Sent by: val...@li...
11/12/2005 08:47 PM

To val...@li...
cc Eduardo Munoz/Austin/IBM@IBMUS, ce...@va...
Subject Re: [Valgrind-users] Support questions: ppcnf, helgrind, and addrcheck

On Friday 11 November 2005 17:59, Eduardo Munoz wrote:
> Can somebody tell me an estimated dates for the following: I need to
> make some projects decision based on these dates.
> This is for valgrind from the current development line (valgrind-3.1.SVN)
>
> 1. Support (or fix) for 4XX boards or ppcnf (ppc no floating point)
> systems.

ppcnf now works (svn rev 5113 on a PPC440GX running MontaVista 3.1).

J |
|
From: Josef W. <Jos...@gm...> - 2005-11-13 11:11:26
|
On Saturday 12 November 2005 20:01, Aniruddha Shet wrote:
> On Fri, 11 Nov 2005, Josef Weidendorfer wrote:
> Hi,
>
> As you have indicated, I too want to use the --simulate-hwpref option to determine the performance benefit with and without prefetcher. It serves as a measure of spatial locality in the profiled code.

The P4-like hardware prefetcher gives you a benefit if you do large sequential accesses into memory, ie. streams. This is a special kind of spatial locality.

But note that this hardware prefetcher (at least my simulation) detects streams of accessed *cache lines*, i.e. it will work even with a stride size of 64 bytes (if your cache line size is 64 bytes). This does not give you inner-cache-line spatial locality.

The best to see spatial locality is to change the cache line size of the simulator and compare the miss results. If you have no spatial locality at all, you should get the same number of misses independent on the cache line size. If the misses go down with larger cache line size, your program exhibits spatial locality.

You can set the cache parameters with cachegrind/callgrind. Compare e.g. the usual result with a result with 8 byte line size.

> I am yet to view the output of --cacheuse option. Again, the objective is to understand the extent of spatial locality in the profiled code.

Yes. "No spatial locality" would mean: only one byte or word accessed per cache line before eviction. And this should be visible via cache use.

Josef

> > On Friday 11 November 2005 05:05, you wrote:
> > > On Wed, 9 Nov 2005, Josef Weidendorfer wrote:
> > > Hi,
> > >
> > > I am running Callgrind with the options -v --log-file=summary --simulate-cache=yes --simulate-hwpref=yes --cacheuse=yes. The summary log file contains the lines:
> > >
> > > Prefetch Up: 0
> > > Prefetch Down: 0
> >
> > Oh, someone which is using the more advanced (and probably not that much tested), code! Very good. It would be nice if you can tell me if these features are useful for you.
> >
> > You can not use --simulate-hwpref=yes and --cacheuse=yes at once. This is separated simulator code. I will change the code to give out a warning regarding this, thanks.
> >
> > If I do e.g.
> >
> > callgrind -v --simulate-hwpref=yes ls
> >
> > This option also switches on cache simulation. I get
> >
> > --12922-- Prefetch Up: 1507
> > --12922-- Prefetch Down: 36
> >
> > so I think this still works fine.
> >
> > > What do these lines mean? From what I understand, --simulate-hwpref=yes simulates a hardware prefetcher, as is found in the Intel Pentium 4 processor.
> >
> > Yes. The P4 (and P-M) automatically detects upward and downward streaming, stopping at 4kB boundaries (streams on virtual addresses get a disrupted stream of physical addresses at 4kB boundaries because of VM).
> >
> > A nice thing is that the Pentium-M has hardware performance counters exact for the Prefetch/Up and Prefetch/Down events, i.e. you can observe the hardware prefetcher on the Pentium-M in action by using OProfile/Perfex/PAPI, and compare the results with that from Callgrind.
> >
> > By using --simulate-hwpref=yes I add this heuristic, and presume that every line loaded by the hardware prefetcher will give a hit when accessed later on.
> >
> > Note that this is not always the case: the real access could come that early that you still would get a miss in reality, even if the hardware prefetcher has catched the line. Unfortunately, callgrind has no way to get a simulated wall clock time, which would be needed to detect such cases.
> >
> > So callgrind --simulate-hwpref will give the best case possible for the prefetcher. In reality, it is between the results without and with this option.
> >
> > The usage is to compare results with and without the prefetcher. For functions where you see a big difference, the prefetcher is working quite good, i.e. any microoptimizations to bring down the usual callgrind results (without prefetcher) will not lead to any real improvements.
> >
> > But in the code regions, where the results are not really different, you see that the prefetching heuristic of the P4/PM is not working, and you can try to add software prefetch instructions (or otherwise change the code).
> >
> > A drawback is that callgrind does not take software prefetch instructions into account, as Valgrind does not feed these instructions to the tool, but ignores them. But if there really are users for this simulator enhancement, we can try to include them into VG core (e.g. cachegrind).
> >
> > To make the comparision of the two runs more easy, I should include a compare mode in KCachegrind.
> >
> > > Also, does --cacheuse=yes collect cache line utilization statistics i.e. what percentage of a line is utilized after being brought into cache and before being evicted from the cache? Where can this information viewed?
> >
> > Yes. The number of bytes never used in a cache line will be attributed to the instruction which triggered the load. This is event SpLoss1 (for L1) and more important SpLoss2 (for L2).
> >
> > The full amount of bytes loaded by an instruction is given by the number of L1 or L2 misses this instruction gets attributed, multiplied with the cache line size. In KCachegrind, add new derived events with the formula "64 L1m" and "64 L2m" to directly get the numbers to compare.
> >
> > You can view this information with KCachegrind. Unfortunately, there was a hardcoded maximum of 10 event types in KCachegrind found till KDE 3.4.x. And --cacheuse=yes gives you 12 event types, leading to a load error. This changes in the version in KDE 3.5, or use the newest one from the website (kcachegrind.sf.net).
> >
> > Theoretically, callgrind_annotate should be able to show these results, too. For it to cope with the format, you have to additionally provide --compress-pos=no --compress-strings=no on the callgrind line. Even then, it fails with
> > Line xxxx: summary event and total event mismatch
> >
> > Oh yeah, it is time to provide a better command line tool...
> >
> > Josef
> >
> > > Thanks,
> > > Aniruddha |
|
From: Julian S. <js...@ac...> - 2005-11-13 02:49:13
|
> I found that place (in dispatch-ppc32.S) which sets the FPU and AltiVec
> to default modes,

Try updating to r5113. I have just been running successfully on a PPC440GX (no FPU).

J |
|
From: Julian S. <ju...@va...> - 2005-11-13 02:46:51
|
On Friday 11 November 2005 17:59, Eduardo Munoz wrote:
> Can somebody tell me an estimated dates for the following: I need to
> make some projects decision based on these dates.
> This is for valgrind from the current development line (valgrind-3.1.SVN)
>
> 1. Support (or fix) for 4XX boards or ppcnf (ppc no floating point)
> systems.

ppcnf now works (svn rev 5113 on a PPC440GX running MontaVista 3.1).

J |
|
From: Paul P. <ppl...@gm...> - 2005-11-13 00:10:13
|
On 11/12/05, Tom Hughes <to...@co...> wrote:
>
> In message <2a2...@ma...> you wrote:
> > I wonder if the environment we construct is dodgy. Not sure.

The error has moved a little bit today:

==29870== Process terminating with default action of signal 11 (SIGSEGV)
==29870==    at 0x4EACC1EC: dl_main (in /lib/ld-2.3.2.so)
==29870==    by 0x4EADA3F7: _dl_sysdep_start (in /lib/ld-2.3.2.so)
==29870==    by 0x4EACC082: _dl_start (in /lib/ld-2.3.2.so)
==29870==    by 0x4EACBC46: (within /lib/ld-2.3.2.so)
==29870==

Is that using glibc-2.3.2-95.20.i686.rpm or a later update?

$ rpm -qf /lib/ld-2.3.2.so
glibc-2.3.2-95.20

Also, I tried to see if I can debug the core. Default "ulimit -c" is zero; setting it to unlimited changes the message:

==29946== Process terminating with default action of signal 11 (SIGSEGV): dumping core

but no core* is produced :( I had better luck with --db-attach though:

$ /usr/local/valgrind-3.0.svn/bin/valgrind --trace-syscalls=yes --db-attach=yes --db-command="/usr/bin/gdb -q %f %p" ./a.out
==30076== Memcheck, a memory error detector.
==30076== Copyright (C) 2002-2005, and GNU GPL'd, by Julian Seward et al.
==30076== Using LibVEX rev 1450, a library for dynamic binary translation.
==30076== Copyright (C) 2004-2005, and GNU GPL'd, by OpenWorks LLP.
==30076== Using valgrind-3.1.SVN, a dynamic binary instrumentation framework.
==30076== Copyright (C) 2000-2005, and GNU GPL'd, by Julian Seward et al.
==30076== For more details, rerun with: -v
==30076==
SYSCALL[30076,1](122) sys_newuname ( 0xFEF130A4 )[sync] --> Success(0x0)
SYSCALL[30076,1]( 45) sys_brk ( 0x0 ) --> [pre-success] Success(0x804A000)
==30076==
==30076== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==30076==    at 0x4EACC1EC: dl_main (in /lib/ld-2.3.2.so)
==30076==    by 0x4EADA3F7: _dl_sysdep_start (in /lib/ld-2.3.2.so)
==30076==    by 0x4EACC082: _dl_start (in /lib/ld-2.3.2.so)
==30076==    by 0x4EACBC46: (within /lib/ld-2.3.2.so)
==30076==
==30076== ---- Attach to debugger ? --- [Return/N/n/Y/y/C/c] ---- Y
==30076== starting debugger with cmd: /usr/bin/gdb -q /proc/30077/fd/1014 30077
Using host libthread_db library "/lib64/tls/libthread_db.so.1".
Attaching to program: /proc/30077/fd/1014, process 30077
0x4eacc1ec in ?? ()
(gdb) x/i $pc
0x4eacc1ec:  mov %ecx,(%esp)
(gdb) x/x $esp
0xfef12ec0:  Cannot access memory at address 0xfef12ec0
(gdb) shell cat /proc/30077/maps
0000000004000000-000000000435a000 rwxp 0000000000000000 00:00 0
000000000435a000-000000000435c000 ---p 000000000035a000 00:00 0
000000000435c000-000000000436c000 rwxp 000000000035c000 00:00 0
000000000436c000-000000000436e000 ---p 000000000036c000 00:00 0
000000000436e000-000000000437e000 rwxp 0000000000000000 00:00 0
0000000004577000-0000000005f69000 rwxp 0000000000000000 00:00 0
0000000008048000-0000000008049000 r-xp 0000000000000000 03:41 200779 /tmp/a.out
0000000008049000-000000000804a000 rwxp 0000000000000000 03:41 200779 /tmp/a.out
000000000804a000-000000000804b000 rwxp 0000000000000000 00:00 0
000000004eacb000-000000004eae0000 r-xp 0000000000000000 03:41 510225 /lib/ld-2.3.2.so
000000004eae0000-000000004eae1000 rwxp 0000000000014000 03:41 510225 /lib/ld-2.3.2.so
0000000070000000-0000000070135000 r-xp 0000000000000000 03:41 772191 /usr/local/valgrind-3.0.svn/lib/valgrind/x86-linux/memcheck
0000000070135000-0000000070136000 rwxp 0000000000135000 03:41 772191 /usr/local/valgrind-3.0.svn/lib/valgrind/x86-linux/memcheck
0000000070136000-00000000707c2000 rwxp 0000000000000000 00:00 0
00000000fef13000-00000000fef14000 rwxp 0000000000000000 00:00 0
00000000fff13000-00000000ffffe000 rw-p fffffffffff16000 00:00 0
(gdb) quit
==30076==
==30076== Debugger has detached. Valgrind regains control. We continue.
==30076==
==30076== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
==30076== malloc/free: in use at exit: 0 bytes in 0 blocks.
==30076== malloc/free: 0 allocs, 0 frees, 0 bytes allocated.
==30076== For counts of detected errors, rerun with: -v
==30076== No malloc'd blocks -- no leaks are possible.
Segmentation fault

Hmm; %esp is just below the stack page, and the kernel refuses to grow the stack? In fact, increasing the environment size moves the place where it coredumps, and increasing it sufficiently makes it run:

$ /usr/local/valgrind-3.0.svn/bin/valgrind -q ./a.out && echo ok
==30158==
==30158== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==30158==    at 0x4EACC1EC: dl_main (in /lib/ld-2.3.2.so)
==30158==    by 0x4EADA3F7: _dl_sysdep_start (in /lib/ld-2.3.2.so)
==30158==    by 0x4EACC082: _dl_start (in /lib/ld-2.3.2.so)
==30158==    by 0x4EACBC46: (within /lib/ld-2.3.2.so)
Segmentation fault
$ PATH=$PATH:$PATH /usr/local/valgrind-3.0.svn/bin/valgrind -q ./a.out && echo ok
==30159==
==30159== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==30159==    at 0x4EADA172: _dl_sysdep_start (in /lib/ld-2.3.2.so)
==30159==    by 0x4EACC082: _dl_start (in /lib/ld-2.3.2.so)
==30159==    by 0x4EACBC46: (within /lib/ld-2.3.2.so)
Segmentation fault
$ PATH=$PATH:$PATH:$PATH /usr/local/valgrind-3.0.svn/bin/valgrind -q ./a.out && echo ok
ok

Cheers, |
|
From: Paul P. <ppl...@gm...> - 2005-11-12 19:57:36
|
On 11/12/05, Tom Hughes <to...@co...> wrote:
>
> Try this patch

That fixes it :)

--10895-- Max kernel-supported signal is 64
--10895-- signal 11 arrived ... si_code=1, EIP=0x4EACC1EC, eip=0x477DE15
--10895-- SIGSEGV: si_code=1 faultaddr=0xFEFCBEA0 tid=1 ESP=0xFEFCBEA0 seg=0xFE7CD000-0xFEFCBFFF
--10895-- -> extended stack base to 0xFEFCB000
...

Cheers, |
|
From: Aniruddha S. <sh...@cs...> - 2005-11-12 19:01:40
|
On Fri, 11 Nov 2005, Josef Weidendorfer wrote:

Hi,

As you indicated, I too want to use the --simulate-hwpref option to
determine the performance benefit with and without the prefetcher. It
serves as a measure of spatial locality in the profiled code. I have yet
to view the output of the --cacheuse option. Again, the objective is to
understand the extent of spatial locality in the profiled code.

Thanks,
Aniruddha

> On Friday 11 November 2005 05:05, you wrote:
> > On Wed, 9 Nov 2005, Josef Weidendorfer wrote:
> > Hi,
> >
> > I am running Callgrind with the options -v --log-file=summary
> > --simulate-cache=yes --simulate-hwpref=yes --cacheuse=yes. The
> > summary log file contains the lines:
> >
> > Prefetch Up: 0
> > Prefetch Down: 0
>
> Oh, someone who is using the more advanced (and probably not that well
> tested) code! Very good. It would be nice if you could tell me whether
> these features are useful for you.
>
> You cannot use --simulate-hwpref=yes and --cacheuse=yes at once; they
> are separate simulator code paths. I will change the code to give out
> a warning about this, thanks.
>
> If I do e.g.
>   callgrind -v --simulate-hwpref=yes ls
> (this option also switches on cache simulation), I get
>
>   --12922-- Prefetch Up: 1507
>   --12922-- Prefetch Down: 36
>
> so I think this still works fine.
>
> > What do these lines mean? From what I understand, --simulate-hwpref=yes
> > simulates a hardware prefetcher, as is found in the Intel Pentium 4
> > processor.
>
> Yes. The P4 (and P-M) automatically detects upward and downward
> streaming, stopping at 4kB boundaries (streams on virtual addresses get
> a disrupted stream of physical addresses at 4kB boundaries because of
> VM).
>
> A nice thing is that the Pentium-M has hardware performance counters
> exactly for the Prefetch/Up and Prefetch/Down events, i.e. you can
> observe the hardware prefetcher on the Pentium-M in action by using
> OProfile/Perfex/PAPI, and compare the results with those from Callgrind.
>
> With --simulate-hwpref=yes I add this heuristic, and presume that every
> line loaded by the hardware prefetcher will give a hit when accessed
> later on.
>
> Note that this is not always the case: the real access could come so
> early that you would still get a miss in reality, even if the hardware
> prefetcher has caught the line. Unfortunately, callgrind has no way to
> get a simulated wall clock time, which would be needed to detect such
> cases.
>
> So callgrind --simulate-hwpref will give the best case possible for the
> prefetcher. In reality, the result lies between the results without and
> with this option.
>
> The usage is to compare results with and without the prefetcher. For
> functions where you see a big difference, the prefetcher is working
> quite well, i.e. any microoptimizations to bring down the usual
> callgrind results (without prefetcher) will not lead to any real
> improvements.
>
> But in the code regions where the results are not really different, you
> see that the prefetching heuristic of the P4/PM is not working, and you
> can try to add software prefetch instructions (or otherwise change the
> code).
>
> A drawback is that callgrind does not take software prefetch
> instructions into account, as Valgrind does not feed these instructions
> to the tool, but ignores them. If there really are users for this
> simulator enhancement, we can try to include them in the VG core
> (e.g. cachegrind).
>
> To make the comparison of the two runs easier, I should include a
> compare mode in KCachegrind.
>
> > Also, does --cacheuse=yes collect cache line utilization statistics,
> > i.e. what percentage of a line is utilized after being brought into
> > the cache and before being evicted from the cache? Where can this
> > information be viewed?
>
> Yes. The number of bytes never used in a cache line will be attributed
> to the instruction which triggered the load. This is event SpLoss1 (for
> L1) and, more importantly, SpLoss2 (for L2).
>
> The full amount of bytes loaded by an instruction is given by the
> number of L1 or L2 misses attributed to that instruction, multiplied by
> the cache line size. In KCachegrind, add new derived events with the
> formulas "64 L1m" and "64 L2m" to directly get the numbers to compare.
>
> You can view this information with KCachegrind. Unfortunately, there
> was a hardcoded maximum of 10 event types in KCachegrind up to KDE
> 3.4.x, and --cacheuse=yes gives you 12 event types, leading to a load
> error. This changes in the version in KDE 3.5; or use the newest one
> from the website (kcachegrind.sf.net).
>
> Theoretically, callgrind_annotate should be able to show these results,
> too. For it to cope with the format, you have to additionally provide
>   --compress-pos=no --compress-strings=no
> on the callgrind line. Even then, it fails with
>   Line xxxx: summary event and total event mismatch
>
> Oh yeah, it is time to provide a better command line tool...
>
> Josef
>
> > Thanks,
> > Aniruddha

--
-----------------------------------------------------------------------------------------
Aniruddha G. Shet           | Project webpage: http://forge-fre.ornl.gov/molar/index.html
Graduate Research Associate | Project webpage: http://www.cs.unm.edu/~fastos
Dept. of Comp. Sci. & Engg  | Personal webpage: http://www.cse.ohio-state.edu/~shet
The Ohio State University   | Office: DL 474
2015 Neil Avenue            | Phone: +1 (614) 292 7036
Columbus OH 43210-1277      | Cell: +1 (614) 446 1630
-----------------------------------------------------------------------------------------
|
|
From: Tom H. <to...@co...> - 2005-11-12 18:50:51
|
In message <414...@lo...>
Tom Hughes <to...@co...> wrote:
> In message <2a2...@ma...>
> Paul Pluzhnikov <ppl...@gm...> wrote:
>
> > $ /usr/local/valgrind-3.0.svn/bin/valgrind -q --trace-signals=yes ./a.out
> > --30170-- Max kernel-supported signal is 64
> > --30170-- signal 11 arrived ... si_code=196609, EIP=0x4EACC1EC,
> > eip=0x477DC95
> > --30170-- SIGSEGV: si_code=196609 faultaddr=0xFEF88EC0 tid=1 ESP=0xFEF88EC0
> > seg=0xFE78A000-0xFEF88FFF
>
> The si_code value is bogus (0x30001) so it doesn't realise it
> needs to extend the stack. This is the bug we discussed on the
> developer list last night but we thought it was a ppc specific
> bug. Obviously it affects all 2.4 kernels of a certain vintage.
Try this patch - it tweaks our signal handlers to discard anything
in the top half of the si_code value on linux as the kernel is
supposed to have masked that off already.
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
|