From: Julian S. <js...@ac...> - 2017-02-09 18:01:33
|
Hi Rashawn,
Thank you for the offer of adding AVX-512 support, and sorry for the
slow response. Some of the Valgrind developers discussed this briefly
at Fosdem in Brussels last weekend and there was general agreement
that this would be a good thing to do.
I would be happy to be a point of contact for technical and process
assistance. I have both technical and process comments regarding
your proposal.
From a process point of view:
* This is likely to take several months and may involve more than one
round of review and iteration. That's based on experience from
other large chunks of instruction-set development work.
* As an example, have a look at the following 5 bugs, which show a staged
approach to implementation of the recent POWER ISA 3.0 extensions:
https://bugs.kde.org/show_bug.cgi?id=359767
https://bugs.kde.org/show_bug.cgi?id=361207
https://bugs.kde.org/show_bug.cgi?id=362329
https://bugs.kde.org/show_bug.cgi?id=363858
https://bugs.kde.org/show_bug.cgi?id=364948
* Patches should go on the bug tracker, as per the examples above, and
will be reviewed there.
* All contributions to the tree need to be licensed "GNU GPL 2 or
later". Are you OK with that? GPL 2-only is not possible.
* There is a general, although largely unstated, expectation that parties
who contribute large chunks of code continue afterwards to provide at
least some minimal level of support/bugfixing, especially around
release-time. We've had problems in the past with large bits of the
code going into the tree and the developers later simply disappearing,
and would prefer to avoid that in future. Would you be able to
provide that level of support going forward?
* Similarly, there is an expectation that you have some machine which
can run nightly tests (from our framework) and send results to the
valgrind-testresults mailing list. Since none of the developers
(AFAIK) have AVX512 capable hardware, we have no other way to know
whether the support is working.
* VEX is basically a mini-compiler for basic blocks. Not essential,
but it will help if your developer(s) have a bit of basic background
in compiler internals.
Regarding your proposed implementation steps, they sound plausible.
However:
* You need a step zero, which is to extend Valgrind's HW capabilities
detection (coregrind/m_machine.c) to detect AVX512 support and tell
VEX about it. That has to happen before any insns get implemented.
* Also, you will need to extend the implementation of XSAVE and XRSTOR
to cover the new register state. Given the inflexibility of VEX's
IR (intermediate representation), the current AVX2-level XSAVE and
XRSTOR was difficult to implement and is hard to understand, so this
is likely to be a challenge. I suggest you deal with it sooner
rather than later, since we've found that runtime libraries rely on
XSAVE and XRSTOR and so you won't be able to run any real code with
AVX512 until those two are working.
* I assume (although you didn't say this) that you are doing this for
the 64-bit instruction set only. Our 32 bit insn set support is
essentially legacy, having stopped at SSSE3, and doesn't have a
proper prefix decoder in the same way that the 64 bit front end
does.
* Write test cases for the insns first, and make sure they are
comprehensive enough and work well. This reduces the general stress
and difficulty of implementing the instructions. Bear in mind that
incorrect instruction emulation can corrupt program state in a way
that isn't apparent until hundreds of millions of instructions
later, by which time it is impossible to figure out what went wrong.
So a good test suite is essential. See for example
none/tests/amd64/avx2-1.c and many others in the same directory.
* Some of the existing AVX256 insn implementations are less than
ideal, in the sense that they generate very verbose IR that performs
operations a lane at a time, rather than as a vector as a whole.
That gives rise to problems like
https://bugs.kde.org/show_bug.cgi?id=375839
The practical consequence is that (often) you won't be able to just
implement a 512-bit variant of an existing 256-bit insn by doubling
up the IR -- we'll have to do something better (wider and shallower)
here.
* If -- as seems likely -- you need to add new IROps to facilitate
this support, then you will also need to add support for them in
memcheck/mc_translate.c.
* Since you are adding register state, you'll need to futz with
memcheck/mc_machine.c too.
* You will need to be careful to ensure that the back end provides
SIMD integer support capable of supporting Memcheck's instrumentation
of the front end's SIMD FP IR. Without that, you'll wind up in a
situation where you can run AVX512 code with the 'none' tool but not
with 'memcheck'. This is an arcane but important detail. We can
come back to it later.
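
As a rough illustration of what the "step zero" check involves at the hardware level, here is a minimal standalone sketch. It is not the code in coregrind/m_machine.c, and the names host_has_avx512f and read_xcr0 are hypothetical. It assumes GCC or Clang with <cpuid.h>, and tests only the AVX-512 Foundation bit plus the OS-enabled register state in XCR0.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>

/* Read extended control register 0 (XCR0). Only legal once CPUID
   reports OSXSAVE; otherwise the instruction faults. */
static uint64_t read_xcr0(void)
{
    uint32_t lo, hi;
    __asm__ volatile("xgetbv" : "=a"(lo), "=d"(hi) : "c"(0));
    return ((uint64_t)hi << 32) | lo;
}
#endif

/* Hypothetical helper: returns 1 if both the CPU and the OS support
   AVX-512 Foundation, else 0. Always 0 on non-x86 hosts. */
int host_has_avx512f(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;
    /* XCR0 bits 1,2 = SSE and AVX state; bits 5,6,7 = opmask,
       ZMM_Hi256 and Hi16_ZMM state. All five must be enabled. */
    const uint64_t need = 0xE6;

    /* Leaf 1: ECX bit 27 = OSXSAVE (OS has enabled XSAVE/XGETBV). */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    if (!(ecx & (1u << 27)))
        return 0;

    if ((read_xcr0() & need) != need)
        return 0;

    /* Leaf 7, subleaf 0: EBX bit 16 = AVX512F. */
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return 0;
    return (int)((ebx >> 16) & 1u);
#else
    return 0;
#endif
}
```

The XCR0 check matters for the same reason XSAVE and XRSTOR do: the CPUID feature bit alone does not say whether the OS saves and restores the wider register state.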
J
|
|
From: Mike L. <mik...@gm...> - 2017-02-09 01:04:05
|
Adding some example output:
With default flags:
==3382== Callgrind, a call-graph generating cache profiler
==3382== Copyright (C) 2002-2015, and GNU GPL'd, by Josef Weidendorfer et al.
==3382== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==3382== Command: parsec-3.0/pkgs/apps/blackscholes/inst/amd64-linux.gcc-pthreads/bin/blackscholes 4 in_4.txt prices.txt
==3382==
==3382== For interactive control, run 'callgrind_control -h'.
PARSEC Benchmark Suite Version 3.0-beta-20150206
Num of Options: 4
Num of Runs: 100
Size of data: 160
*Thread switched to: 4*
*Thread switched to: 3*
*Thread switched to: 2*
*Thread switched to: 1*
*Thread switched to: 5*
*Thread switched to: 4*
*Thread switched to: 1*
==3382==
==3382== Events : Ir
==3382== Collected : 569502
==3382==
==3382== I refs: 569,502
With --fair-sched=yes:
==3375== Callgrind, a call-graph generating cache profiler
==3375== Copyright (C) 2002-2015, and GNU GPL'd, by Josef Weidendorfer et al.
==3375== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==3375== Command: parsec-3.0/pkgs/apps/blackscholes/inst/amd64-linux.gcc-pthreads/bin/blackscholes 4 in_4.txt prices.txt
==3375==
==3375== For interactive control, run 'callgrind_control -h'.
PARSEC Benchmark Suite Version 3.0-beta-20150206
Num of Options: 4
Num of Runs: 100
Size of data: 160
*Thread switched to: 2*
*Thread switched to: 1*
*Thread switched to: 3*
*Thread switched to: 2*
*Thread switched to: 1*
*Thread switched to: 4*
*Thread switched to: 3*
*Thread switched to: 1*
*Thread switched to: 4*
*Thread switched to: 2*
*Thread switched to: 1*
*Thread switched to: 2*
*Thread switched to: 1*
==3375==
==3375== Events : Ir
==3375== Collected : 569505
==3375==
==3375== I refs: 569,505
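
Since the failure is intermittent, it may help to check captured output mechanically rather than by eye. The following is a small sketch, assuming the output has been saved as text and contains the "Thread switched to: N" lines shown above. The helper names seen_thread_mask and first_missing_tid are made up for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Scan text for "Thread switched to: N" and return a bitmask with
   bit N set for every thread ID N in 1..31 that appears. */
unsigned seen_thread_mask(const char *text)
{
    static const char key[] = "Thread switched to: ";
    unsigned mask = 0;
    const char *p = text;

    while ((p = strstr(p, key)) != NULL) {
        int tid;
        p += sizeof key - 1;              /* skip past the key text */
        if (sscanf(p, "%d", &tid) == 1 && tid >= 1 && tid <= 31)
            mask |= 1u << tid;
    }
    return mask;
}

/* Return the lowest thread ID in 1..expected (expected <= 31) that is
   missing from mask, or 0 if all of them were seen. */
int first_missing_tid(unsigned mask, int expected)
{
    for (int tid = 1; tid <= expected; tid++)
        if (!(mask & (1u << tid)))
            return tid;
    return 0;
}
```

Applied to the two runs above with expected = 5, the first run has no missing ID, and the second run is missing thread 5.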
|
|
From: Mike L. <mik...@gm...> - 2017-02-09 00:56:23
|
I'm working on a project that leverages Callgrind to generate VEX IR
traces. I'm using Valgrind 3.12.0.
I also use Callgrind's infrastructure to detect when Valgrind switches
thread contexts; however, I'm getting unexpected behavior.
It looks like the best place to detect a thread context switch in Callgrind
is in CLG_(setup_bbcc) in bbcc.c (line 561):
    /* This is needed because thread switches can not reliable be tracked
     * with callback CLG_(run_thread) only: we have otherwise no way to get
     * the thread ID after a signal handler returns.
     * This could be removed again if that bug is fixed in Valgrind.
     * This is in the hot path but hopefully not to costly.
     */
    tid = VG_(get_running_tid)();
    #if 1
    /* CLG_(switch_thread) is a no-op when tid is equal to CLG_(current_tid).
     * As this is on the hot path, we only call CLG_(switch_thread)(tid)
     * if tid differs from the CLG_(current_tid).
     */
    if (UNLIKELY(tid != CLG_(current_tid)))
        CLG_(switch_thread)(tid);
The above is called every instrumented basic block.
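
To make the pattern concrete outside of Valgrind, here is a minimal standalone model of the same compare-and-switch check. It is only a sketch: TidTracker, tracker_on_basic_block and count_switches are hypothetical names, not Callgrind code, and thread ID 0 is assumed to mean "no thread seen yet".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a tool's thread-tracking state. */
typedef struct {
    int current_tid;   /* thread the tool believes is running; 0 = none yet */
    int switches;      /* number of switches observed so far */
} TidTracker;

/* Call once per executed basic block with the thread that runs it.
   Returns 1 when a switch is observed, 0 otherwise. */
int tracker_on_basic_block(TidTracker *t, int running_tid)
{
    assert(t != NULL);
    if (running_tid == t->current_tid)
        return 0;                  /* hot path: still the same thread */
    t->current_tid = running_tid;
    t->switches++;
    return 1;
}

/* Feed a trace of per-block thread IDs through a fresh tracker and
   return the number of switches it observed. */
int count_switches(const int *tids, size_t n)
{
    TidTracker t = { 0, 0 };
    for (size_t i = 0; i < n; i++)
        tracker_on_basic_block(&t, tids[i]);
    return t.switches;
}
```

In this model a thread is observed only when the check runs on a block that thread executes. Whether that explains the behavior described below is not established here.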
I've noticed strange behavior, where *a thread switch would not always be
detected.*
To investigate further, I modified the above:
-   if (UNLIKELY(tid != CLG_(current_tid)))
+   if (UNLIKELY(tid != CLG_(current_tid))) {
        CLG_(switch_thread)(tid);
+       VG_(printf)("Thread switched to: %d\n", tid);
+   }
- With this change, I run the parsec 3.0 benchmark blackscholes with 4
threads and input_test.tar, and expect to see *5* threads (numbered 1-5:
1 master and 4 worker threads) printed.
- Under default flags, I see all 5 threads printed.
- When I add --fair-sched=yes, the last thread (5) is often *not
printed*.
- I confirmed this behavior by printing VG_(get_running_tid)() every
instrumented basic block.
- I know that the thread switch happened, or else the application would
have failed.
This does not happen all the time but it happens on the majority of runs. I
also noticed that if I put a print statement in the blackscholes worker
thread, the unexpected behavior manifests far less often. I conclude it
must have something to do with the thread exiting too quickly and not
having enough work to do.
*Is this considered a bug? If not, how do I detect every time the Valgrind
thread context changes? I saw this thread
<http://valgrind-developers.narkive.com/ualztznb/thread-change-callback>
from a long time ago, but I'm not sure if there's been any progress.*
$ uname -a
Linux ubuntu-VirtualBox 3.19.0-25-generic #26~14.04.1-Ubuntu SMP Fri Jul 24 21:16:20 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
*Steps to reproduce:*
mkdir detect_thread_switch && cd detect_thread_switch
curl -L http://parsec.cs.princeton.edu/download/3.0/parsec-3.0-core.tar.gz | tar xz
parsec-3.0/bin/parsecmgmt -a build -p blackscholes -c gcc-pthreads
tar xf parsec-3.0/pkgs/apps/blackscholes/inputs/input_test.tar
curl -L http://valgrind.org/downloads/valgrind-3.12.0.tar.bz2 | tar xj
# MAKE THE CHANGE TO bbcc.c TO PRINT THREAD ID ON THREAD SWITCH
cd valgrind-3.12.0 && ./autogen.sh && ./configure
make -j4 && cd ..
# WILL SHOW THREADS 1-5
valgrind-3.12.0/vg-in-place --tool=callgrind \
  parsec-3.0/pkgs/apps/blackscholes/inst/amd64-linux.gcc-pthreads/bin/blackscholes \
  4 in_4.txt prices.txt
# MAY HAVE TO RUN SEVERAL TIMES IN SUCCESSION, WILL EVENTUALLY BE MISSING THREAD 5
valgrind-3.12.0/vg-in-place --fair-sched=yes --tool=callgrind \
  parsec-3.0/pkgs/apps/blackscholes/inst/amd64-linux.gcc-pthreads/bin/blackscholes \
  4 in_4.txt prices.txt
Thanks!
Mike
|